jaamarks / jaamarks_notebooks

Collection of various projects and procedures documented with Jupyter notebooks.

Jesse: CFAR + COGA QC & GWAS #8

Open jaamarks opened 4 years ago

jaamarks commented 4 years ago

See parent GitHub Issue 133.

CFAR dbGaP COGA dbGaP

QC these dbGaP studies and combine for GWAS and eventual inclusion in the HIV acquisition meta-analysis.

## Age distributions for CFAR and COGA

| CFAR | COGA |
|------|------|
| ![image](https://user-images.githubusercontent.com/32715488/66494481-bd5c3f00-ea85-11e9-9eb1-1b16e2e6831d.png) | ![image](https://user-images.githubusercontent.com/32715488/66494554-d5cc5980-ea85-11e9-9672-27232ad33a0d.png) |
CFAR dbGaP Ternary Plot ![image](https://user-images.githubusercontent.com/32715488/77101636-01b51900-69ee-11ea-9f8a-4360b0e9e890.png)
COGA dbGaP Ternary Plot ![image](https://user-images.githubusercontent.com/32715488/77101519-d6322e80-69ed-11ea-9458-0da7f2b96b04.png)
jaamarks commented 4 years ago

Ternary Plots

CFAR unfiltered (N=4,761) ![afr_eas_eur_CFAR](https://user-images.githubusercontent.com/32715488/65993837-fd0ca080-e45f-11e9-8cfb-d6edc9390435.jpg)
CFAR: HapMap as reference panel

I also ran the STRUCTURE analysis on the CFAR subjects using a more homogeneous reference panel: a subset of the three superpopulations (AFR, EUR, and EAS) that we generally use. The reference populations were [YRI for AFR, CHB for EAS, and CEU for EUR](https://www.internationalgenome.org/category/population/). Here is the triangle plot for that analysis.

![image](https://user-images.githubusercontent.com/32715488/66408575-33e13a00-e9bd-11e9-979a-98f887b23054.png)
CFAR: include 1000G subjects in plot
I have included the 1,668 1KG subjects in the triangle plot below. The tight grouping at the vertices comes as no surprise, however, since these subjects were specified as the reference populations in STRUCTURE. ![image](https://user-images.githubusercontent.com/32715488/66407925-09db4800-e9bc-11e9-8316-d02902979a79.png)
COGA unfiltered (N=5,415) ![afr_eas_eur_COGA](https://user-images.githubusercontent.com/32715488/66136742-f7cb6500-e5c9-11e9-89d1-23a9c217867d.jpg)
CFAR+COGA EA
Here are the merged CFAR+COGA samples on a triangle plot generated from the STRUCTURE analysis. We took a 10K random sample (as per usual) from the post-QC genotype data. The cases (CFAR) are blue and the controls (COGA) are red. The two plots show the same data and differ only in drawing order: the controls were plotted first in the top plot and second in the bottom plot. ![afr_eas_eur_coga_highlighted_COGA_RED](https://user-images.githubusercontent.com/32715488/70631531-49602380-1bfb-11ea-9cb5-e44c080e1897.jpg) ![afr_eas_eur_cfar_highlighted_CFAR_BLUE](https://user-images.githubusercontent.com/32715488/70631532-49602380-1bfb-11ea-948e-50d6af5ac292.jpg)
jaamarks commented 4 years ago

We were a bit concerned that the CFAR subjects were shifted to the left in the triangle plot. After double-checking everything and rerunning the STRUCTURE analysis, the shift appears to be correct. We used the same code to process the COGA subjects and the WIHS3 subjects, and those data look fine, so it is probably something inherent to the CFAR sample.

jaamarks commented 4 years ago

GWAS Results

Genotype PC Plots ![image](https://user-images.githubusercontent.com/32715488/70630870-4add1c00-1bfa-11ea-84ed-2549eba228c3.png) ![image](https://user-images.githubusercontent.com/32715488/70630827-3b5dd300-1bfa-11ea-8a0e-2e59636ef95b.png)
GWAS Plots: with top 3 PCs

| ![cfar_cogo ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels qq](https://user-images.githubusercontent.com/32715488/70630955-6b0cdb00-1bfa-11ea-9573-d7db75b9e735.png) |
|---|
| ![cfar_cogo ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels manhattan](https://user-images.githubusercontent.com/32715488/70630956-6b0cdb00-1bfa-11ea-830c-4bd3419723ef.png) |
GWAS Plots: with top 10 PCs

| ![cfar_cogo ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels qq](https://user-images.githubusercontent.com/32715488/70631080-9f809700-1bfa-11ea-93a7-87579e0deaa3.png) |
|---|
| ![cfar_cogo ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels manhattan](https://user-images.githubusercontent.com/32715488/70631079-9ee80080-1bfa-11ea-9339-700abf711930.png) |
Genotyped SNPs only: RSQ (0.30) and MAF (0.01) filters
Only the genotyped SNPs that also passed the RSQ (0.30) and MAF (0.01) filters. ![cfar_coga ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels manhattan](https://user-images.githubusercontent.com/32715488/70809013-f028f900-1d8e-11ea-92b5-e43ac6bddb35.png) ![cfar_coga ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels qq](https://user-images.githubusercontent.com/32715488/70809014-f028f900-1d8e-11ea-9bff-0925425c23b7.png)
Genotyped SNPs only: RSQ (0.90) and MAF (0.01) filters
Only the genotyped SNPs that also passed the RSQ (0.90) and MAF (0.01) filters. ![cfar_coga ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 9 assoc plot all_chr snps+indels manhattan](https://user-images.githubusercontent.com/32715488/71297288-e5eb9980-2350-11ea-8c4a-a17f5e630f54.png) ![cfar_coga ea 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 9 assoc plot all_chr snps+indels qq](https://user-images.githubusercontent.com/32715488/71297290-e71cc680-2350-11ea-990d-0db9fd5584a8.png)
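
For readers who want to see what the RSQ and MAF filters amount to, here is a minimal sketch. It is not the code used for these runs. The `Variant` class, its field names, and the toy variants are hypothetical, and the strict "greater than" comparison is an assumption taken from the `maf_gt_0.01` and `rsq_gt_0.3` file naming.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    variant_id: str   # hypothetical identifier
    maf: float        # minor allele frequency
    rsq: float        # imputation quality (RSQ)
    genotyped: bool   # True if the variant was directly genotyped


def filter_variants(variants: List[Variant],
                    maf_min: float = 0.01,
                    rsq_min: float = 0.30,
                    genotyped_only: bool = False) -> List[Variant]:
    """Keep variants with MAF > maf_min and RSQ > rsq_min."""
    kept = []
    for v in variants:
        if genotyped_only and not v.genotyped:
            continue
        if v.maf > maf_min and v.rsq > rsq_min:
            kept.append(v)
    return kept


# Toy data for illustration only; not taken from the study.
toy = [
    Variant("var1", maf=0.20, rsq=0.95, genotyped=True),
    Variant("var2", maf=0.005, rsq=0.99, genotyped=True),
    Variant("var3", maf=0.10, rsq=0.50, genotyped=False),
]
print([v.variant_id for v in filter_variants(toy)])
print([v.variant_id for v in filter_variants(toy, rsq_min=0.90, genotyped_only=True)])
```

With the toy data, the default thresholds keep `var1` and `var3`, while the stricter RSQ 0.90 run restricted to genotyped variants keeps only `var1`.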


Top SNPs (p<0.001) from the original GWAS model: `HIV_ACQ~SNP+Sex+Alc_Dep+PC1+PC3+PC7` with the expected & observed heterozygote frequencies appended. [cfar_cogo.ea.1df.1000G_p3.HIV_ACQ~SNP+AGE+SEX+ALCOHOL+PC1+PC3+PC7.maf_gt_0.01_subject+eur.rsq_gt_0.30.p_lte_0.001.hwe_merged.xlsx](https://github.com/RTIInternational/jaamarks_notebooks/files/4027430/cfar_cogo.ea.1df.1000G_p3.HIV_ACQ.SNP%2BAGE%2BSEX%2BALCOHOL%2BPC1%2BPC3%2BPC7.maf_gt_0.01_subject%2Beur.rsq_gt_0.30.p_lte_0.001.hwe_merged.xlsx)
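
As a reminder of what the appended columns mean, the sketch below computes observed and expected heterozygote frequencies from genotype counts under Hardy-Weinberg equilibrium (expected = 2pq). The function name and the example counts are hypothetical; the spreadsheet values were produced by the GWAS pipeline, not by this snippet.

```python
from typing import Tuple


def het_frequencies(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (observed, expected) heterozygote frequency for one biallelic variant."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    expected = 2 * p * (1 - p)        # Hardy-Weinberg expectation, 2pq
    observed = n_ab / n
    return observed, expected


# Hypothetical genotype counts (AA, AB, BB), for illustration only.
obs, exp = het_frequencies(360, 480, 160)
print(f"observed={obs:.3f} expected={exp:.3f}")
```

A large gap between the two values for a top SNP is a hint that the signal may come from genotyping or imputation artifacts rather than a true association.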
jaamarks commented 4 years ago

The results from three different Structure analyses.

Post-dbGaP afr_eas_eur_filtered_cfar_EA
Post-QC afr_eas_eur_CFAR
Imputed SNPs afr_eas_eur_CFAR
jaamarks commented 4 years ago

image

jaamarks commented 4 years ago

Revisit: QC CFAR+COGA together

jaamarks commented 4 years ago

QC COGA+CFAR combined

We are going to combine the CFAR and COGA genotype data and QC them together. Here are the results from the STRUCTURE analysis thus far.

| Action Description | Thresholding Criteria |
|--------------------|-----------------------|
| For EA retainment | (AFR < 25%)∧(EAS < 25%) |
| For AA retainment | (AFR > 25%)∧(EAS < 25%) |
| For HA retainment | (AFR < 25%)∧(EAS > 25%) |
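
The thresholds above can be written as a small function. This is only a sketch: the function name and the 0-to-1 proportion scale are assumptions, and the table does not say what happens when both AFR and EAS exceed 25% or when a value sits exactly on 25%, so those cases return `None` here.

```python
from typing import Optional


def assign_ancestry(afr: float, eas: float, threshold: float = 0.25) -> Optional[str]:
    """Map STRUCTURE ancestry proportions (0-1) to EA, AA, or HA.

    Returns None for combinations the thresholding table does not cover.
    """
    if afr < threshold and eas < threshold:
        return "EA"
    if afr > threshold and eas < threshold:
        return "AA"
    if afr < threshold and eas > threshold:
        return "HA"
    return None  # both above the threshold, or exactly on a boundary


# Hypothetical subjects, for illustration only.
print(assign_ancestry(0.05, 0.02))
print(assign_ancestry(0.80, 0.01))
print(assign_ancestry(0.10, 0.40))
```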


Ternary Plots

CFAR subjects are blue.

image

EUR Ancestry (N=7,461) ![image](https://user-images.githubusercontent.com/32715488/77954210-4f5a3d00-729c-11ea-8c57-eec6e13040d1.png)
AFR Ancestry (N=2,065) ![image](https://user-images.githubusercontent.com/32715488/77954170-3e113080-729c-11ea-9b58-f2315b70f83a.png)
AMR Ancestry (N=642) ![image](https://user-images.githubusercontent.com/32715488/77954193-49645c00-729c-11ea-94cb-83cc9a64cf1f.png)



## CFAR_COGA Genotype QC Summary Stats

Below are the genotype summary stats for the merged CFAR & COGA data.

Initial Summary Stats

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|--------------------------------------|------------------|-------------------|------------------|-------------------|
| Initial CFAR dbGaP dataset | - | 581,817 | - | 4,761 |
| Initial COGA dbGaP dataset | - | 581,036 | - | 5,415 |
| Merge data | - | 611,069 | - | 10,176 |
| Convert marker names to rsID | 14,511 | 596,558 | - | - |
| Duplicate rsID filtering | 24,062 | 572,496 | - | - |
| Genome build 37 and dbSNP 138 update | 277 | 572,219 | - | - |
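
The "Variants retained" column is a running subtraction of "Variants removed" from the merged count. A quick way to sanity-check a table like this is sketched below; the helper name is hypothetical, the step labels are shortened, and the numbers are copied from the table above.

```python
from typing import List, Tuple


def running_retained(start: int, steps: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Subtract each step's removed count from the running total."""
    retained = start
    out = []
    for name, removed in steps:
        retained -= removed
        out.append((name, retained))
    return out


steps = [
    ("Convert marker names to rsID", 14_511),
    ("Duplicate rsID filtering", 24_062),
    ("Genome build update", 277),
]
for name, retained in running_retained(611_069, steps):
    print(f"{name}: {retained:,}")
```

The final value, 572,219, is the variant count that enters the STRUCTURE step in each ancestry-specific table.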
EUR Summary Stats Tables

### Autosomes

This table includes autosome filtering statistics prior to merging with chrX.

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|-------------------------------------------------|------------------|-------------------|------------------|-------------------|
| STRUCTURE analysis (all chr) | - | 572,219 | 2,715 | 7,461 |
| Partitioning to only autosomes | 14,112 | 558,107 | - | - |
| Remove subjects missing whole autosome data | - | - | 0 | - |
| Remove variants with missing call rate > 3% | 82,612 | 475,495 | - | - |
| Remove variants with HWE p < 0.0001 | 3,832 | 471,663 | - | - |

### chrX

This table includes chrX filtering statistics prior to merging with autosomes.

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|------------------------------------------------|------------------|-------------------|------------------|-------------------|
| Partitioning to only chrX | 558,107 | 14,112 | - | - |
| Remove subjects missing whole chrX data | - | - | 0 | - |
| Remove variants with missing call rate > 3% | 2,406 | 11,706 | - | - |
| Remove variants with HWE p < 0.0001 | 15 | 11,691 | - | - |

### Merged

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|----------------------------------------|------------------|-------------------|------------------|--------------------|
| Excessive homozygosity filter | - | 483,354 | 0 | - |
| Remove Subjects with IBS > 0.9 | - | - | 1,145 | 6,316 |
| Remove Subjects with IBD > 0.4 | - | 471,663 | 1,408 | 4,908 |
| Genotype Call Rate Subject Filter (3%) | - | - | 78 | 4,830 |
| Sex discordance filter | - | 483,354 | 12 | 4,818 |

### Pre-imputation filtering

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|----------------------------------------|------------------|-------------------|------------------|--------------------|
| remove 1000G discordant alleles | 5,444 | 477,910 | 0 | - |
| remove monomorphic variants | 10,750 | 467,160 | 0 | - |
| remove individuals missing whole chr | - | 467,160 | 0 | 4,818 |
AFR Summary Stats Tables

### Autosomes

This table includes autosome filtering statistics prior to merging with chrX.

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|-------------------------------------------------|------------------|-------------------|------------------|-------------------|
| STRUCTURE analysis (all chr) | - | 572,219 | 8,113 | 2,063 |
| Partitioning to only autosomes | 14,112 | 558,107 | - | - |
| Remove subjects missing whole autosome data | - | - | 0 | - |
| Remove variants with missing call rate > 3% | 94,597 | 463,510 | - | - |
| Remove variants with HWE p < 0.0001 | 5,052 | 458,458 | - | - |

### chrX

This table includes chrX filtering statistics prior to merging with autosomes.

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|------------------------------------------------|------------------|-------------------|------------------|-------------------|
| Partitioning to only chrX | 558,107 | 14,112 | - | - |
| Remove subjects missing whole chrX data | - | - | 0 | - |
| Remove variants with missing call rate > 3% | 2,627 | 11,485 | - | - |
| Remove variants with HWE p < 0.0001 | 44 | 11,441 | - | - |

### Merged

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|----------------------------------------|------------------|-------------------|------------------|--------------------|
| Merged | - | 469,899 | - | 2,063 |
| Remove Subjects with IBS > 0.9 | - | - | 3 | 2,060 |
| Remove Subjects with IBD > 0.4 | - | - | 32 | 2,028 |
| Genotype Call Rate Subject Filter (3%) | - | - | 54 | 1,974 |
| Sex discordance filter | - | - | 4 | 1,970 |
| Excessive homozygosity filter | - | - | 0 | - |

### Pre-imputation filtering

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|----------------------------------------|------------------|-------------------|------------------|--------------------|
| remove 1000G discordant alleles | 5,384 | 464,515 | - | - |
| remove monomorphic variants | 11,464 | 453,051 | - | - |
| remove individuals missing whole chr | - | 453,051 | - | 1,970 |
AMR Summary Stats Tables

### Autosomes

This table includes autosome filtering statistics prior to merging with chrX.

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|-------------------------------------------------|------------------|-------------------|------------------|-------------------|
| STRUCTURE analysis (all chr) | - | 572,219 | 9,534 | 642 |
| Partitioning to only autosomes | 14,112 | 558,107 | - | - |
| Remove subjects missing whole autosome data | - | - | 0 | - |
| Remove variants with missing call rate > 3% | 84,711 | 473,396 | - | - |
| Remove variants with HWE p < 0.0001 | 1,299 | 472,097 | - | - |

### chrX

This table includes chrX filtering statistics prior to merging with autosomes.

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|------------------------------------------------|------------------|-------------------|------------------|-------------------|
| Partitioning to only chrX | 558,107 | 14,112 | - | - |
| Remove subjects missing whole chrX data | - | - | 0 | - |
| Remove variants with missing call rate > 3% | 2,371 | 11,741 | - | - |
| Remove variants with HWE p < 0.0001 | 10 | 11,731 | - | - |

### Merged

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|----------------------------------------|------------------|-------------------|------------------|--------------------|
| Merged | - | 483,828 | - | 642 |
| Remove Subjects with IBS > 0.9 | - | - | 47 | 595 |
| Remove Subjects with IBD > 0.4 | - | - | 59 | 536 |
| Genotype Call Rate Subject Filter (3%) | - | - | 8 | 528 |
| Sex discordance filter | - | - | 0 | - |
| Excessive homozygosity filter | - | - | 0 | - |

### Pre-imputation filtering

| QC procedure | Variants removed | Variants retained | Subjects removed | Subjects retained |
|----------------------------------------|------------------|-------------------|------------------|--------------------|
| remove 1000G discordant alleles | 5,446 | 478,382 | - | - |
| remove monomorphic variants | 20,635 | 457,747 | - | - |
| remove individuals missing whole chr | - | 457,747 | - | 528 |



## Phenotype & Covariate Distributions
EUR

| | N |
|-----------------|-------|
| HIV case | 2,365 |
| HIV control | 2,398 |
| Alc_Dep case | 1,223 |
| Alc_Dep control | 3,540 |
| Male | 3,307 |
| Female | 1,456 |

![cfar_coga_age_distributions](https://user-images.githubusercontent.com/32715488/77955330-fd1a1b80-729d-11ea-9767-e8adecf115b1.png)

![cfar_coga_hiv_sex_alcohol_distributions](https://user-images.githubusercontent.com/32715488/77955334-fe4b4880-729d-11ea-8701-f95dc769ec7e.png)
AFR N=1969

| | N |
|-----------------|-------|
| HIV case | 1,832 |
| HIV control | 137 |
| Alc_Dep case | 495 |
| Alc_Dep control | 1,474 |
| Male | 1,338 |
| Female | 631 |

![cfar_coga_afr_n1969_age_distribution](https://user-images.githubusercontent.com/32715488/79161996-7475a200-7daa-11ea-909a-1f4813ec1b77.png) ![cfar_coga_afr_n1969_hiv_sex_alcohol_distributions](https://user-images.githubusercontent.com/32715488/79162002-763f6580-7daa-11ea-90ea-9f08d7e94480.png)
AMR N=526

| | N |
|-----------------|-------|
| HIV case | 414 |
| HIV control | 112 |
| Alc_Dep case | 169 |
| Alc_Dep control | 357 |
| Male | 100 |
| Female | 426 |

![cfar_coga_amr_n526_age_distribution](https://user-images.githubusercontent.com/32715488/79162026-82c3be00-7daa-11ea-8116-ccee25666a8a.png) ![cfar_coga_amr_n526_hiv_sex_alcohol_distributions](https://user-images.githubusercontent.com/32715488/79162028-82c3be00-7daa-11ea-8cab-c60206d03bda.png)

## EUR Genotype PCs
PCs 1-10 ![image](https://user-images.githubusercontent.com/32715488/77953897-cfcc6e00-729b-11ea-85e0-6e1289b41f41.png) ![image](https://user-images.githubusercontent.com/32715488/77953912-d6f37c00-729b-11ea-9aad-fe1212d90fe9.png)
Genotype PCs Explaining Phenotypic Variation

```
================ EUR group ================
Top PCs: PC8 PC4 PC3 PC5
PVE: 80.2
```

![image](https://user-images.githubusercontent.com/32715488/77956281-a281bf00-729f-11ea-828a-e98e86a85d82.png) ![image](https://user-images.githubusercontent.com/32715488/77956321-af061780-729f-11ea-94bf-13af6f1973d0.png)
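
One common way to arrive at a "top PCs" list like the one above is to rank the PCs by the percent of phenotypic variance each explains (PVE) and keep adding PCs until a cumulative target is reached. The sketch below illustrates that idea only. The 75% target, the function name, and the per-PC values are assumptions for illustration; this thread reports only the selected PCs and their combined PVE, not the selection rule or the individual values.

```python
from typing import Dict, List, Tuple


def select_top_pcs(pve: Dict[str, float], target: float = 75.0) -> Tuple[List[str], float]:
    """Pick PCs in descending PVE order until the cumulative PVE reaches target."""
    ranked = sorted(pve.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    total = 0.0
    for name, value in ranked:
        if total >= target:
            break
        chosen.append(name)
        total += value
    return chosen, total


# Hypothetical per-PC PVE values (percent); not the values from this analysis.
toy_pve = {"PC1": 5.0, "PC2": 10.0, "PC3": 40.0, "PC4": 30.0, "PC5": 15.0}
pcs, cumulative = select_top_pcs(toy_pve)
print(pcs, cumulative)
```

With the toy values, the function returns PC3, PC4, and PC5 with a cumulative PVE of 85.0.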
jaamarks commented 4 years ago

CFAR_COGA EUR HIV Acquisition GWAS (N=4,763)

PC3, PC4, PC5, PC8

Included as covariates: age, sex, alc_dep, PC8, PC4, PC3, PC5 (PVE ~80%). The GWAS was performed with RVTESTS.

[Manhattan and QQ plots for each filter combination: MAF 1%, 3%, 5% × RSQ 0.30, 0.80, 0.90]

Tables P-value ≤ 0.001

[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.01.rsq_0.30**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471471/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.01.rsq_0.80**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471472/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.01.rsq_0.90**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471473/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01.rsq_0.90.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.03.rsq_0.30**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4549187/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.03.rsq_0.80**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471475/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.03.rsq_0.90**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511970/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.90.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.05.rsq_0.30**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4518409/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.05.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.05.rsq_0.80**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4518411/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.05.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.05.rsq_0.90**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4518413/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.05.rsq_0.90.p_lte_0.001.txt)

### MAF filter applied separately for cases & controls
MAF 1% (applied separately) and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79891209-abf9d500-83ce-11ea-92c2-5034f2159ab3.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79891211-ac926b80-83ce-11ea-9017-c61a9b35d7f5.png)
MAF 1% (applied separately) and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 01_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79901158-64c71080-83dd-11ea-80e4-0c7b4a46a2b1.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 01_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79901161-655fa700-83dd-11ea-9cc6-e3293fd56dc8.png)
MAF 1% (applied separately) and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 01_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79901123-57118b00-83dd-11ea-8a37-167584126423.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 01_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79901125-57aa2180-83dd-11ea-9937-cb9da83e1fe2.png)
MAF 3% (applied separately) and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 03_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79901108-4e20b980-83dd-11ea-97f9-f5efa6a32a92.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 03_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79901112-4eb95000-83dd-11ea-819b-c132f787661f.png)
MAF 3% (applied separately) and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79901085-44975180-83dd-11ea-8622-34cdb56ee3ca.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79901087-452fe800-83dd-11ea-8c89-2daba73711a1.png)
MAF 3% (applied separately) and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 03_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79901048-334e4500-83dd-11ea-833a-4080a61cd966.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 03_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79901049-33e6db80-83dd-11ea-9dac-906eb9f0278f.png)
#### Tables P-value ≤ 0.001
[cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.30.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511404/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.80.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511405/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.90.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511406/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.90.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.30.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511407/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.80.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511408/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.90.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4511409/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.90.p_lte_0.001.txt)
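
To make the "applied separately" idea concrete: a variant is kept only if its MAF clears the threshold in the cases and in the controls, rather than in the pooled sample. The sketch below is an illustration, not the pipeline code. The function names and the allele counts are hypothetical; the group sizes are the EUR HIV case and control counts from the phenotype table.

```python
def maf(alt_allele_count: int, n_subjects: int) -> float:
    """Minor allele frequency from an alternate-allele count in diploid subjects."""
    freq = alt_allele_count / (2 * n_subjects)
    return min(freq, 1 - freq)


def passes_in_both(case_alt: int, n_cases: int,
                   control_alt: int, n_controls: int,
                   maf_min: float = 0.01) -> bool:
    """True only if MAF exceeds maf_min in cases and in controls separately."""
    return maf(case_alt, n_cases) > maf_min and maf(control_alt, n_controls) > maf_min


# Group sizes from the EUR phenotype table; allele counts are made up.
n_cases, n_controls = 2365, 2398
case_alt, control_alt = 100, 30

pooled = maf(case_alt + control_alt, n_cases + n_controls)
print(f"pooled MAF = {pooled:.4f}, passes pooled 1% filter: {pooled > 0.01}")
print("passes separate 1% filter:", passes_in_both(case_alt, n_cases, control_alt, n_controls))
```

In this made-up example the variant passes a pooled 1% filter but fails the separate filter because it is too rare in the controls, which is the kind of variant the stricter filter is meant to drop.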



## Top 10 PCs

Included as covariates: age, sex, alc_dep, PC1–10

Manhattan and QQ plots for each MAF and RSQ filter combination.

MAF 1% and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/78082598-70e12480-7381-11ea-97ff-c67eabbbc6fe.png) ![cfar_coga eur 1000g hiv_acq assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/78082601-7179bb00-7381-11ea-8255-bcfdb819decd.png)
MAF 1% and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 01 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79030775-c16a3600-7b68-11ea-9f9d-0083db1b95d6.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79030776-c16a3600-7b68-11ea-96a3-a78f4e269f1b.png)
MAF 1% and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 01 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79153398-c8c55580-7d9b-11ea-9840-f144e0d56994.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 01 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79153401-ca8f1900-7d9b-11ea-839b-016c91bd8226.png)
MAF 3% and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 03 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79030778-ccbd6180-7b68-11ea-92a7-69c4dd3366fd.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 03 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79030779-cd55f800-7b68-11ea-9f4e-6a9a50d38737.png)
MAF 3% and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/78297640-f7227580-74fd-11ea-8ac4-9fe39dff6c4d.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/78297645-fa1d6600-74fd-11ea-955d-21734a43620d.png)
MAF 3% and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 03 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913956-6cdd7b00-83f2-11ea-9e5c-958fbe9aa5c9.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 03 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913959-6d761180-83f2-11ea-9710-63118c91634a.png)
MAF 5% and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 05 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/80021847-2860e680-84a9-11ea-9b85-c16e1299c8f3.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 05 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/80021849-28f97d00-84a9-11ea-9aa0-6443ee1772d8.png)
MAF 5% and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 05 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/80021877-34e53f00-84a9-11ea-8506-5d2672ab9b76.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 05 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/80021880-357dd580-84a9-11ea-8f41-8c75096ff8d0.png)
MAF 5% and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 05 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/80021911-42022e00-84a9-11ea-9710-51596b259ac5.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 05 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/80021913-42022e00-84a9-11ea-8288-7b00c591ae63.png)

#### Tables P-value ≤ 0.001
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.01.rsq_0.30**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471458/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.01.rsq_0.80**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471459/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.01.rsq_0.90**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471460/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01.rsq_0.90.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.03.rsq_0.30**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471461/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.03.rsq_0.80**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4471463/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.03.rsq_0.90**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512618/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.90.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.05.rsq_0.30**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4518447/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.05.rsq_0.30.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.05.rsq_0.80**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4518449/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.05.rsq_0.80.p_lte_0.001.txt)
[cfar_coga.eur.1000g_p3.hiv_acq.**maf_0.05.rsq_0.90**.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4518450/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.05.rsq_0.90.p_lte_0.001.txt)

### MAF filter applied separately for cases & controls
MAF 1% (applied separately) and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913602-e3c64400-83f1-11ea-89b9-c16d5097a017.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913603-e45eda80-83f1-11ea-96ee-1c52efec2d36.png)
MAF 1% (applied separately) and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 01_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913639-f3de2380-83f1-11ea-98c0-3dfc060a0aa6.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 01_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913641-f476ba00-83f1-11ea-83af-a309c6a8e908.png)
MAF 1% (applied separately) and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 01_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913682-05273000-83f2-11ea-9e5f-dfa26a84ff1b.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 01_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913683-05bfc680-83f2-11ea-8952-8ef9fe132a22.png)
MAF 3% (applied separately) and RSQ 0.30 ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 03_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913707-0fe1c500-83f2-11ea-8d68-049c5babe0ab.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 03_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913708-107a5b80-83f2-11ea-8f4a-00b4e8893c00.png)
MAF 3% (applied separately) and RSQ 0.80 ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913742-1c661d80-83f2-11ea-86a4-9112026c641f.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913743-1cfeb400-83f2-11ea-8420-415e50132cf0.png)
MAF 3% (applied separately) and RSQ 0.90 ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 03_both assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/79913763-24be5880-83f2-11ea-99c5-6ba453f9b2c3.png) ![cfar_coga eur 1000g hiv_acq rsq_0 90 maf_0 03_both assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/79913766-2556ef00-83f2-11ea-9a2b-b55210578390.png)
#### Tables: P-value ≤ 0.001
- [cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.30.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512606/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.30.p_lte_0.001.txt)
- [cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.80.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512607/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.80.p_lte_0.001.txt)
- [cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.90.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512608/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.01_both.rsq_0.90.p_lte_0.001.txt)
- [cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.30.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512609/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.30.p_lte_0.001.txt)
- [cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.80.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512612/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.80.p_lte_0.001.txt)
- [cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.90.p_lte_0.001.txt](https://github.com/RTIInternational/jaamarks_notebooks/files/4512613/cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.90.p_lte_0.001.txt)

jaamarks commented 4 years ago

CFAR_COGA HIV Acquisition GWAS (N=4,373)

Removed age outliers from COGA (24 < age < 86).

EUR Phenotype & Covariate Distributions

![cfar_coga_n4374_age_distribution](https://user-images.githubusercontent.com/32715488/78722685-03d90c00-78f8-11ea-9d3a-c850001c4cd9.png)
![cfar_coga_n4374_hiv_sex_alcohol_distributions](https://user-images.githubusercontent.com/32715488/78722721-105d6480-78f8-11ea-986d-80f1545ed815.png)

```
================ EUR group ================
Top PCs: PC3 PC4 PC1
PVE: 75.44
```

![cfar_coga_n4374_phenotype_variance_explained_by_genotype_pcs_sorted](https://user-images.githubusercontent.com/32715488/78722722-10f5fb00-78f8-11ea-938e-ac0d96e891ea.png)

### top 3 PCs
Included as covariates: age, sex, alc_dep, PC1, PC3, PC4

RSQ 0.30 MAF 1% ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/78722379-59f97f80-78f7-11ea-98a7-9a4ce0ca4950.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/78722382-5a921600-78f7-11ea-83d6-62da29dca9a0.png)
RSQ 0.80 MAF 3% ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/78722442-7d242f00-78f7-11ea-9f96-402df2402bff.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/78722444-7d242f00-78f7-11ea-89c4-c92962540e71.png)

### top 10 PCs
Included as covariates: age, sex, alc_dep, PC1–10

RSQ 0.30 MAF 1% ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/78722538-b197eb00-78f7-11ea-8065-b43e1541354e.png) ![cfar_coga eur 1000g hiv_acq rsq_0 30 maf_0 01 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/78722539-b2308180-78f7-11ea-8dd9-0b70f8518961.png)
RSQ 0.80 MAF 3% ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels manhattan](https://user-images.githubusercontent.com/32715488/78722551-b8266280-78f7-11ea-9a9c-d813414d90d2.png) ![cfar_coga eur 1000g hiv_acq rsq_0 80 maf_0 03 assoc plot snps+indels qq](https://user-images.githubusercontent.com/32715488/78722553-b8bef900-78f7-11ea-9dd0-b2e557c21c7c.png)
jaamarks commented 4 years ago

Details about chr23 & chr6 top hits

```
jmarks@RTI-103356 ~/Projects/hiv/cfar_coga/gwas/0001
$ awk '($2==23 && $NF< 10E-20) {print $NF}' cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.80.p_lte_0.001.txt
$ awk '($2==6 && $NF< 5.61e-8) {print $0}' cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03.rsq_0.80.p_lte_0.001.txt
```


| ID | CHROM | POS | REF | ALT | N_INFORMATIVE | AF | INFORMATIVE_ALT_AC | CALL_RATE | HWE_PVALUE | N_REF | N_HET | N_ALT | U_STAT | SQRT_V_STAT | ALT_EFFSIZE | PVALUE |
|----|-------|-----|-----|-----|---------------|----|--------------------|-----------|------------|-------|-------|-------|--------|-------------|-------------|--------|
| rs76422484:62817386:C:T | 23 | 62817386 | C | T | 4756 | 0.106378:0.158198:0.00995458 | 703.581:680.566:23.015 | 1:1:1 | 4.67019e-20:9.20735e-31:1 | 2578:1446:1132 | 729:705:24 | 0:0:0 | 193.219 | 12.2948 | 1.27823 | 1.18365e-55 |
| rs78475991:62880223:A:C | 23 | 62880223 | A | C | 4756 | 0.10861:0.161415:0.0103525 | 718.344:694.409:23.935 | 1:1:1 | 4.67019e-20:9.20735e-31:1 | 2578:1446:1132 | 729:705:24 | 0:0:0 | 196.814 | 12.6192 | 1.23593 | 7.70042e-55 |
| rs9400531:112656072:G:C | 6 | 112656072 | G | C | 4763 | 0.180376:0.196583:0.164392 | 1718.26:929.839:788.422 | 1:1:1 | 0.407277:0.400121:0.940937 | 3200:1528:1672 | 1398:738:660 | 165:99:66 | 83.2262 | 15.3245 | 0.354397 | 5.60578e-08 |
| rs9384958:116112728:A:G | 6 | 116112728 | A | G | 4763 | 0.20753:0.2334:0.182016 | 1976.93:1103.98:872.949 | 1:1:1 | 0.223407:0.42705:0.588751 | 2977:1381:1596 | 1559:843:716 | 227:141:86 | 86.9972 | 16.0186 | 0.339043 | 5.60383e-08 |
| rs6926556:116118455:T:C | 6 | 116118455 | T | C | 4763 | 0.214763:0.244088:0.185842 | 2045.83:1154.54:891.296 | 1:1:1 | 0.300898:0.575892:0.636439 | 2956:1362:1594 | 1577:859:718 | 230:144:86 | 97.7634 | 16.4618 | 0.360763 | 2.87105e-09 |

Note that the allele frequency for controls (COGA) in chr23 SNPs is near zero. Should we apply the MAF filter individually for cases and controls or overall like we currently do?
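
To make the filtering question concrete, here is a minimal Python sketch of the two options. It assumes the `AF` column is ordered `overall:cases:controls` (consistent with the near-zero third value for the COGA controls noted above); the helper names and the 3% threshold are illustrative, not taken from the actual pipeline.

```python
def parse_af(af_field):
    """Split an 'overall:cases:controls' AF string into three floats (assumed order)."""
    overall, cases, controls = (float(x) for x in af_field.split(":"))
    return overall, cases, controls


def maf(freq):
    """Minor allele frequency for a given ALT allele frequency."""
    return min(freq, 1.0 - freq)


def passes_maf(af_field, threshold=0.03, per_group=False):
    """Apply the MAF filter overall, or separately to cases and controls."""
    overall, cases, controls = parse_af(af_field)
    if per_group:
        return maf(cases) >= threshold and maf(controls) >= threshold
    return maf(overall) >= threshold


# AF values copied from the table above
chr23_af = "0.106378:0.158198:0.00995458"  # rs76422484
chr6_af = "0.180376:0.196583:0.164392"     # rs9400531

for label, af in [("chr23", chr23_af), ("chr6", chr6_af)]:
    print(label, "overall:", passes_maf(af), "per-group:", passes_maf(af, per_group=True))
```

Under these assumptions the chr23 SNP passes the overall filter but fails the per-group filter, while the chr6 SNP passes both.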
jaamarks commented 4 years ago

| chrom | name | McLaren_beta | McLaren_P | CFAR_COGA_beta | CFAR_COGA_P |
|-------|------|--------------|-----------|----------------|-------------|
| 6 | rs12210050:475489:C:T | 0.2140599325811672 | 4.847e-09 | -0.157585 | 0.0159497 |
| 6 | rs41561016:31322611:C:T | -0.41144717978571177 | 9.459e-09 | -0.0396366 | 0.750087 |
| 6 | rs41557415:31323455:A:G | 0.4123386770513366 | 9.424e-09 | -0.0400005 | 0.74785 |
| 6 | rs1140487:31322987:C:T | -0.412109650826833 | 9.457e-09 | -0.0400005 | 0.74785 |
| 6 | rs41543314:31322690:A:G | 0.4028684822608984 | 2.332e-08 | -0.0839558 | 0.507325 |

jaamarks commented 4 years ago

Verifying the coding for both McLaren and CFAR_COGA.

McLaren (original)

| CHR | SNP | BP | A1 | A2 | OR | P |
|-----|------------|----------|----|----|--------|-----------|
| 6 | rs12210050 | 475489 | T | C | 0.8073 | 4.847e-09 |
| 6 | rs41561016 | 31322611 | T | C | 1.509 | 9.459e-09 |
| 6 | rs41557415 | 31323455 | A | G | 0.6621 | 9.424e-09 |
| 6 | rs1140487 | 31322987 | T | C | 1.51 | 9.457e-09 |
| 6 | rs41543314 | 31322690 | A | G | 0.6684 | 2.332e-08 |

McLaren (converted)

| chrom | name | position | REF | ALT | ALT_EFFSIZE | p |
|-------|-------------------------|----------|-----|-----|----------------------|-----------|
| 6 | rs12210050:475489:C:T | 475489 | T | C | 0.2140599325811672 | 4.847e-09 |
| 6 | rs41561016:31322611:C:T | 31322611 | T | C | -0.41144717978571177 | 9.459e-09 |
| 6 | rs41557415:31323455:A:G | 31323455 | A | G | 0.4123386770513366 | 9.424e-09 |
| 6 | rs1140487:31322987:C:T | 31322987 | T | C | -0.412109650826833 | 9.457e-09 |
| 6 | rs41543314:31322690:A:G | 31322690 | A | G | 0.4028684822608984 | 2.332e-08 |
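
As a quick check on the coding, the converted `ALT_EFFSIZE` values appear to equal the negative natural log of the original odds ratios. That is what one would expect if the OR refers to A1 (the usual PLINK convention, assumed here rather than confirmed) and A1 was placed in the REF column. A minimal Python sketch, with a hypothetical helper name:

```python
import math


def or_to_alt_effsize(odds_ratio, or_allele_is_alt):
    """Convert an odds ratio to a log-odds effect size for the ALT allele.

    If the OR was reported for the allele now labelled REF, the sign is flipped.
    """
    beta = math.log(odds_ratio)
    return beta if or_allele_is_alt else -beta


# (SNP, OR, ALT_EFFSIZE) copied from the two McLaren tables above
rows = [
    ("rs12210050", 0.8073, 0.2140599325811672),
    ("rs41561016", 1.509, -0.41144717978571177),
    ("rs41557415", 0.6621, 0.4123386770513366),
]

for snp, odds_ratio, reported in rows:
    # Assumption: the OR is for A1, and A1 sits in the REF column of the converted table
    recomputed = or_to_alt_effsize(odds_ratio, or_allele_is_alt=False)
    print(snp, round(recomputed, 6), round(reported, 6))
```

Note that the REF/ALT assignment in the converted McLaren table is the reverse of the one in the CFAR_COGA results for some of these SNPs, so the effect signs are not directly comparable until the alleles are aligned.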

CFAR_COGA

| ID | CHROM | POS | REF | ALT | ALT_EFFSIZE | PVALUE |
|----|-------|-----|-----|-----|-------------|--------|
| rs12210050:475489:C:T | 6 | 475489 | C | T | -0.157585 | 0.0159497 |
| rs41561016:31322611:C:T | 6 | 31322611 | C | T | -0.0396366 | 0.750087 |
| rs41557415:31323455:A:G | 6 | 31323455 | A | G | -0.0400005 | 0.74785 |
| rs1140487:31322987:C:T | 6 | 31322987 | C | T | -0.0400005 | 0.74785 |
| rs41543314:31322690:A:G | 6 | 31322690 | A | G | -0.0839558 | 0.507325 |

jaamarks commented 4 years ago

GWAS results of top SNPs on chr1 and 19

jmarks@RTI-103356 ~/Projects/hiv/cfar_coga/gwas/0001/maf_both
$ awk '$17 < 1e-15' cfar_coga.eur.1000g_p3.hiv_acq.maf_0.03_both.rsq_0.90.p_lte_0.001.txt

| ID | CHROM | POS | REF | ALT | N_INFORMATIVE | AF | INFORMATIVE_ALT_AC | CALL_RATE | HWE_PVALUE | N_REF | N_HET | N_ALT | U_STAT | SQRT_V_STAT | ALT_EFFSIZE | PVALUE |
|----|-------|-----|-----|-----|---------------|----|--------------------|-----------|------------|-------|-------|-------|--------|-------------|-------------|--------|
| rs10911132:182753673:G:A | 1 | 182753673 | G | A | 4763 | 0.0975693:0.141029:0.0547075 | 929.445:667.068:262.377 | 1:1:1 | 0.75388:0.00376253:0.117543 | 3829:1688:2141 | 886:641:245 | 48:36:12 | 125.07 | 11.5818 | 0.932393 | 3.48702e-27 |
| rs10911133:182753838:G:T | 1 | 182753838 | G | T | 4763 | 0.102479:0.149187:0.0564141 | 976.216:705.654:270.562 | 1:1:1 | 0.587918:0.00149029:0.117543 | 3816:1675:2141 | 899:654:245 | 48:36:12 | 134.85 | 12.0782 | 0.924377 | 6.0628e-29 |
| rs1064257:49993535:C:G | 19 | 49993535 | C | G | 4763 | 0.0874761:0.118845:0.0565384 | 833.297:562.139:271.158 | 1:1:1 | 0.000413457:0.00120084:0.00595059 | 3963:1833:2130 | 783:516:267 | 17:16:1 | 109.352 | 10.96 | 0.910348 | 1.91459e-23 |



Dose info file

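| SNP | REF(0) | ALT(1) | ALT_Frq | MAF | AvgCall | Rsq | Genotyped | LooRsq | EmpR | EmpRsq | Dose0 | Dose1 |
|-----|--------|--------|---------|-----|---------|-----|-----------|--------|------|--------|-------|-------|
| 1:182753838:G:T | G | T | 0.10209 | 0.10209 | 0.99400 | 0.94751 | Genotyped | 0.540 | 0.637 | 0.40571 | 0.45810 | 0.03929 |
| 1:182753673:G:A | G | A | 0.09718 | 0.09718 | 0.99015 | 0.91021 | Imputed | - | - | - | - | - |
| 19:49993535:C:G | C | G | 0.08712 | 0.08712 | 0.99541 | 0.95675 | Genotyped | 0.760 | 0.654 | 0.42775 | 0.50306 | 0.02054 |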


EmpR, EmpRsq

While the LooRsq statistic completely ignores experimental genotypes, EmpR is the correlation between the true genotyped values and the imputed dosages obtained by hiding all known genotypes for the given SNP (see LooDosage). A negative correlation between imputed and experimental genotypes can indicate allele flips. This statistic can also only be provided for genotyped sites. EmpRsq is the square of this correlation.
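
The definition above can be illustrated with a small Python sketch. The genotype and dosage values are made up for illustration, and the imputation software's actual computation may differ in detail (for example, it may work at the haplotype level). The sketch only shows EmpR as a Pearson correlation, EmpRsq as its square, and how an allele flip turns the correlation negative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: experimental genotypes (ALT allele counts) and the
# leave-one-out dosages imputed with those genotypes hidden
genotypes = [0, 1, 2, 0, 1, 0]
loo_dosages = [0.1, 0.9, 1.8, 0.2, 1.1, 0.0]

emp_r = pearson_r(genotypes, loo_dosages)
emp_rsq = emp_r ** 2

# Simulate an allele flip: dosages counted for the other allele
flipped_dosages = [2 - d for d in loo_dosages]
emp_r_flipped = pearson_r(genotypes, flipped_dosages)

print("EmpR:", emp_r, "EmpRsq:", emp_rsq, "EmpR after flip:", emp_r_flipped)
```

In this toy case the flipped dosages give the same EmpRsq but a negative EmpR, which is why the sign of EmpR is the useful flag for allele flips.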