jaamarks / jaamarks_notebooks

Collection of various projects and procedures documented with Jupyter notebooks.

HIV Acquisition Meta-Analysis #12

Open jaamarks opened 4 years ago

jaamarks commented 4 years ago

Performing meta-analyses of HIV acquisition.

The parent issue is GitHub Issue 97.

Analysis Description

hiv-acquisition-gwas-meta-analysis-description.xlsx

jaamarks commented 4 years ago

We recently added WIHS3 to the latest iteration of metas. These can be found on S3 at: s3://rti-hiv/meta_new/{023–027}

jaamarks commented 4 years ago

WIHS3_AA GWAS (N=2,009)

image image
jaamarks commented 4 years ago

023: cross-ancestry (N=12,617)

meta015 + WIHS3_AA:

[Figures: previous vs. current results, two rows of paired plots]
jaamarks commented 4 years ago

024: AFR (N=7,597)

meta017 + WIHS3_AFR:

[Figures: previous vs. current results, two rows of paired plots]
jaamarks commented 4 years ago

025: cross-ancestry (N=12,261)

meta018 + WIHS3_AA:

[Figures: previous vs. current results, two rows of paired plots]
jaamarks commented 4 years ago

026: cross-ancestry (N=26,198)

meta021 + WIHS3_AA:

[Figures: previous vs. current results, two rows of paired plots]
jaamarks commented 4 years ago

027: cross-ancestry (N=25,842)

meta026 - WIHS1_HA:

image

image

jaamarks commented 4 years ago

Rerun using cohort-level data

We reran metas 023–027, this time using the individual cohort-level GWAS summary stats as input instead of the previous meta-analysis results.

023 AFR+AMR+EUR (N=12,617)

~meta015 + WIHS3_AA (see s3://rti-hiv/meta_new/015/results/figures/ for previous results)

image image



024 AFR-specific (N=7,597)

~meta017 + WIHS3_AA (see s3://rti-hiv/meta_new/017/results/figures/ for previous results)

image image



025 AFR+EUR (N=12,261)

~meta018 + WIHS3_AA (see s3://rti-hiv/meta_new/018/results/figures for previous results)

image image



026 AFR+AMR+EUR (N=26,198)

~meta021 + WIHS3_AA (see s3://rti-hiv/meta_new/021/results/v03/figures for previous results)

image image



027 AFR+EUR (N=25,842)

~meta022 + WIHS3_AA (see s3://rti-hiv/meta_new/022/results/v05/figures/ for previous results)

image image



These results have been stored on S3 at: s3://rti-hiv/meta_new/{023–027}/results/v02/

jaamarks commented 4 years ago

HIV Acquisition Meta-Analysis with TOPMed Imputed Data

0028 AFR+AMR+EUR (N=12,617)

1000g_p3 results (meta 0023)

![hiv_acq_cross_1df_meta_023 snps+indels manhattan](https://user-images.githubusercontent.com/32715488/94014445-87d56200-fd79-11ea-8765-b6505d0f2c6a.png)
![hiv_acq_cross_1df_meta_023 snps+indels qq](https://user-images.githubusercontent.com/32715488/94014449-89068f00-fd79-11ea-90ff-718f503afc98.png)

TOPMed results (meta 0028)

No GC

![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/94025448-7e9ec200-fd86-11ea-9324-34ece3f10fef.png)
![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/94025452-7f375880-fd86-11ea-8499-3aae5032a881.png)

GC applied to each cohort

![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/94031557-fcfe6280-fd8c-11ea-882b-9c2427f947a1.png)
![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/94031560-fe2f8f80-fd8c-11ea-8a69-02398518eab1.png)
jaamarks commented 4 years ago

Troubleshooting

* I reran the previous 1000g_p3 meta-023 and received consistent results. The one caveat is that the previous set included WIHS2_EUR, which it shouldn't have, so the results are nearly the same but not identical. The overall goal, however, was to see whether we could replicate the previous results and so confirm that our pipeline was not defective. The replication held up, so the pipeline is not the problem and we can continue troubleshooting. To be most consistent, we should run each 1000g_p3 cohort through the new GWAS pipeline.
- [X] Apply genomic control to all cohorts.
This did not tame the inflation.

| QQ | Manhattan |
|----|-----------|
| ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/94031560-fe2f8f80-fd8c-11ea-8a69-02398518eab1.png) | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/94031557-fcfe6280-fd8c-11ea-882b-9c2427f947a1.png) |
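
For reference, a minimal sketch of what "lambda" and "genomic control" mean in these notes. This is not the pipeline code: the function names are hypothetical, and the rule of leaving the statistics untouched when lambda is at or below 1 is an assumption about the convention in use. Lambda is the median observed 1-df chi-square statistic divided by its expected median (about 0.455), and GC rescales the standard errors by sqrt(lambda).

```python
from statistics import NormalDist, median

# Expected median of a 1-df chi-square: the squared 75th percentile of N(0, 1).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # about 0.4549


def pvalue_to_chi2(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile, stable for small p
    return z * z


def genomic_lambda(pvalues):
    """Genomic inflation factor: median observed chi-square / expected median."""
    return median([pvalue_to_chi2(p) for p in pvalues]) / CHI2_1DF_MEDIAN


def apply_gc(standard_errors, lam):
    """Inflate standard errors by sqrt(lambda); no change if lambda <= 1 (assumed)."""
    scale = max(lam, 1.0) ** 0.5
    return [se * scale for se in standard_errors]
```

With well-calibrated (uniform) p-values `genomic_lambda` returns a value near 1; the values around 1.2 reported in this thread mean the median test statistic is roughly 20% larger than expected under the null.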

- [x] Confirm correct GWAS results were being used.
I redownloaded each set of TOPMed GWAS results from S3 to double-check that we were using the correct results. **This checks out.**

We also confirmed that the GWAS results in the HIV bucket were correctly copied from the associated Cromwell output. When each GWAS is complete, the results are copied from the Cromwell bucket to the appropriate project bucket in the S3 organizational structure, so we verified that the copies in the HIV bucket match. **This checks out.**

- [X] Verify meta-analysis pipeline is correct by replicating old results
We needed to verify that the meta-analysis pipeline (bash script) we have been using is correct, so we attempted to replicate the previous 1000g_p3 HIV acquisition GWAS meta-analysis results. The results were consistent; therefore the pipeline is correct.

- [x] Try stricter thresholds (rsq>0.8 and maf>0.05)
During our lab meeting on Wednesday (9/23/2020) we discussed applying a stricter rsq threshold. It was suggested that, since we haven't had much experience with TOPMed imputed data, the filters we apply to 1000g_p3 data might not be directly applicable. We therefore experimented with stricter thresholds, in particular the GWAS results with rsq > 0.8 applied. Lambda stayed between 1.20 and 1.243 across all six combinations:

* rsq 0.3 & MAF 0.01, without GC: **lambda=1.243**
* rsq 0.3 & MAF 0.01, with GC: **lambda=1.203**
* rsq 0.8 & MAF 0.01, without GC: **lambda=1.242**
* rsq 0.8 & MAF 0.01, with GC: **lambda=1.204**
* rsq 0.8 & MAF 0.05, without GC: **lambda=1.238**
* rsq 0.8 & MAF 0.05, with GC: **lambda=1.20**
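
The threshold logic above can be sketched as follows. This is only an illustration: the record layout and the field names `alt_freq` and `rsq` are assumptions rather than the actual summary-stats columns, and the three variants are made up.

```python
def passes_filters(variant, min_rsq=0.8, min_maf=0.05):
    """Keep a variant only if imputation quality and MAF both exceed the thresholds."""
    freq = variant["alt_freq"]
    maf = min(freq, 1.0 - freq)  # minor allele frequency from the ALT frequency
    return variant["rsq"] > min_rsq and maf > min_maf


# Made-up example records (hypothetical values).
variants = [
    {"id": "v1", "alt_freq": 0.30, "rsq": 0.95},  # passes both filters
    {"id": "v2", "alt_freq": 0.97, "rsq": 0.99},  # MAF is about 0.03, dropped
    {"id": "v3", "alt_freq": 0.20, "rsq": 0.60},  # rsq too low, dropped
]
kept = [v["id"] for v in variants if passes_filters(v)]
```

Calling `passes_filters(v, min_rsq=0.3, min_maf=0.01)` reproduces the looser filter pair from the list above.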

- [x] Apply genomic control twice
> We recommend applying genomic control correction to all input files that include genomewide data and, in addition, to the meta-analysis results. ([METAL documentation](https://genome.sph.umich.edu/wiki/METAL_Documentation#Genomic_Control_Correction)).

![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95215329-dd206300-07be-11eb-9e47-bd578339d018.png)
![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95215335-ddb8f980-07be-11eb-9283-abaa0ae2e7b1.png)
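
As a rough illustration of "GC twice", here is a sketch that corrects each cohort's Z-scores, combines them, and then corrects the combined statistics again. It assumes a sample-size-weighted Z-score scheme; the thread does not say which weighting scheme our runs used, and all function names here are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # expected median, 1-df chi-square


def gc_lambda(zscores):
    """Genomic inflation factor computed from Z-scores."""
    return median([z * z for z in zscores]) / CHI2_1DF_MEDIAN


def gc_correct(zscores):
    """Divide Z-scores by sqrt(lambda); leave them unchanged if lambda <= 1."""
    lam = gc_lambda(zscores)
    if lam <= 1.0:
        return list(zscores)
    return [z / sqrt(lam) for z in zscores]


def sample_size_meta(cohort_z, cohort_n):
    """Combine per-cohort Z-scores per variant with weights sqrt(N)."""
    weights = [sqrt(n) for n in cohort_n]
    denom = sqrt(sum(w * w for w in weights))
    return [sum(w * z for w, z in zip(weights, zs)) / denom for zs in zip(*cohort_z)]


def meta_with_double_gc(cohort_z, cohort_n):
    corrected = [gc_correct(zs) for zs in cohort_z]  # first pass: each cohort
    meta_z = sample_size_meta(corrected, cohort_n)
    return gc_correct(meta_z)                        # second pass: meta results
```

Each cohort having lambda near 1 after the first pass does not guarantee that the combined statistics do, which is why the second pass exists.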

- [x] Inspect gc parameter for each chromosome across each cohort
![image](https://user-images.githubusercontent.com/32715488/95206947-4d29eb80-07b5-11eb-90ed-3cce3be5dedd.png)

| | chr1 | chr2 | chr3 | chr4 | chr5 | chr6 | chr7 | chr8 | chr9 | chr10 | chr11 | chr12 | chr13 | chr14 | chr15 | chr16 | chr17 | chr18 | chr19 | chr20 | chr21 | chr22 | chr23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **UHS1–4 AFR** | 1.071 | 1.023 | 1.048 | 1.023 | 0.995 | 1.002 | 1.041 | 1.047 | 1.043 | 1.071 | 1.045 | 0.974 | 0.97 | 1.002 | 1.033 | 1.025 | 1.06 | 1.054 | 1.173 | 1.031 | 1.012 | 1.019 | 1.085 |
| **UHS1–4 EUR** | 1.038 | 1.108 | 1.039 | 1.021 | 1.094 | 1.165 | 1.089 | 1.04 | 1.084 | 1.054 | 1.09 | 1.033 | 1.072 | 1.056 | 1.093 | 1.017 | 1.011 | 1.099 | 1.151 | 1.186 | 1.034 | 1.108 | 1.02 |
| **VIDUS EUR** | 1.068 | 0.97 | 1.028 | 1.026 | 0.962 | 1.027 | 1.028 | 1.009 | 0.963 | 0.983 | 1.083 | 1.133 | 1.054 | 1.048 | 1.126 | 1.026 | 1.107 | 1.001 | 0.862 | 1.023 | 1.14 | 0.938 | 0.952 |
| **WIHS1 AFR** | 1.002 | 0.943 | 1.032 | 0.987 | 0.958 | 0.988 | 0.964 | 1.007 | 0.981 | 1.009 | 0.992 | 0.975 | 1.078 | 0.975 | 1.078 | 0.948 | 0.979 | 0.933 | 0.985 | 0.98 | 1 | 0.973 | 1.006 |
| **WIHS1 AMR** | 1.039 | 1.065 | 1.042 | 1.021 | 1.026 | 1.038 | 1.014 | 0.994 | 1.005 | 0.986 | 1.049 | 1.029 | 1.046 | 1.015 | 0.989 | 1.078 | 1.073 | 1.024 | 0.981 | 0.998 | 0.976 | 1.112 | 1.201 |
| **WIHS1 EUR** | 0.975 | 1.047 | 1.042 | 1.081 | 0.963 | 1.03 | 1.071 | 0.95 | 0.986 | 0.974 | 0.989 | 1.014 | 1.127 | 0.962 | 1.032 | 0.973 | 1.039 | 0.999 | 0.943 | 1.029 | 0.881 | 1.036 | 1 |
| **WIHS2 AFR** | 1.017 | 1.033 | 1.031 | 1.001 | 1.045 | 0.965 | 0.99 | 1.031 | 0.977 | 1.038 | 1.028 | 1.072 | 1.116 | 1.041 | 1.006 | 0.983 | 0.978 | 1.026 | 0.985 | 0.978 | 0.924 | 1.022 | 1.115 |
| **WIHS3 AFR** | 1.012 | 1.05 | 0.981 | 0.996 | 1.02 | 0.97 | 0.969 | 1.024 | 0.969 | 1.032 | 1.066 | 1.04 | 1.116 | 1.001 | 0.979 | 0.98 | 0.978 | 1.007 | 0.999 | 1.005 | 0.983 | 0.999 | 1.117 |

**After GC was applied to each cohort/chromosome.**

![image](https://user-images.githubusercontent.com/32715488/95214695-28864180-07be-11eb-9e8f-baf7e95a2cdc.png)

| | chr1 | chr2 | chr3 | chr4 | chr5 | chr6 | chr7 | chr8 | chr9 | chr10 | chr11 | chr12 | chr13 | chr14 | chr15 | chr16 | chr17 | chr18 | chr19 | chr20 | chr21 | chr22 | chr23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GC | 1.193 | 1.215 | 1.22 | 1.224 | 1.187 | 1.171 | 1.198 | 1.194 | 1.176 | 1.167 | 1.131 | 1.204 | 1.183 | 1.193 | 1.232 | 1.189 | 1.234 | 1.19 | 1.329 | 1.21 | 1.16 | 1.214 | 1.272 |
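
A per-chromosome breakdown like the tables above can be computed by grouping the test statistics by chromosome before taking the median. The sketch below assumes each record is a `(chrom, beta, se)` tuple, which is a hypothetical layout and not the actual file format.

```python
from collections import defaultdict
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # expected median, 1-df chi-square


def lambda_by_chromosome(rows):
    """rows: iterable of (chrom, beta, se) tuples. Returns {chrom: lambda}."""
    chi2_by_chrom = defaultdict(list)
    for chrom, beta, se in rows:
        chi2_by_chrom[chrom].append((beta / se) ** 2)  # Wald chi-square, 1 df
    return {chrom: median(values) / CHI2_1DF_MEDIAN
            for chrom, values in chi2_by_chrom.items()}
```

Chromosomes with few variants give noisier estimates, so some spread between chromosomes is expected even when nothing is wrong.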

- [x] QQ plot for each chromosome
No GC

| | chr1 | chr2 | chr3 | chr4 | chr5 | chr6 | chr7 | chr8 | chr9 | chr10 | chr11 | chr12 | chr13 | chr14 | chr15 | chr16 | chr17 | chr18 | chr19 | chr20 | chr21 | chr22 | chr23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **lambda** | 1.229 | 1.259 | 1.255 | 1.238 | 1.217 | 1.198 | 1.227 | 1.229 | 1.198 | 1.211 | 1.18 | 1.249 | 1.277 | 1.216 | 1.278 | 1.201 | 1.269 | 1.225 | 1.392 | 1.248 | 1.182 | 1.243 | 1.373 |

![image](https://user-images.githubusercontent.com/32715488/95224450-aea78580-07c8-11eb-9c6b-b5e93a2e0d77.png)
qq plots ![chr1_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221671-b4e83280-07c5-11eb-8046-a92cd3f54bca.png) ![chr2_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221674-b580c900-07c5-11eb-96a8-e8c276d93401.png) ![chr3_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221678-b580c900-07c5-11eb-8089-4a7d68209a94.png) ![chr4_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221679-b580c900-07c5-11eb-8208-cdf41758b4f4.png) ![chr5_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221680-b6195f80-07c5-11eb-99c5-657424a42f57.png) ![chr6_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221682-b6195f80-07c5-11eb-98cb-c5104231e517.png) ![chr7_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221683-b6195f80-07c5-11eb-9a48-16858ed68452.png) ![chr8_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221686-b6b1f600-07c5-11eb-95f3-feb3f00067b4.png) ![chr9_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221690-b6b1f600-07c5-11eb-8a1b-485253489ced.png) ![chr10_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221691-b74a8c80-07c5-11eb-8fbf-c842f8a70fcc.png) ![chr11_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221692-b74a8c80-07c5-11eb-8124-2f5b50a9b4a8.png) ![chr12_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221694-b74a8c80-07c5-11eb-884e-9372cf1d8e25.png) ![chr13_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221697-b74a8c80-07c5-11eb-9aec-a38924595604.png) ![chr14_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221700-b7e32300-07c5-11eb-9aba-d0f6c473fd72.png) ![chr15_gwas_plot snps+indels 
qq](https://user-images.githubusercontent.com/32715488/95221701-b7e32300-07c5-11eb-846c-16befcd11aab.png) ![chr16_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221703-b7e32300-07c5-11eb-8144-025348ea32b6.png) ![chr17_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221705-b87bb980-07c5-11eb-8b9a-797acb6472c0.png) ![chr18_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221707-b87bb980-07c5-11eb-9048-39d28c1ce449.png) ![chr19_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221708-b87bb980-07c5-11eb-8862-935146a8d315.png) ![chr20_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221709-b9145000-07c5-11eb-909f-1efe71265897.png) ![chr21_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221712-b9145000-07c5-11eb-8c43-d1c4a76afc87.png) ![chr22_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221713-b9145000-07c5-11eb-828f-8da389d0fec7.png) ![chr23_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221716-b9ace680-07c5-11eb-9631-8960b34c34c6.png)

GC applied to each cohort

| | chr1 | chr2 | chr3 | chr4 | chr5 | chr6 | chr7 | chr8 | chr9 | chr10 | chr11 | chr12 | chr13 | chr14 | chr15 | chr16 | chr17 | chr18 | chr19 | chr20 | chr21 | chr22 | chr23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **lambda** | 1.194 | 1.214 | 1.22 | 1.223 | 1.187 | 1.171 | 1.198 | 1.195 | 1.176 | 1.167 | 1.131 | 1.204 | 1.183 | 1.194 | 1.232 | 1.189 | 1.235 | 1.189 | 1.329 | 1.21 | 1.16 | 1.214 | 1.272 |

![image](https://user-images.githubusercontent.com/32715488/95224539-c67f0980-07c8-11eb-8512-ac79ee1d64c2.png)
qq plots ![chr1_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223671-db0ed200-07c7-11eb-924a-09b2961333e4.png) ![chr2_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223677-dba76880-07c7-11eb-9c80-3016ef1eb8b8.png) ![chr3_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223679-dc3fff00-07c7-11eb-81ed-fe2f5d7ad36a.png) ![chr4_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223680-dc3fff00-07c7-11eb-9db1-086a7f629c70.png) ![chr5_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223681-dc3fff00-07c7-11eb-9b2e-1a5c28f4a0f9.png) ![chr6_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223682-dcd89580-07c7-11eb-9b16-7ce77e081567.png) ![chr7_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223684-dcd89580-07c7-11eb-9f76-fd2a049fdc9a.png) ![chr8_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223687-dd712c00-07c7-11eb-85b2-8f1a0ae30ad0.png) ![chr9_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223688-de09c280-07c7-11eb-8905-d7804fee2826.png) ![chr10_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223689-de09c280-07c7-11eb-9aec-34c5665d0055.png) ![chr11_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223690-de09c280-07c7-11eb-9df5-18b96d2e5445.png) ![chr12_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223691-dea25900-07c7-11eb-9897-d4eae461fc85.png) ![chr13_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223694-dea25900-07c7-11eb-8ad7-b1e3a53f0fa6.png) ![chr14_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223697-df3aef80-07c7-11eb-9fd9-4d3df4a7815c.png) ![chr15_gwas_plot snps+indels 
qq](https://user-images.githubusercontent.com/32715488/95223698-df3aef80-07c7-11eb-912f-f6fdfe33c3ff.png) ![chr16_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223699-dfd38600-07c7-11eb-8211-646b611e0a66.png) ![chr17_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223701-dfd38600-07c7-11eb-911e-9695909e8bc8.png) ![chr18_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223704-e06c1c80-07c7-11eb-8a16-b7f87f6aefa0.png) ![chr19_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223705-e06c1c80-07c7-11eb-9604-7891cbe7dc5a.png) ![chr20_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223708-e104b300-07c7-11eb-9f84-56bd9f0e9ad4.png) ![chr21_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223709-e104b300-07c7-11eb-8e63-d5cc29cab49d.png) ![chr22_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223712-e104b300-07c7-11eb-8b9b-f8facadc3c04.png) ![chr23_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95223716-e19d4980-07c7-11eb-8155-41d9ba2ae333.png)

GC applied twice

GC was applied to each cohort and then again to the meta-analysis results.
qq plots ![chr1_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220971-00e6a780-07c5-11eb-82cf-799177c616c3.png) ![chr2_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220974-017f3e00-07c5-11eb-9afe-ef1c2054cef7.png) ![chr3_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220975-0217d480-07c5-11eb-8681-abbb2b11d4cc.png) ![chr4_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220976-0217d480-07c5-11eb-8d06-4fc261f79c0c.png) ![chr5_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220977-0217d480-07c5-11eb-9d72-058d55398316.png) ![chr6_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220978-02b06b00-07c5-11eb-8d50-ab6c600f703c.png) ![chr7_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220979-02b06b00-07c5-11eb-8860-7111d5b31a97.png) ![chr8_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220981-02b06b00-07c5-11eb-84b2-e7859b4f82d8.png) ![chr9_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220983-02b06b00-07c5-11eb-91c0-1355fdcff81e.png) ![chr10_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220985-03490180-07c5-11eb-8920-ba2666be25fc.png) ![chr11_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220986-03490180-07c5-11eb-9a02-5c898c17f1da.png) ![chr12_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220989-03490180-07c5-11eb-99d9-0d089033c28e.png) ![chr13_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220991-03e19800-07c5-11eb-9c32-aecb4f5c7a0e.png) ![chr14_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220992-03e19800-07c5-11eb-87ff-e1399edbdc95.png) ![chr15_gwas_plot snps+indels 
qq](https://user-images.githubusercontent.com/32715488/95220994-047a2e80-07c5-11eb-831b-31a56b33bb0b.png) ![chr16_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220995-047a2e80-07c5-11eb-9082-c321fd8ec4e6.png) ![chr17_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220997-047a2e80-07c5-11eb-958a-faa5144f65fa.png) ![chr18_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220998-047a2e80-07c5-11eb-8855-c372ca7fed84.png) ![chr19_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95220999-0512c500-07c5-11eb-920b-48e9cb0425eb.png) ![chr20_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221001-0512c500-07c5-11eb-8dd3-716aca3e2b7e.png) ![chr21_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221004-0512c500-07c5-11eb-9506-2705fdbe5871.png) ![chr22_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221006-05ab5b80-07c5-11eb-9ffd-ef215f6e725c.png) ![chr23_gwas_plot snps+indels qq](https://user-images.githubusercontent.com/32715488/95221009-05ab5b80-07c5-11eb-92af-d856940bd0f4.png)


- [x] Perform meta on subset of cohorts
Investigate whether any one cohort is disproportionately contributing to the inflation, using the leave-one-out approach that @danahancock suggested during our lab meeting. Each column gives lambda for the meta-analysis with that cohort left out.

| | uhs_afr | uhs_eur | vidus_eur | wihs1_afr | wihs1_amr | wihs1_eur | wihs2_afr | wihs3_afr |
|---|---|---|---|---|---|---|---|---|
| **lambda** | 1.32 | 1.244 | 1.25 | 1.235 | 1.297 | 1.239 | 1.053 | 1.056 |

![image](https://user-images.githubusercontent.com/32715488/95248302-19ff5080-07e5-11eb-89d7-dda7e241fade.png)
UHS_AFR gwas plots
Leave UHS1–4 AFR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95246667-cee43e00-07e2-11eb-8dcc-f0c9801930f0.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95246669-cf7cd480-07e2-11eb-9589-5e798c60cb57.png)
UHS_EUR qq-plot
Leave UHS1–4 EUR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95246773-f6d3a180-07e2-11eb-9d6c-e1804d26998c.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95246777-f76c3800-07e2-11eb-934a-dfae2e0254c4.png)
VIDUS_EUR qq-plot
Leave VIDUS EUR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95246820-0ce16200-07e3-11eb-8662-a1bcf896eb56.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95246821-0d79f880-07e3-11eb-8804-6f0bde4a51e3.png)
WIHS1_AFR qq-plot
Leave WIHS1 AFR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95246873-1e2a6e80-07e3-11eb-941c-6b61b5d1ccc6.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95246874-1ec30500-07e3-11eb-9007-8605f310be65.png)
WIHS1_AMR qq-plot
Leave WIHS1 AMR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95246944-35695c00-07e3-11eb-8638-bc7c58065c1d.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95246947-369a8900-07e3-11eb-80a1-9f8065c8e41b.png)
WIHS1_EUR qq-plot
Leave WIHS1 EUR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95247036-57fb7500-07e3-11eb-9603-18bb8544297a.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95247037-58940b80-07e3-11eb-8818-ba0b8a0996d3.png)
WIHS2_AFR qq-plot
Leave WIHS2 AFR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95247084-6b0e4500-07e3-11eb-816d-86e75ace9327.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95247087-6c3f7200-07e3-11eb-85df-366aacfbb53a.png)
WIHS3_AFR qq-plot
Leave WIHS3 AFR out.
![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/95247972-96456400-07e4-11eb-8ba2-37578e5bdee7.png) ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/95247975-96ddfa80-07e4-11eb-8a0d-398028270ed2.png)
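
The leave-one-out procedure can be sketched as below. This is not the bash pipeline we ran: it assumes a fixed-effect inverse-variance-weighted combination and per-cohort lists of `(beta, se)` pairs with variants already aligned, both of which are assumptions made for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # expected median, 1-df chi-square


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted estimate for a single variant."""
    weights = [1.0 / (se * se) for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return beta, sqrt(1.0 / sum(weights))


def leave_one_out_lambdas(cohorts):
    """cohorts: {name: [(beta, se), ...]}, variants in the same order everywhere.

    Returns {left_out_cohort: lambda of the meta-analysis of the other cohorts}.
    """
    result = {}
    for left_out in cohorts:
        kept = [stats for name, stats in cohorts.items() if name != left_out]
        chi2 = []
        for per_variant in zip(*kept):  # one (beta, se) pair per remaining cohort
            beta, se = ivw_meta([b for b, _ in per_variant],
                                [s for _, s in per_variant])
            chi2.append((beta / se) ** 2)
        result[left_out] = median(chi2) / CHI2_1DF_MEDIAN
    return result
```

A cohort whose removal drops lambda sharply, as happens here when either WIHS2 AFR or WIHS3 AFR is left out, is the one to look at first.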

- [x] WIHS2&3 results check
Recreate QQ and Manhattan plots to see if we can replicate what came out of the workflow.

| | workflow | meta |
|---|---|---|
| WIHS2 | ![image](https://user-images.githubusercontent.com/32715488/95380895-66b25c80-08b5-11eb-95cb-45543aadade1.png) | ![hiv_acquisition_gwas_meta_afr snps+indels qq](https://user-images.githubusercontent.com/32715488/95380943-7631a580-08b5-11eb-922b-3c09daabd9fc.png) |
| WIHS3 | ![image](https://user-images.githubusercontent.com/32715488/95381002-8b0e3900-08b5-11eb-92c8-891c71bc7c22.png) | ![hiv_acquisition_gwas_meta_afr snps+indels qq](https://user-images.githubusercontent.com/32715488/95381036-95303780-08b5-11eb-9442-c23a4a593b60.png) |

**This checks out.**
jaamarks commented 4 years ago

TL;DR

(1) For the 1000g_p3 GWAS, all the WIHS3 subjects were also included within WIHS2.

(2) Including WIHS3 in the 1000g_p3 meta did inflate the results some, but it was not as pronounced as this current TOPMed meta.

(3) ProbABEL was used to run the 1000g_p3 GWAS for both WIHS2 and WIHS3 (as opposed to RVTEST which we used for the TOPMed GWAS). The individual GWAS results were deflated for both when using ProbABEL. This could explain why adding WIHS3 to the 1000g_p3 meta did not inflate the results as much as it did in the TOPMed meta. The inflation is most evident in the AFR-specific results (go figure).



All of WIHS3 was in WIHS2 for 1000g_p3

When comparing the phenotype files for WIHS2 and WIHS3 from the original 1000g_p3 results, we see that every subject in WIHS3 was within WIHS2.

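command line

```
# Download the WIHS3 and WIHS2 phenotype files from the original 1000g_p3 analyses.
aws s3 cp s3://rti-hiv/gwas/wihs3/data/acquisition/0001/archive/phenotype/0001/final/wihs3_aa_hiv_age_sex_PC3+PC8+PC2.txt .
aws s3 cp "s3://rti-hiv/gwas/wihs2/data/acquisition/archive/pheno/WIHS2 HIV status GWAS baseline.csv" .

# Take the second "_"-delimited field of each WIHS3 ID (header skipped)
# and look it up in the WIHS2 phenotype file.
cut -d" " -f1 wihs3_aa_hiv_age_sex_PC3+PC8+PC2.txt | tail -n +2 |\
    awk -F"_" '{print $2}' | xargs -I{} grep {} WIHS2\ HIV\ status\ GWAS\ baseline.csv
```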
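
One caveat with the `grep` lookup is that it matches substrings, so an ID such as `100` would also hit a line containing `1001`. A set-based comparison avoids this by requiring exact matches. The sketch below is an alternative check and not what was run; the helper names are hypothetical, and how the IDs are read from the two files is left out because it depends on the file layout.

```python
def second_field(sample_id):
    """Mimic awk -F"_" '{print $2}': the second "_"-delimited field, or "" if absent."""
    parts = sample_id.split("_")
    return parts[1] if len(parts) > 1 else ""


def ids_missing_from(reference_ids, query_ids):
    """Return the query IDs with no exact match among the reference IDs."""
    return sorted(set(query_ids) - set(reference_ids))
```

An empty list from `ids_missing_from(wihs2_ids, wihs3_ids)` would confirm that every WIHS3 subject is also present in WIHS2.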


## Inflation observed when adding WIHS3 to 1000g_p3 meta

We do observe some inflation when adding WIHS3 to the 1000g_p3 results. The lambda values do not increase as markedly as with the TOPMed imputed results, but visual inspection shows a clear shift above the line y=x. The most appreciable inflation is in the AFR-specific meta-analysis. The plots for each meta, without and with WIHS3, are below.

It should also be noted that the TOPMed imputed data has more coverage. In particular, comparing the final SNP counts of the cross-ancestry meta-analysis (AFR+AMR+EUR), the TOPMed results have over a million more observations (1,191,745 more):

* 1000g_p3: N SNPs = 16,841,575
* TOPMed: N SNPs = 18,033,320

AFR+AMR+EUR

| | without WIHS3 | with WIHS3 |
|---|---|---|
| manhattan | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96202170-863c2b80-0f2c-11eb-827b-20c9a9aca945.png) | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96202186-92c08400-0f2c-11eb-9dbd-5abe2cc4bb7f.png) |
| qq | ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/96202171-86d4c200-0f2c-11eb-8152-2829a6bd5be7.png) | ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/96202187-93591a80-0f2c-11eb-848b-9beedf7b0060.png) |

AFR+EUR

| | without WIHS3 | with WIHS3 |
|---|---|---|
| manhattan | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96263322-dd73e780-0f90-11eb-8937-9ce3010bcfc7.png) | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96263350-e795e600-0f90-11eb-8613-5be3bf855717.png) |
| qq | ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/96263326-de0c7e00-0f90-11eb-807e-972123aec6ef.png) | ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/96263353-e8c71300-0f90-11eb-8a0d-9b2424c63d84.png) |

AFR-specific

| | without WIHS3 | with WIHS3 |
|---|---|---|
| manhattan | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96202563-90aaf500-0f2d-11eb-8e7c-ca0dc79062bd.png) | ![hiv_acquisition_gwas_meta_cross snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96202535-7ec95200-0f2d-11eb-95b6-5530303c1fa5.png) |
| qq | ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/96202564-91438b80-0f2d-11eb-87af-35cf68ddb4d3.png) | ![hiv_acquisition_gwas_meta_cross snps+indels qq](https://user-images.githubusercontent.com/32715488/96202538-7f61e880-0f2d-11eb-8ead-2123b972d374.png) |


## ProbABEL GWAS results were deflated

The WIHS2 and WIHS3 1000g_p3 GWAS were both run using the ProbABEL GWAS software. This is apparent from the summary stats headers and also the file names (e.g. `palogist` in the name, which is a ProbABEL term for logistic regression). The lambda values for WIHS2 and WIHS3 were 1.017 and 1.01, respectively. Though these values do not raise any red flags, the Manhattan and QQ plots below show some striking similarities. For example, the points in the QQ plots for both WIHS2 and WIHS3 are systematically shifted below the line y=x. This deflation could explain why the 1000g_p3 metas were not as markedly inflated as the TOPMed meta-analysis results.
1000g_p3 ProbABEL GWAS plots
**plot locations:**

- WIHS3 (N=729) `s3://rti-hiv/gwas/wihs3/results/acquisition/0002/afr/archive/1df/aa/final/wihs3.aa.1df.1000G.hiv_acq.maf_gt_0.01.rsq_gt_0.3.assoc.plot.all_chr.snps+indels.*.png`
- WIHS2 (N=844) `s3://rti-hiv/gwas/wihs2/results/archive/wihs2.aa.palogist.status~snp+ageatbl+EV.plots.snps+indels.*.png.gz`

| | QQ | Manhattan |
|---|---|---|
| WIHS2 | ![wihs2 aa palogist status~snp+ageatbl+EV plots snps+indels qq](https://user-images.githubusercontent.com/32715488/96029732-1002d080-0e29-11eb-95a6-8ee62d1a874f.png) | ![wihs2 aa palogist status~snp+ageatbl+EV plots snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96029728-0ed1a380-0e29-11eb-8b8b-852a3e8fbf7e.png) |
| WIHS3 | ![wihs3 aa 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels qq](https://user-images.githubusercontent.com/32715488/96029926-50fae500-0e29-11eb-9f14-35e1735e9a4c.png) | ![wihs3 aa 1df 1000G hiv_acq maf_gt_0 01 rsq_gt_0 3 assoc plot all_chr snps+indels manhattan](https://user-images.githubusercontent.com/32715488/96029921-4fc9b800-0e29-11eb-834c-c0315fb153a6.png) |