Atkinson-Lab / Tractor

Scripts for implementing the Tractor pipeline
MIT License

Multi-ancestry meta-analysis #38

Closed sarahcolbert closed 1 month ago

sarahcolbert commented 1 month ago

Hi!

I've got a question that isn't so much related to a bug/issue with the software, but about the best way to implement the method in a cohort that will be included as part of a larger multi-ancestry meta-analysis.

The methods section of the paper mentions that, as an alternative to the joint model, separate GWAS can be run for each ancestry using the deconvolved tracts and "Results from the different ancestry runs could then be meta-analyzed to increase the power by incorporating summary statistics from both populations, although we recommend preferentially using the joint-analysis method described in this manuscript to avoid any potential bias from combining multiple ancestral portions of the genome of the same individuals."

So, I'm wondering what would be the best way to include a cohort with 3-way admixed individuals in both a) a multi-ancestry meta-analysis and b) ancestry-specific meta-analyses. For b), it seems that the ancestry0-specific results from the joint model can be meta-analyzed with other ancestry0 cohorts without a problem (and likewise for ancestries 1 and 2). For a), though, I'm not sure whether there is a way to include the cohort's ancestry0, 1, and 2 results (from the separate GWAS, not the joint model) alongside each other in a larger meta-analysis. In the paper, results from the deconvolved-segment meta-analysis are presented at some points, but the statement quoted above left me unsure whether this approach is recommended.

Thanks in advance for any insight you can provide! -Sarah

eatkinson commented 1 month ago

Sorry I missed this post.

The short answer is that either meta-analysis strategy should be alright. For the PGC-PTSD group, for example, we typically wanted to report ancestry-specific GWAS sumstats, hence the suggestion to meta-analyze just the sumstats for one ancestry (e.g. AFR) with other homogeneous cohorts of that ancestry. So yes, b) is what we primarily described in the paper.

If you just want a multi-ancestry meta-analysis, though, you could meta-analyze all the ancestry-specific terms together. We benchmarked this in the Tractor manuscript as well, and found that it also reduced the size of the credible set for GWAS peaks prior to fine-mapping, thanks to the incorporation of multiple disruptions of LD, and did not add inflation. The one odd thing about this is exactly what you mention: multiple components from the same individuals are included. While we did not find this to cause problems in our preliminary tests, we cannot rule out that it could, so we noted that caveat.

Hope that helps! Elizabeth
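
For readers who want to see what "meta-analyze all the ancestry-specific terms together" could look like in practice, here is a minimal sketch of a generic fixed-effect, inverse-variance-weighted meta-analysis for a single variant. It is not code from the Tractor repository and is not necessarily the procedure used in the manuscript. The function name `ivw_meta`, the labels `anc0`/`anc1`/`anc2`, and the effect sizes and standard errors are all hypothetical. Note that this weighting treats the inputs as independent, which is exactly the assumption the caveat above is about when the three estimates come from the same individuals.

```python
import math


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    estimates: list of (beta, se) tuples, one per ancestry-specific GWAS.
    Returns the pooled beta, its standard error, the z-score, and a
    two-sided p-value from the normal approximation.
    Assumes the input estimates are independent of each other.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, z, p


# Hypothetical per-ancestry (beta, se) pairs for a single variant,
# e.g. taken from three separate ancestry-specific GWAS runs.
tract_results = {
    "anc0": (0.12, 0.05),
    "anc1": (0.08, 0.04),
    "anc2": (0.15, 0.10),
}

beta, se, z, p = ivw_meta(list(tract_results.values()))
print(f"pooled beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.3g}")
```

The same function would cover option b) as well: pass in the ancestry0 estimate from the admixed cohort together with estimates from other homogeneous ancestry0 cohorts instead of the three tracts from one cohort.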