freeseek / score

Tools to work with GWAS-VCF summary statistics files
MIT License

Add Sample Overlap Correction for Metal #11

Open jjfarrell opened 1 month ago

jjfarrell commented 1 month ago

Could the sample overlap correction be added to the Metal plugin?

Several European research groups release meta-analysis association results that include samples we collected. We would like to include their European samples in our meta-analyses, but the separate summary association results are not available to us, so the overlap correction would be useful in this situation.

From the METAL documentation:

Sample Overlap Correction

Correction for sample overlap in sample size weighted meta-analysis (developed by Sebanti Sengupta and implemented by Daniel Taliun).

First, METAL estimates the number of individuals that are common among two or more studies based on Z-statistics from each study. Then, METAL adjusts for sample overlap when calculating overall Z-statistics by correcting the weights with the estimated number of individuals in common.

To enable correction for sample overlap in your sample size weighted meta-analysis, use the OVERLAP ON command (valid only with SCHEME SAMPLESIZE). By default, METAL uses Z-statistics <1 for estimating the number of individuals that are common among studies. To change this threshold, use the ZCUTOFF [number] command.
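To make the weight correction described above concrete, here is a minimal Python sketch. It is not METAL's code: it assumes the common approximation that two studies sharing `n_shared` individuals have null Z-statistics correlated by `n_shared / sqrt(n_i * n_j)`, and it adjusts the variance of the sample-size weighted sum accordingly. The function names and the `shared` matrix layout are hypothetical, chosen only for illustration.

```python
import math


def overlap_correlation(n_i, n_j, n_shared):
    """Assumed null correlation of Z-statistics for two studies that share
    n_shared individuals (an approximation, not taken from METAL)."""
    return n_shared / math.sqrt(n_i * n_j)


def overlap_corrected_z(z, n, shared):
    """Sample-size weighted Z for one variant, corrected for overlap.

    z      -- per-study Z-statistics
    n      -- per-study sample sizes
    shared -- shared[i][j] is the estimated number of individuals common to
              studies i and j (diagonal entries are ignored)
    """
    k = len(z)
    w = [math.sqrt(x) for x in n]  # standard sample-size weights
    numerator = sum(w[i] * z[i] for i in range(k))
    variance = 0.0
    for i in range(k):
        for j in range(k):
            r = 1.0 if i == j else overlap_correlation(n[i], n[j], shared[i][j])
            variance += w[i] * w[j] * r
    return numerator / math.sqrt(variance)
```

With no overlap the result reduces to the usual sample-size weighted Z. With two studies of identical size that fully overlap and report the same Z, the combined Z equals that Z rather than being inflated.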

freeseek commented 1 month ago

This seems like an important feature. The main hurdle is that BCFtools/metal, as a plugin, processes one variant at a time, while estimating the overlap requires looking across variants. I will look into this and see how complicated it would be to implement. Are there any resources where I can find directions beyond this?

jjfarrell commented 1 month ago

Maybe this will help. This is the C++ code that METAL uses: https://github.com/statgen/METAL/blob/master/metal/Main.cpp

There is a studyOverlap boolean variable that turns on the calculations which should help to find the relevant code.

To get around the one-variant-at-a-time limitation, maybe a separate overlap plugin could estimate the overlap in one pass. The overlap information would then be passed to the metal plugin somehow.
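A rough Python sketch of what such a first pass could look like, assuming the estimate can be built from running sums so that variants are still consumed one at a time. This is illustrative only: it uses a plain Pearson correlation on variants with |Z| below the cutoff in both studies, which is a simplification (truncating both Z-statistics attenuates the correlation, so this naive estimate is biased toward zero) and is not METAL's estimator. The class name and methods are hypothetical.

```python
import math


class PairwiseOverlapAccumulator:
    """Running sums for the Z-statistic correlation between two studies."""

    def __init__(self, zcutoff=1.0):
        self.zcutoff = zcutoff
        self.count = 0
        self.sum_a = self.sum_b = 0.0
        self.sum_aa = self.sum_bb = self.sum_ab = 0.0

    def add(self, z_a, z_b):
        """Consume one variant; skip it unless both |Z| are below the cutoff."""
        if abs(z_a) >= self.zcutoff or abs(z_b) >= self.zcutoff:
            return
        self.count += 1
        self.sum_a += z_a
        self.sum_b += z_b
        self.sum_aa += z_a * z_a
        self.sum_bb += z_b * z_b
        self.sum_ab += z_a * z_b

    def correlation(self):
        """Pearson correlation of the retained variants (0.0 if undefined)."""
        if self.count < 2:
            return 0.0
        cov = self.sum_ab - self.sum_a * self.sum_b / self.count
        var_a = self.sum_aa - self.sum_a ** 2 / self.count
        var_b = self.sum_bb - self.sum_b ** 2 / self.count
        if var_a <= 0.0 or var_b <= 0.0:
            return 0.0
        return cov / math.sqrt(var_a * var_b)

    def estimated_shared(self, n_a, n_b):
        """Convert the correlation to a shared-sample count, assuming
        r = n_shared / sqrt(n_a * n_b)."""
        return self.correlation() * math.sqrt(n_a * n_b)
```

The output of the first pass is one number per pair of studies, which is small enough to hand to a second, per-variant pass by whatever mechanism turns out to be convenient.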

A PowerPoint slide intro is here:

https://genome.sph.umich.edu/w/images/f/f8/METAL_sample_overlap_2017-11-15.pptx

A detailed description of the method is here: https://genome.sph.umich.edu/w/images/7/7b/METAL_sample_overlap_method_2017-11-15.pdf