statgen / METAL

Meta-analysis of genomewide association scans

Allele flipping logic in METAL #9

Open weinstockj opened 3 years ago

weinstockj commented 3 years ago

After examining METAL output, I observed that it only outputs the following Allele1-Allele2 combinations:

A C
A G
A T
C G
T C
T G

Examining the code, it looks like it will always flip the alleles when G is first (https://github.com/statgen/METAL/blob/master/metal/Main.cpp#L469).

I think this allele flipping behavior may be surprising to most users. If meta-analyzing two files with no allele flipping discrepancies between them, and where Allele1 = REF and Allele2 = ALT, the METAL output won't preserve this ordering by default.

Thanks for your consideration, Josh

welchr commented 3 years ago

I think there's some history here on why that behavior wasn't really an issue in the past.

Back when METAL was written, association results usually did not have REF or ALT annotated - analysts provided only an effect allele (and other allele), but they weren't particularly special or ordered. As a result, they usually varied across studies, and METAL would do the work of flipping them all to a consensus effect (and non-effect) allele.

When sequencing became more popular, variants started being annotated (and also sometimes named) with their REF and ALT alleles, but that was a couple of years after METAL's release.

My guess is that at this point, users of METAL are used to this behavior, despite it being non-optimal. Changing the default behavior might even break some users' post-processing scripts (I doubt it, but you never know). Maybe it would be possible to enable such behavior behind a flag, like --preserve-allele-order, which would ensure that users re-running METAL on old results would continue to get the results they expect from previous runs.

Anyway, as a temporary workaround: you could maybe set the variant IDs to something like "CHROM:POS:REF:ALT", and in a post-processing job, extract them into separate columns (then flip the effect allele / effect / freq to match the REF or ALT allele, whichever is preferred.)
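A minimal sketch of that post-processing step, in Python. It rests on several assumptions: the marker IDs were set to "CHROM:POS:REF:ALT" before running METAL, the output columns are named MarkerName, Allele1, Allele2, Freq1, Effect and Direction (Freq1 and Direction are treated as optional), and Effect and Freq1 both refer to Allele1. The function name `restore_allele_order` and its `effect_allele` parameter are hypothetical, not part of METAL. Strand flips are not handled; a row whose alleles do not match the ID raises an error.

```python
def restore_allele_order(row, effect_allele="ALT"):
    """Return a copy of one METAL result row with Allele1 set to REF or ALT.

    `row` is a dict of column name -> value (for example from csv.DictReader).
    Column names and the "CHROM:POS:REF:ALT" ID layout are assumptions.
    """
    chrom, pos, ref, alt = row["MarkerName"].split(":")
    if effect_allele == "ALT":
        want, other = alt.upper(), ref.upper()
    else:
        want, other = ref.upper(), alt.upper()

    # Compare case-insensitively, since the output may use lowercase alleles.
    a1, a2 = row["Allele1"].upper(), row["Allele2"].upper()
    if {a1, a2} != {want, other} or a1 == a2:
        raise ValueError(f"alleles {a1}/{a2} do not match ID {row['MarkerName']}")

    out = dict(row)
    out["Allele1"], out["Allele2"] = want, other
    out["Effect"] = float(row["Effect"])
    if "Freq1" in row:
        out["Freq1"] = float(row["Freq1"])

    if a1 != want:
        # Allele1 was the other allele: negate the effect, complement the
        # frequency, and swap the per-study direction signs.
        out["Effect"] = -out["Effect"]
        if "Freq1" in row:
            out["Freq1"] = 1.0 - out["Freq1"]
        if "Direction" in row:
            out["Direction"] = row["Direction"].translate(str.maketrans("+-", "-+"))
    return out


example = {
    "MarkerName": "1:12345:G:A",
    "Allele1": "a",
    "Allele2": "g",
    "Freq1": "0.25",
    "Effect": "0.10",
    "Direction": "+-?",
}
flipped = restore_allele_order(example, effect_allele="REF")
unchanged = restore_allele_order(example, effect_allele="ALT")
```

Here `flipped` reports the effect for G (the REF allele in the ID), while `unchanged` keeps A as the effect allele. Other per-allele columns, if present, would need the same treatment.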

Hopefully that helps a little. My recollection might be a little fuzzy here, Goncalo would know best.

weinstockj commented 3 years ago

The history is interesting - thank you! I think a --preserve-allele-order flag would be a great idea.

anuravi1234 commented 3 years ago

Hello, it would be great to have a --preserve-allele-order flag, since we constantly face this issue. Without it, we need an extra step to check whether the alleles have been flipped. Has this enhancement been made yet?

frahimov commented 2 years ago

Hi. I am happy that I finally found this thread. This behavior creates issues with conditional analysis using GCTA and an LD reference panel from one of the individual GWASs. How do you fix the METAL output so that it matches the allele ordering in the input files? Is there a tool for this that you could recommend? I would highly appreciate any suggestions.

soren-rand commented 2 years ago

Like all other curious researchers here, I also find myself in need of a suggestion to handle concerns with allele flipping. Any suggestions from any reader? 😄

frahimov commented 2 years ago

This R package is useful for handling allele flipping, effect sizes, and frequencies: https://bioconductor.org/packages/release/bioc/html/MungeSumstats.html