Closed: rootze closed this issue 1 year ago
Hey!
Great to hear you are finding the package useful!
Firstly, I would highly recommend installing R 4.2 and Bioconductor 3.16 so that you can install MungeSumstats (MSS) >= v1.6.0. This version has far more functionality than the 1.5 release you are using, including a choice of dbSNP versions, so you can use the latest version (155) as well as 144.
On your specific problem: the issue you are seeing with check_allele_flip is as expected. We can't flip the effect of a SNP whose direction is incorrect if there is another alternative allele, i.e. 1-effect would not be correct in this instance. You correctly state that allele_flip_frq can be used to avoid this error; it stops this check and leaves the flip in place but, like you say, the MAF value may then not be correct. Another option would be to use allele_flip_drop, which will drop any SNPs that are non-bi-allelic and that need to be flipped. I would suggest this approach if you need to use the MAF values downstream.
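To make the reasoning above concrete, here is a small toy sketch in plain Python. It is not MungeSumstats code: the allele frequencies and the helper names (naive_flip, flip_is_exact, maf) are made up purely for illustration. It shows that taking 1 - FRQ is exact when a site has two alleles but not when there is another alternative allele, and that a frequency above 0.5 is an allele frequency rather than a minor allele frequency.

```python
# Toy allele frequencies per site; the numbers are invented for illustration.
bi_allelic = {"A": 0.7, "G": 0.3}
tri_allelic = {"A": 0.6, "G": 0.3, "T": 0.1}


def naive_flip(frq):
    """Frequency after swapping effect and non-effect allele, assuming two alleles."""
    return 1.0 - frq


def flip_is_exact(site, effect, other, tol=1e-9):
    """True when 1 - FRQ(effect) matches the real frequency of the other allele."""
    return abs(naive_flip(site[effect]) - site[other]) < tol


def maf(frq):
    """Minor allele frequency for a two-allele site: the smaller of f and 1 - f."""
    return min(frq, 1.0 - frq)


bi_ok = flip_is_exact(bi_allelic, "G", "A")    # two alleles: the flip is exact
tri_ok = flip_is_exact(tri_allelic, "G", "A")  # third allele T breaks 1 - FRQ
print("bi-allelic flip exact:", bi_ok)
print("tri-allelic flip exact:", tri_ok)
print("MAF for FRQ = 0.7 at a two-allele site:", round(maf(0.7), 3))
```

In the three-allele case the naive flip gives 0.7 for allele A while its frequency in the toy data is 0.6, which is the kind of mismatch the check guards against.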
A broader question is whether you want to remove non-bi-allelic SNPs altogether. This is discussed more here but, in short, the general advice for downstream analysis is to remove them. Whether the large drop in remaining SNPs will make a difference in downstream analysis is something we are currently testing.
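As a rough way to gauge what removing non-bi-allelic SNPs costs, you can count how many SNPs, and which lead SNPs, such a filter would drop. The sketch below is plain Python on made-up data, not the package's implementation; ref_alts, sumstats, lead_snps and keep_bi_allelic are hypothetical names, and it assumes a reference lookup that lists the alternative alleles for each SNP.

```python
# Hypothetical reference lookup: SNP ID -> alternative alleles (made-up values).
ref_alts = {
    "rs1": ["G"],
    "rs2": ["G", "T"],
    "rs3": ["C"],
    "rs4": ["A", "C", "T"],
    "rs5": ["T"],
}
sumstats = ["rs1", "rs2", "rs3", "rs4", "rs5"]  # SNPs present in the summary statistics
lead_snps = {"rs2", "rs5"}                      # SNPs we care most about downstream


def keep_bi_allelic(snps, alts_by_snp):
    """Keep only SNPs with exactly one alternative allele in the reference."""
    return [s for s in snps if len(alts_by_snp.get(s, [])) == 1]


kept = keep_bi_allelic(sumstats, ref_alts)
dropped_fraction = 1 - len(kept) / len(sumstats)
lost_leads = sorted(lead_snps - set(kept))

print("kept:", kept)
print("fraction dropped:", round(dropped_fraction, 2))
print("lead SNPs lost:", lost_leads)
```

Listing the lost lead SNPs separately makes it easier to judge whether the drop matters for a particular downstream analysis.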
Hope this helps, Alan.
@Al-Murphy Thank you for your help. That made sense.
Thanks for developing this awesome tool. I have a question regarding dropping a large number of SNPs in GWAS summary statistics. I used the AD GWAS (PubMed ID: 35379992) https://www.ebi.ac.uk/gwas/publications/35379992
Version of MungeSumstats -- MungeSumstats_1.5.18
When running with bi_allelic_filter = FALSE, which I tried after reading Issue "Large number of non-biallelic SNPs" #111, I get the following error:
The above error suggested setting allele_flip_frq to FALSE, which I did (with bi_allelic_filter = FALSE, allele_flip_drop = FALSE).
Although 93.7% of the original SNPs were kept, the FRQ values are, as I understand it, no longer all MAF. I thought MAF should be the minor allele frequency, but the following message suggested there are a lot of FRQ values > 0.5. I am a bit confused right now about what I should use.
Just to give it a try, when I set bi_allelic_filter = TRUE and also set allele_flip_drop = FALSE, about 40% of the SNPs were dropped, including some lead SNPs. I am in a dilemma and would much appreciate your suggestions. Thank you!