Al-Murphy / MungeSumstats

Rapid standardisation and quality control of GWAS or QTL summary statistics
https://doi.org/doi:10.18129/B9.bioc.MungeSumstats

dbSNP154 #171

Closed AhmedArslan closed 3 months ago

AhmedArslan commented 1 year ago

I am using the MungeSumstats package to standardize GWAS Catalog data. MungeSumstats uses dbSNP155, whereas the GWAS Catalog currently uses dbSNP154. How does MungeSumstats handle this mismatch?

Many thanks for doing great work.

best, Ahmed.

Al-Murphy commented 1 year ago

Hey @AhmedArslan,

You are right that this will affect any function in MSS that uses the dbSNP reference dataset (which is quite a lot; look for any function calling load_ref_genome_data to get a list). However, although I haven't tested this, I don't think there should be much change between dbSNP154 and dbSNP155 given the releases are so close, so checks like check_on_ref_genome() shouldn't be affected to any large degree. If you would like to test the effect, try running the on-reference-genome check (or any of the other checks using the reference genome) in isolation on a set of SNPs known to be in dbSNP154 and see how many are flagged as not on the reference genome. I would be really interested to see the results.
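To make the suggested test concrete, here is a minimal, library-agnostic sketch of the comparison in Python. It is not MSS code and does not call check_on_ref_genome(); it only illustrates the idea of counting how many SNP IDs from a dbSNP154-based list are absent from a dbSNP155-based reference. The function name `fraction_missing` and the rsIDs are hypothetical placeholders, and in practice both lists would be loaded from your own files.

```python
def fraction_missing(query_rsids, reference_rsids):
    """Return (missing_ids, fraction) for query IDs absent from the reference.

    Hypothetical helper for illustration only; not part of MungeSumstats.
    """
    reference = set(reference_rsids)
    # De-duplicate the query while keeping its original order.
    query = list(dict.fromkeys(query_rsids))
    missing = [rsid for rsid in query if rsid not in reference]
    # Guard against an empty query list to avoid dividing by zero.
    fraction = len(missing) / len(query) if query else 0.0
    return missing, fraction


# Made-up IDs purely for illustration; these are not real dbSNP content.
dbsnp154_snps = ["rs1", "rs2", "rs3", "rs4"]
dbsnp155_reference = ["rs1", "rs2", "rs4", "rs5"]

missing, fraction = fraction_missing(dbsnp154_snps, dbsnp155_reference)
print(missing, fraction)  # ['rs3'] 0.25
```

A small missing fraction would support the expectation that the two releases differ little for this purpose. A large one would suggest the reference-genome checks need a closer look.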

Thanks, Alan.

Al-Murphy commented 3 months ago

Closing due to inactivity. @AhmedArslan, feel free to reopen if you work on this any more.