Al-Murphy / MungeSumstats

Rapid standardisation and quality control of GWAS or QTL summary statistics
https://doi.org/doi:10.18129/B9.bioc.MungeSumstats

Check Position by rsID rather than updating the rsID? #179

Closed. erikstricker closed this issue 7 months ago

erikstricker commented 7 months ago

Hi, I have some summary statistics on hg19, and I would like to use the rsIDs to update the positions to GRCh38. Is it possible for MungeSumstats (MSS) to do that?

I saw that Step 9 should do that: "9. Check if CHR and/or BP is missing. If so, infer from the chosen reference genome."

But when I run it on my summary statistics, I get:

Error in check_miss_data(sumstats_dt = sumstats_return$sumstats_dt, path = path,  : 
  All SNPs have been filtered out of  your summary statistics dataset

erikstricker commented 7 months ago

I solved the issue: I just had to remove the CHR and BP columns. The error itself was caused by an AF column with values in this format: {AA:0.9311,EA:0.9992,EU:0.5112,HS:0.7893,SA:0.8981} and {EU:0.0308,HS:0.0058,SA:0.0051}
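
For anyone hitting the same thing, here is a minimal sketch of the column-removal step, done on the file before it is handed to MungeSumstats. It is plain Python (standard library only) and only prepares the table; MungeSumstats itself is an R package and is not called here. The function name `drop_columns` is hypothetical, and the header names `CHR` and `BP` are assumptions about how the file is labelled, so adjust them to your own headers. Add `AF` to the set if you also want to drop the population-keyed frequency column.

```python
import csv
import io

# Assumed header names; change these to match your own file.
DEFAULT_DROP = frozenset({"CHR", "BP"})


def drop_columns(src, dst, drop=DEFAULT_DROP, delimiter="\t"):
    """Copy a delimited table from src to dst, leaving out the columns in `drop`.

    `src` and `dst` are open text file objects. Returns the kept header names.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    kept = [name for name in reader.fieldnames if name not in drop]
    writer = csv.DictWriter(
        dst,
        fieldnames=kept,
        delimiter=delimiter,
        extrasaction="ignore",  # silently skip the dropped columns
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return kept


# Small in-memory demonstration with made-up rows.
example = io.StringIO(
    "SNP\tCHR\tBP\tA1\tA2\tAF\tP\n"
    "rs1\t1\t100\tA\tG\t{EU:0.0308,HS:0.0058}\t0.5\n"
)
cleaned = io.StringIO()
drop_columns(example, cleaned)
print(cleaned.getvalue())
```

For real files, open the input and output with `open(path, newline="")` and pass the two handles in place of the `StringIO` objects. With CHR and BP gone, Step 9 has to infer them from the rsIDs and the chosen reference genome, which is the behaviour I was after.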

Al-Murphy commented 7 months ago

Hey! I would say it's best to just run format_sumstats() and use its parameters to deselect whatever checks you don't want. This will avoid strange issues like the one above, which I believe happened because you were running the functions separately on your sumstats. A second note: a lot of functionality has been added since the paper was published, so do check out the vignette and the other documentation for that too! Thanks!