Closed lpgilchrist closed 2 years ago
Hey Lachlan,
Glad you are finding it useful. I'll need a few things to understand the issue. Can you give me the version of MungeSumstats you are using? I will also need a small example dataset, with code, that shows how this occurs and why you wouldn't expect it to happen for these SNPs. Think of it as similar to a post on Stack Overflow. You can also use the log_folder_ind parameter to save all the filtered-out SNPs into txt files, split by the reason they were filtered out, and the imputation_ind parameter to show where data has been changed for a SNP. The latter can be used to show where the direction of a SNP has been flipped, so you can use it to filter to the 661,360 SNPs that MSS flipped because A1 didn't match the reference.
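For anyone following along, a rough sketch of what that could look like is below. It is not taken from the original report: the input path and genome build are placeholders, and it assumes that with log_folder_ind = TRUE the result comes back as a list holding the data and the log file paths, and that imputation_ind = TRUE adds a column named flipped. Check both against the documentation for your installed version.

```r
library(MungeSumstats)
library(data.table)

# Report the installed version, as requested above.
packageVersion("MungeSumstats")

# Hypothetical input file and genome build; substitute your own.
sumstats_path <- "chr1_sumstats.tsv.gz"

res <- format_sumstats(
  path = sumstats_path,
  ref_genome = "GRCh37",
  log_folder_ind = TRUE,   # write filtered-out SNPs to per-filter log files
  imputation_ind = TRUE,   # add indicator columns for changed values
  log_folder = tempdir(),
  return_data = TRUE,
  return_format = "data.table"
)

# Assumption: with log_folder_ind = TRUE the return value is a list
# with elements `sumstats` and `log_files`.
sumstats <- res$sumstats
print(res$log_files)

# Assumption: the flip indicator column is named `flipped`.
if ("flipped" %in% names(sumstats)) {
  flipped_snps <- sumstats[!is.na(flipped) & flipped == TRUE]
  print(nrow(flipped_snps))
  print(head(flipped_snps))
}
```

Running this also needs the matching reference packages (SNPlocs and BSgenome for the chosen build) to be installed. Sharing the head of flipped_snps alongside the corresponding rows of the input file would make it easy to see which input column was taken as A1.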
Thanks, Alan.
Closing for now. If you do get the information I need above, feel free to re-open.
Thanks, Alan.
Hi,
First, thanks for the amazing R package! It's a great way to ensure standardisation across multiple GWAS summary statistics, and the sort of thing I have been looking for for quite a while.
I am currently running the format_sumstats() function chr by chr and have noticed I am getting a high percentage of SNPs where A1 does not match the reference genome, even though I know this is the reference allele and that the other allele – set as A2 for the purposes of MungeSumstats – is the effect allele.
This seems a bit unusual so just wanted to check.
In this case, of the 750,404 SNPs on chr 1, A1 does not match the reference genome for 661,360.
I have included output from running on chr 1.
Thanks,
Lachlan