omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

direction of beta remain the same despite allele switch #134

Closed bnj50 closed 1 year ago

bnj50 commented 1 year ago

Hi, I have a sumstats file (named myfile.txt) with this header and first row:

CHR SNP BP A1 A2 BETA SE p_value
1 rs3094315 752566 A G -.00830888 .02015612 .6801735

After running this command:

bash-4.2$ python /usr/local/polyfun/1.0.0/ --sumstats myfile.txt --out ./myfile-out.txt --allow-missing

the output, myfile-out.txt, looks like this:

CHR BP SNP A1 A2 BETA SE p_value SNPVAR
1 752566 rs3094315 G A -8.3089e-03 2.0156e-02 6.8017e-01 6.2623e-08

But as you can see, the alleles have been switched while the direction of the original beta is the same as before, so I cannot use these betas for downstream analyses. In other words, the A1 in the output applies only to SNPVAR. Am I correct?
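For context, an effect estimate is always reported relative to one specific allele, so swapping A1 and A2 is normally accompanied by negating BETA. The following is a minimal sketch of that consistency rule in plain Python, using the single row quoted above. The function name `flip_alleles` and the dictionary layout are hypothetical and are not part of PolyFun.

```python
def flip_alleles(row):
    """Return a copy of a sumstats row with A1/A2 swapped and BETA negated."""
    flipped = dict(row)  # copy, so the original row is left untouched
    flipped["A1"], flipped["A2"] = row["A2"], row["A1"]
    flipped["BETA"] = -row["BETA"]  # the effect now refers to the other allele
    return flipped


# The example row from the input file quoted above.
original = {"SNP": "rs3094315", "A1": "A", "A2": "G",
            "BETA": -0.00830888, "SE": 0.02015612}

flipped = flip_alleles(original)
print(flipped["A1"], flipped["A2"], flipped["BETA"])
```

In the output shown above, the alleles are swapped but BETA keeps its sign, which is exactly the mismatch this rule would flag.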


omerwe commented 1 year ago

Hi, the problem is that this script requires files that were already preprocessed (with a Z-score column). I have now modified the code so that it explicitly fails if it doesn't find a column called "Z". If you git pull and then run it again, you should get an error message...

Can you please preprocess the sumstats first, and then run the script on the output file?

bnj50 commented 1 year ago

Hi, many sumstats files only have beta (SE), odds ratio, and P values, but no Z-score. Can you modify the code to accept beta? It is not easy to regenerate Z-scores when you have millions of markers.

omerwe commented 1 year ago

Did you try using

bnj50 commented 1 year ago

Yes, I did, but I cannot read the munge format. Does the program accept beta instead of Z-score? In that case, do I just need to change the header name from Beta to Z?


omerwe commented 1 year ago

What do you mean by "i can not read munge format"? Why do you need to read it? Do you get any error message?

bnj50 commented 1 year ago

Hi, I meant that I cannot open the munge file with TextPad or Notepad to read what exactly the alleles are. Also, please comment on using the beta column instead of the Z-score. Thanks.

omerwe commented 1 year ago

You don't need to look at the file, but in case you want to look at it for some reason you can run (from python):

import pandas as pd
# Replace the placeholder with the path to your sumstats file
df = pd.read_parquet("<sumstats_file>")

Then you'll have a dataframe with the file.
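If the goal is only to look at the alleles in a text editor, one option is to write the dataframe back out as tab-delimited text. This is a minimal sketch: `export_for_inspection` is a hypothetical helper, and the small in-memory dataframe stands in for the `df` loaded above (its values come from the example row in this thread, not from a real munged file).

```python
import os
import tempfile

import pandas as pd


def export_for_inspection(df, out_path, n_rows=None):
    """Write a dataframe (or only its first n_rows) as tab-delimited text."""
    subset = df if n_rows is None else df.head(n_rows)
    subset.to_csv(out_path, sep="\t", index=False)
    return out_path


# Stand-in for the dataframe returned by pd.read_parquet(...).
df = pd.DataFrame({"SNP": ["rs3094315"], "CHR": [1], "BP": [752566],
                   "A1": ["G"], "A2": ["A"]})

out_path = os.path.join(tempfile.mkdtemp(), "sumstats_preview.tsv")
export_for_inspection(df, out_path, n_rows=1000)

with open(out_path) as handle:
    preview = handle.read()
print(preview)
```

Limiting the export with `n_rows` keeps the preview small when the file holds millions of markers.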

In any case, I'm not sure why you don't want to use the preprocessed output as is... You can just go ahead and use it in PolyFun...
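On the earlier question of renaming the Beta column to Z: the two are different quantities, so a rename would not be equivalent. Under the standard definition, Z = BETA / SE, which can be computed in a single pass even for millions of markers. The sketch below streams a whitespace-delimited file and appends a Z column. The function name `add_z_column` is hypothetical, the column names are taken from the header quoted at the top of the thread, and whether PolyFun's own preprocessing derives Z in exactly this way is not confirmed here.

```python
import io


def add_z_column(lines):
    """Yield rows (lists of strings) with a Z column, BETA / SE, appended."""
    lines = iter(lines)
    header = next(lines).split()
    beta_idx = header.index("BETA")
    se_idx = header.index("SE")
    yield header + ["Z"]
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        z = float(fields[beta_idx]) / float(fields[se_idx])
        yield fields + [f"{z:.6g}"]


# The header and first row quoted at the top of the thread.
example = io.StringIO(
    "CHR SNP BP A1 A2 BETA SE p_value\n"
    "1 rs3094315 752566 A G -.00830888 .02015612 .6801735\n"
)

rows = list(add_z_column(example))
for row in rows:
    print("\t".join(row))
```

Because the function is a generator, it processes one line at a time, so memory use stays flat regardless of file size. To run it on a real file, pass an open file handle instead of the `io.StringIO` object.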