question on resources - Githubissues

omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)

MIT License


question on resources #129

Closed bnj50 closed 1 year ago

bnj50 commented 1 year ago

Hi, do you have already-prepared resources that include the A1 and A2 columns described below? I only have the LDSC resources.

The .annot files must contain two additional columns called A1,A2 which encode the identities of the effect and alternative alleles (i.e., the sign of effect sizes is with respect to A1). The .l2.ldscore files may contain the additional columns A1,A2. We strongly encourage including these columns.

thanks!

omerwe commented 1 year ago

Hi, please see the top of Section 2 of the Wiki for instructions on where to get the data.

bnj50 commented 1 year ago

Sorry again... I am getting the warning below, and the link https://alkesgroup.broadinstitute.org/LDSCORE/ 1000G_Phase3_baselineLD_v2.2_ldscores
doesn't have a version with these two additional columns. Do you have a resource that provides this?

[INFO] Reading reference panel LD Score from 1000G_Phase3_baselineLD_v2.2_ldscores/baselineLD.[1-22] ...
[WARNING] 1000G_Phase3_baselineLD_v2.2_ldscores/baselineLD.1.l2.ldscore.gz doesn't have A1,A2 columns
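To see in advance which files would trigger this warning, one could inspect the header line of each file. The following is a minimal sketch, not part of PolyFun: the helper name `has_allele_columns` and the demo file names and columns are hypothetical, and it assumes the files are gzipped and whitespace-delimited with a single header line.

```python
import gzip
import os
import tempfile


def has_allele_columns(path):
    """Return True if the header of a gzipped, whitespace-delimited file lists both A1 and A2."""
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
    return {"A1", "A2"} <= set(header)


# Demo on two tiny made-up files (hypothetical names and contents).
tmpdir = tempfile.mkdtemp()
without_path = os.path.join(tmpdir, "demo_without.l2.ldscore.gz")
with_path = os.path.join(tmpdir, "demo_with.l2.ldscore.gz")

with gzip.open(without_path, "wt") as f:
    f.write("CHR\tSNP\tBP\tbaseL2\n1\trs1\t100\t1.5\n")
with gzip.open(with_path, "wt") as f:
    f.write("CHR\tSNP\tBP\tA1\tA2\tbaseL2\n1\trs1\t100\tA\tG\t1.5\n")

result_without = has_allele_columns(without_path)
result_with = has_allele_columns(with_path)
print(result_without, result_with)
```

Only the first line of each file is read, so the check stays cheap even for large per-chromosome files.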

omerwe commented 1 year ago

Hi, we don't have files in the new format based on 1000G. 1000G has a very low sample size and we don't recommend using it for fine-mapping (please see the manuscript for more details). Sorry I can't help more...

bnj50 commented 1 year ago

So how do I prepare these two files (annotations and weights, each with the two additional columns A1 and A2) for the step described on the wiki page?

Run PolyFun with L2-regularized S-LDSC

In this stage PolyFun will estimate per-SNP heritabilities for SNPs on odd (resp. even) chromosomes by applying L2-regularized S-LDSC to even (resp. odd) chromosomes. To do this, run the script polyfun.py. This script handles all possible uses of PolyFun, but here we'll only compute prior causal probabilities with L2-extended S-LDSC, using a subset of the baseline-LF model annotations. Here is an example command, that uses 8 annotations from the baseline-LF model:

mkdir -p output

python polyfun.py \
    --compute-h2-L2 \
    --no-partitions \
    --output-prefix output/testrun \
    --sumstats example_data/sumstats.parquet \
    --ref-ld-chr example_data/annotations. \
    --w-ld-chr example_data/weights.

omerwe commented 1 year ago

Hi, if you're going to use 1000G you might as well not include the allele columns. The results are not going to be reliable anyway. You can proceed as normal (this is just a warning message).
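For readers who want the allele columns anyway (for example with a different reference panel), one possible approach is to copy the alleles from the PLINK .bim file of the panel the files were computed on. This is a rough sketch under stated assumptions, not a procedure documented by PolyFun: the function names are hypothetical, the input table is assumed to be tab-delimited with a SNP column, SNPs are matched by ID only, and .bim column 5 is treated as A1 and column 6 as A2, which should be checked against the allele coding of the summary statistics.

```python
import csv
import io


def read_bim_alleles(bim_handle):
    """Map SNP id -> (allele 1, allele 2) from a PLINK .bim file (columns 5 and 6)."""
    alleles = {}
    for line in bim_handle:
        fields = line.split()
        if len(fields) >= 6:
            alleles[fields[1]] = (fields[4], fields[5])
    return alleles


def add_allele_columns(table_handle, alleles, out_handle):
    """Copy a tab-delimited table with a SNP column, appending A1 and A2.

    Rows whose SNP is missing from `alleles` are dropped; the number dropped is returned.
    """
    reader = csv.DictReader(table_handle, delimiter="\t")
    fieldnames = list(reader.fieldnames) + ["A1", "A2"]
    writer = csv.DictWriter(out_handle, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    skipped = 0
    for row in reader:
        if row["SNP"] not in alleles:
            skipped += 1
            continue
        row["A1"], row["A2"] = alleles[row["SNP"]]
        writer.writerow(row)
    return skipped


# Demo on in-memory data (made-up SNPs and values).
bim = io.StringIO("1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tT\n")
table = io.StringIO("CHR\tSNP\tBP\tbaseL2\n1\trs1\t100\t1.5\n1\trs3\t300\t0.7\n")
out = io.StringIO()
skipped = add_allele_columns(table, read_bim_alleles(bim), out)
out_text = out.getvalue()
print(out_text, skipped)
```

Dropping unmatched SNPs is a choice made for this sketch. If the same step is applied to annotation, LD-score, and weight files, the SNP sets should be kept consistent across them.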

bnj50 commented 1 year ago

Thank you... well, in any case, the program stops working after the warning...

omerwe commented 1 year ago

Thanks, the program shouldn't stop working without an error message... Are you sure that there's no error message? Can you send me the full output from your run? Perhaps your machine ran out of memory? If you can send me a reproducible example I'm happy to take a look, but otherwise I'm afraid that there's not much I can do...
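One way to collect the full output requested here is to run the command through a small wrapper that saves stdout and stderr to a file and reports the exit code. This is a generic sketch rather than anything PolyFun provides: `run_and_log` and the log file name are hypothetical, and the demo uses a stand-in command in place of the real polyfun.py call. On POSIX systems a negative return code from `subprocess` means the process was killed by a signal, which is consistent with (though not proof of) an out-of-memory kill.

```python
import os
import subprocess
import sys
import tempfile


def run_and_log(cmd, log_path):
    """Run cmd, write combined stdout/stderr to log_path, and return the exit code."""
    with open(log_path, "w") as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode


# Stand-in command so the sketch runs anywhere; replace with the real polyfun.py arguments.
demo_cmd = [sys.executable, "-c",
            "import sys; print('[WARNING] demo'); sys.exit(3)"]
log_path = os.path.join(tempfile.mkdtemp(), "run.log")

code = run_and_log(demo_cmd, log_path)
with open(log_path) as f:
    log_text = f.read()

print("exit code:", code)
if code < 0:
    print("killed by signal", -code)
```

The saved log file and the exit code together make a report easier to act on than a description of the run stopping.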