omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

assert len(np.unique(chr_num)) > 1 AssertionError #38

Closed yyshi1 closed 3 years ago

yyshi1 commented 3 years ago

Hi,

I ran the sample data successfully. Then I tried to analyze a significant hit using downloaded UKB data. After renaming the files to match the example_data names, I was able to run the first step:

python munge_polyfun_sumstats.py \
    --sumstats chr19.sumstat.txt.gz \
    --n 637154 \
    --out chr19.sumstat_munged.parquet \
    --min-info 0.6 \
    --min-maf 0.001

python polyfun.py \
    --compute-h2-L2 \
    --no-partitions \
    --output-prefix 19_40853017_41853107_sumstats_munged \
    --sumstats 19_40853017_41853107_sumstats_munged.parquet \
    --ref-ld-chr ../baselineLF2.2.UKB/annotations. \
    --w-ld-chr ../baselineLF2.2.UKB/weights.

python polyfun/polyfun.py \
    --compute-h2-L2 \
    --no-partitions \
    --output-prefix 19_40853017_41853107_sumstats_munged \
    --sumstats 19_40853017_41853107_sumstats_munged.parquet \
    --ref-ld-chr ../baselineLF2.2.UKB/annotations. \
    --w-ld-chr ../baselineLF2.2.UKB/weights.


  • PolyFun (POLYgenic FUNctionally-informed fine-mapping)
  • Version 1.0.0
  • (C) 2019-2021 Omer Weissbrod

[INFO] Reading summary statistics from 19_40853017_41853107_sumstats_munged.parquet ...
[INFO] Read summary statistics for 5042 SNPs.
[INFO] Reading reference panel LD Score from ../baselineLF2.2.UKB/annotations.[1-22] ...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [05:01<00:00, 13.69s/it]
[INFO] Read reference panel LD Scores for 19386297 SNPs.
[INFO] Reading regression weight LD Score from ../baselineLF2.2.UKB/weights.[1-22] ...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [02:08<00:00, 5.83s/it]
[INFO] Read regression weight LD Scores for 18275613 SNPs.
[INFO] After merging with reference panel LD, 4791 SNPs remain.
[INFO] After merging with regression SNP LD, 4784 SNPs remain.
[WARNING] number of SNPs is smaller than 200k; this is almost always bad.
Traceback (most recent call last):
  File "polyfun/polyfun.py", line 848, in <module>
    polyfun_obj.polyfun_main(args)
  File "polyfun/polyfun.py", line 772, in polyfun_main
    self.polyfun_h2_L2(args)
  File "polyfun/polyfun.py", line 594, in polyfun_h2_L2
    self.run_ldsc(args, use_ridge=True, nn=False, evenodd_split=False, keep_large=False)
  File "polyfun/polyfun.py", line 231, in run_ldsc
    nnls_exact=args.nnls_exact
  File "polyfun/ldsc_polyfun/regressions.py", line 414, in __init__
    nnls_exact=nnls_exact
  File "/group/tools/polyfun/ldsc_polyfun/regressions.py", line 257, in __init__
    skip_ridge_jackknife=skip_ridge_jackknife, num_chr_sets=num_chr_sets)
  File "polyfun/ldsc_polyfun/jackknife.py", line 582, in __init__
    assert len(np.unique(chr_num)) > 1
AssertionError

Any suggestions?

omerwe commented 3 years ago

Hi,

PolyFun needs genome-wide data to estimate functional enrichment; data from the target locus alone is not enough. If you don't want to compute functional enrichment from genome-wide data yourself, you can use the meta-analyzed values that we provide (please see the wiki for details).
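To make the failure easier to spot before a long run, here is a minimal sketch of a pre-flight check. It mirrors the condition in the traceback (`len(np.unique(chr_num)) > 1`), which appears to require SNPs from more than one chromosome. The function name `check_genome_wide` and its `min_chromosomes` parameter are hypothetical and not part of PolyFun. How you read the chromosome labels out of your summary statistics file is left out, since the column layout is not shown in this thread.

```python
from collections import Counter


def check_genome_wide(chromosomes, min_chromosomes=2):
    """Count SNPs per chromosome and fail early if too few chromosomes are present.

    `chromosomes` is any iterable of chromosome labels, one per SNP.
    This is a hypothetical pre-flight helper, not part of PolyFun.
    """
    counts = Counter(chromosomes)
    if len(counts) < min_chromosomes:
        found = ", ".join(str(c) for c in counts) or "none"
        raise ValueError(
            f"summary statistics cover {len(counts)} chromosome(s) ({found}); "
            f"at least {min_chromosomes} are needed"
        )
    return counts


# Input spanning several chromosomes passes and reports per-chromosome SNP counts.
print(check_genome_wide([1, 1, 2, 19]))

# Single-locus input like the one in this issue is rejected up front.
try:
    check_genome_wide([19] * 4784)
except ValueError as err:
    print(err)
```

Passing this check only means the input spans more than one chromosome. It does not guarantee that the data are genome-wide, and the log above separately warns when fewer than 200k SNPs remain.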

I hope this is clear; please reopen and let me know if not.

Best,

Omer

yyshi1 commented 3 years ago

Hi Omer,

Thank you for explaining. I do have whole-genome GWAS results and will do as you suggested.

Best regards,

Yunling


On Feb 20, 2021, at 1:48 PM, Omer Weissbrod notifications@github.com wrote:

 Closed #38.
