JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

ERROR converting summary statistics #140

Open ZhannaBal opened 3 years ago

ZhannaBal commented 3 years ago

Hi,

Could you please help me with this error:

2021/09/07/02:25:24 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2021/09/07/02:26:25 PM Read 9804743 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 2 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 1155216 variants that were not SNPs.
Note: strand ambiguous SNPs were not dropped.
8649525 SNPs remain.
2021/09/07/02:26:43 PM Removed 5787 SNPs with duplicated rs numbers (8643738 SNPs remain).
2021/09/07/02:26:51 PM Removed 0 SNPs with N < 0.0 (8643738 SNPs remain).
2021/09/07/02:32:32 PM Median value of SIGNED_SUMSTAT was -1.91694e-05, which seems sensible.
2021/09/07/02:32:37 PM Dropping snps with null values
2021/09/07/02:32:39 PM Metadata:
2021/09/07/02:32:43 PM Mean chi^2 = 1.164
2021/09/07/02:32:44 PM Lambda GC = 1.147
2021/09/07/02:32:45 PM Max chi^2 = 41.164
2021/09/07/02:32:45 PM 86 Genome-wide significant SNPs (some may have been removed by filtering).
2021/09/07/02:32:45 PM Conversion finished at Tue Sep 7 14:32:45 2021
2021/09/07/02:32:45 PM Total time elapsed: 7.0m:21.79s
2021/09/07/02:33:47 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/09/07/02:33:47 PM Munging of Trait 1 complete. SNPs remaining: 8649525
2021/09/07/02:33:47 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

2021/09/07/02:35:11 PM Trait 1: Dropped 5787 SNPs for duplicate values in the "snp_name" column
2021/09/07/02:36:16 PM Read in Trait 2 summary statistics (9890409 SNPs) from /project/silk/users/zhanna/depr_inflam/vitd_bolt_for_mtag.txt ...
2021/09/07/02:36:16 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/09/07/02:36:16 PM Munging Trait 2 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2021/09/07/02:36:16 PM Interpreting column names as follows:
2021/09/07/02:36:16 PM EAF: Allele frequency
A1: a1, interpreted as ref allele for signed sumstat.
P: p-Value
BETA: Directional summary statistic as specified by --signed-sumstats.
A2: a2, interpreted as non-ref allele for signed sumstat.
SNP: Variant ID (e.g., rs number)
N: Sample size
SE: Standard errors of BETA coefficients

2021/09/07/02:36:17 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2021/09/07/02:36:52 PM WARNING: 65759 SNPs had P outside of (0,1]. The P column may be mislabeled.
2021/09/07/02:37:34 PM Read 9890409 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 65759 SNPs with out-of-bounds p-values.
Removed 1157604 variants that were not SNPs.
Note: strand ambiguous SNPs were not dropped.
8667046 SNPs remain.
2021/09/07/02:37:54 PM Removed 5984 SNPs with duplicated rs numbers (8661062 SNPs remain).
2021/09/07/02:38:00 PM Removed 0 SNPs with N < 0.0 (8661062 SNPs remain).
2021/09/07/02:38:05 PM ERROR converting summary statistics:
2021/09/07/02:38:05 PM Traceback (most recent call last):
  File "/rds/general/project/eph-prokopenko-lab-silk/live/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag_munge.py", line 877, in munge_sumstats
    dat.P = p_to_z(dat.P, dat.N)
  File "/rds/general/project/eph-prokopenko-lab-silk/live/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag_munge.py", line 517, in p_to_z
    return np.sqrt(chi2.isf(P, 1))
  File "/rds/general/user/zbalkhiy/home/anaconda3/envs/mtag/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py", line 1973, in isf
    place(output, cond, self._isf(*goodargs) * scale + loc)
  File "/rds/general/user/zbalkhiy/home/anaconda3/envs/mtag/lib/python2.7/site-packages/scipy/stats/_continuous_distns.py", line 1083, in _isf
    return sc.chdtri(df, p)
TypeError: ufunc 'chdtri' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule '$

2021/09/07/02:38:05 PM Conversion finished at Tue Sep 7 14:38:05 2021
2021/09/07/02:38:05 PM Total time elapsed: 1.0m:49.18s
2021/09/07/02:38:05 PM ufunc 'chdtri' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the ca$
Traceback (most recent call last):
  File "/project/silk/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag.py", line 1567, in <module>
    mtag(args)
  File "/project/silk/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag.py", line 1336, in mtag
    DATA_U, DATA, args = load_and_merge_data(args)
  File "/project/silk/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag.py", line 269, in load_and_merge_data
    GWAS_d[p], sumstats_format[p] = _perform_munge(args, GWAS_d[p], gwas_dat_gen, p)
  File "/project/silk/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag.py", line 162, in _perform_munge
    munged_results = munge_sumstats.munge_sumstats(argnames, write_out=False, new_log=False)
  File "/rds/general/project/eph-prokopenko-lab-silk/live/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag_munge.py", line 877, in munge_sumstats
    dat.P = p_to_z(dat.P, dat.N)
  File "/rds/general/project/eph-prokopenko-lab-silk/live/shared/jgm18/bolt_sumphq9_bmi_t2d/ranges/mtag/mtag_munge.py", line 517, in p_to_z
    return np.sqrt(chi2.isf(P, 1))
  File "/rds/general/user/zbalkhiy/home/anaconda3/envs/mtag/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py", line 1973, in isf
    place(output, cond, self._isf(*goodargs) * scale + loc)
  File "/rds/general/user/zbalkhiy/home/anaconda3/envs/mtag/lib/python2.7/site-packages/scipy/stats/_continuous_distns.py", line 1083, in _isf
    return sc.chdtri(df, p)
TypeError: ufunc 'chdtri' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule '$
2021/09/07/02:38:05 PM Analysis terminated from error at Tue Sep 7 14:38:05 2021
2021/09/07/02:38:05 PM Total time elapsed: 13.0m:21.06s

Thanks in advance!

JonJala commented 3 years ago

It looks like this could be related to issue #19? Do you have P values that are very small?
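
For anyone who wants to check this on their own file, here is a minimal sketch (not part of MTAG) that summarizes a P column with pandas. It assumes the column is named `P`; the helper `summarize_p_column` and the toy data are hypothetical. One possible, unconfirmed reading of the `chdtri` TypeError is that the P column reaches scipy as a non-numeric (object) array, so the sketch reports the dtype alongside the smallest values.

```python
import pandas as pd


def summarize_p_column(df, p_col="P"):
    """Report the dtype, non-numeric entries, zeros, and smallest positive P value."""
    raw = df[p_col]
    # Entries that cannot be parsed as numbers become NaN.
    numeric = pd.to_numeric(raw, errors="coerce")
    return {
        "dtype": str(raw.dtype),
        # NaNs introduced by coercion, i.e. values that were present but not numeric.
        "n_non_numeric": int(numeric.isna().sum() - raw.isna().sum()),
        "n_zero": int((numeric == 0).sum()),
        "min_positive": float(numeric[numeric > 0].min()),
    }


# Toy data (made up): a mixed column like this ends up with dtype "object".
toy = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3", "rs4"],
    "P": [0.5, 0.0, 1e-320, "1.2E-05x"],
})
summary = summarize_p_column(toy)
print(summary)
```

If `dtype` comes back as `object`, or `n_zero` is nonzero, those rows are worth inspecting before rerunning.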

ZhannaBal commented 2 years ago

Yes, I have very small P-values for 117 SNPs, but after excluding them I reran MTAG and still get the same error. What else can I do?

JonJala commented 2 years ago

You sure it's just 117? There is a line in your output: "2021/09/07/02:36:52 PM WARNING: 65759 SNPs had P outside of (0,1]. The P column may be mislabeled"
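
A quick way to reproduce that count outside MTAG is sketched below. It uses the interval named in the warning, (0,1], but it is an independent check and not MTAG's own code; the function name `count_out_of_bounds_p` is hypothetical, and missing values are deliberately left out of the count.

```python
import pandas as pd


def count_out_of_bounds_p(p_values):
    """Count non-missing P values that fall outside the interval (0, 1]."""
    p = pd.to_numeric(pd.Series(p_values), errors="coerce")
    in_bounds = (p > 0) & (p <= 1)
    # Missing or unparseable entries are excluded rather than counted.
    return int((~in_bounds & p.notna()).sum())


# Made-up values: 0.0, 1.2 and -0.1 are out of bounds; None is ignored.
n_bad = count_out_of_bounds_p([0.5, 0.0, 1.0, 1.2, -0.1, None])
print(n_bad)
```

Applied to the P column of the Trait 2 file, a result far above 117 would suggest that more than the very small P-values is involved, for example a mislabeled column.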