JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0
172 stars 55 forks source link

error "divide by zero" #32

Open geneticsmcgill opened 6 years ago

geneticsmcgill commented 6 years ago

Hi there, I keep getting this error and I'm not sure why its occurring. Any help would be appreciated!

Conversion finished at Mon May 14 08:19:21 2018 Total time elapsed: 4.11s <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> Munging of Trait 2 complete. SNPs remaining: 143010 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Trait 2 summary statistics: 143010 SNPs after merging with previous traits. ... Merge of GWAS summary statistics complete. Number of SNPs: 143010 Using 121231 SNPs to estimate Omega (21779 SNPs excluded due to strand ambiguity) Estimating sigma.. /Users/X/mtag/ldsc_mod/ldscore/irwls.py:161: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1. coef = np.linalg.lstsq(x, y) Checking for positive definiteness .. Sigma hat: [[1.074 0.056] [0.056 1.067]] Mean chi^2 of SNPs used to estimate Omega is low for some SNPsMTAG may not perform well in this situation. Beginning estimation of Omega ... Using GMM estimator of Omega .. Checking for positive definiteness .. matrix is not positive definite, performing adjustment.. Warning: max number of iterations reached in adjustment procedure. Sigma matrix used is still non-positive-definite. Completed estimation of Omega ... Beginning MTAG calculations... ... Completed MTAG calculations. Writing Phenotype 1 to file ... divide by zero encountered in true_divide Traceback (most recent call last): File "mtag.py", line 1359, in mtag(args) File "mtag.py", line 1249, in mtag save_mtag_results(args, res_temp,Zs,Ns, Fs,mtag_betas,mtag_se) File "mtag.py", line 778, in save_mtag_results out_df['mtag_beta'] = mtag_betas[:,p] / weights FloatingPointError: divide by zero encountered in true_divide Analysis terminated from error at Mon May 14 08:19:40 2018 Total time elapsed: 33.1s

omeed-maghzian commented 6 years ago

It looks like at least one of the values of the allele frequency column you provide is 0 or 1. Please check that it is the case and remove those SNPs, if possible.

On Mon, May 14, 2018 at 8:25 AM geneticsmcgill notifications@github.com wrote:

Hi there, I keep getting this error and I'm not sure why its occurring. Any help would be appreciated!

Conversion finished at Mon May 14 08:19:21 2018 Total time elapsed: 4.11s

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> Munging of Trait 2 complete. SNPs remaining: 143010

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Trait 2 summary statistics: 143010 SNPs after merging with previous traits. ... Merge of GWAS summary statistics complete. Number of SNPs: 143010 Using 121231 SNPs to estimate Omega (21779 SNPs excluded due to strand ambiguity) Estimating sigma.. /Users/X/mtag/ldsc_mod/ldscore/irwls.py:161: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1. coef = np.linalg.lstsq(x, y) Checking for positive definiteness .. Sigma hat: [[1.074 0.056] [0.056 1.067]] Mean chi^2 of SNPs used to estimate Omega is low for some SNPsMTAG may not perform well in this situation. Beginning estimation of Omega ... Using GMM estimator of Omega .. Checking for positive definiteness .. matrix is not positive definite, performing adjustment.. Warning: max number of iterations reached in adjustment procedure. Sigma matrix used is still non-positive-definite. Completed estimation of Omega ... Beginning MTAG calculations... ... Completed MTAG calculations. Writing Phenotype 1 to file ... divide by zero encountered in true_divide Traceback (most recent call last): File "mtag.py", line 1359, in mtag(args) File "mtag.py", line 1249, in mtag save_mtag_results(args, res_temp,Zs,Ns, Fs,mtag_betas,mtag_se) File "mtag.py", line 778, in save_mtag_results out_df['mtag_beta'] = mtag_betas[:,p] / weights FloatingPointError: divide by zero encountered in true_divide Analysis terminated from error at Mon May 14 08:19:40 2018 Total time elapsed: 33.1s

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/omeed-maghzian/mtag/issues/32, or mute the thread https://github.com/notifications/unsubscribe-auth/ATSq3NzE087F7uEAs6cjCJfZKg0WmGsnks5tyXefgaJpZM4T9sas .

-- Omeed Maghzian

geneticsmcgill commented 6 years ago

Thank you so much. I fixed that, however, now I get this error: Trait 2 summary statistics: 143010 SNPs after merging with previous traits. ... Merge of GWAS summary statistics complete. Number of SNPs: 143010 Using 121231 SNPs to estimate Omega (21779 SNPs excluded due to strand ambiguity) Estimating sigma.. /Users/X/mtag/ldsc_mod/ldscore/irwls.py:161: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1. coef = np.linalg.lstsq(x, y) Checking for positive definiteness .. Sigma hat: [[1.062 1.062] [1.062 1.062]] Mean chi^2 of SNPs used to estimate Omega is low for some SNPsMTAG may not perform well in this situation. Beginning estimation of Omega ... Using GMM estimator of Omega .. Checking for positive definiteness .. matrix is not positive definite, performing adjustment.. Warning: max number of iterations reached in adjustment procedure. Sigma matrix used is still non-positive-definite. Completed estimation of Omega ... Beginning MTAG calculations... ... Completed MTAG calculations. Outputting standardized betas.. Writing Phenotype 1 to file ... Writing Phenotype 2 to file ... cannot convert float NaN to integer Traceback (most recent call last): File "mtag.py", line 1359, in mtag(args) File "mtag.py", line 1249, in mtag save_mtag_results(args, res_temp,Zs,Ns, Fs,mtag_betas,mtag_se) File "mtag.py", line 819, in save_mtag_results summary_df.loc[p+1, 'GWAS equiv. (max) N'] = int(summary_df.loc[p+1, 'N (max)']*(summary_df.loc[p+1, 'MTAG mean chi^2'] - 1) / (summary_df.loc[p+1, 'GWAS mean chi^2'] - 1)) ValueError: cannot convert float NaN to integer Analysis terminated from error at Tue May 15 01:08:08 2018 Total time elapsed: 47.96s

I checked for any non-numbers or missing values but I'm still getting this.

geneticsmcgill commented 6 years ago

Also, when I add a third trait though, it ends up working and not having that error... which confuses me.

huilisabrina commented 6 years ago

Hi @geneticsmcgill ,

Have you checked for NaN values in the sample size (N) column as well or just the allele frequency column? To your last question about adding the third trait, it is possible because MTAG uses the overlapped SNPs among the input sumstats, so it could be that the NaN values are excluded in this process.

Also, noticed that your sigma matrix has identical entries on and off the diagonal. Did you input the same sumstats for the two traits?

Please keep us updated on your testing. Thanks!

Best, Hui

geneticsmcgill commented 6 years ago

Hello,

I ended up fixing the problem. I had extra columns with other data and deleted it and it seemed that it was somehow messing up the script. I was wondering if there's going to be an option to not solely use rsIDs as identifiers? Is there going to be a way to just use chr / position? Some rsIDs may change over time and GWAS summary statistics don't update that

paturley commented 6 years ago

Ah. Glad you got it working.

The tricky thing on our part is that MTAG relies heavily on the LD score regression code, and their code doesn't seem to support that right now. https://github.com/bulik/ldsc/issues/63 If we can find time to write a work around at some point, I agree that it would be really useful.