omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License
89 stars 22 forks source link

XtX is not symmetric #31

Closed trb8 closed 4 years ago

trb8 commented 4 years ago

While running finemapper.py, susieR sometimes crashes with the error: rpy2.rinterface_lib.embedded.RRuntimeError: Error in (function (bhat, shat, R, n, var_y = 1, XtX, Xty, yty, maf = NULL, : XtX is not a symmetric matrix.

I was able to fix it with the code below. It will be good to have a fix for this.

727 try: 728 #make LD matrix symmetric 729 ldm = np.triu(self.df_ld.values) 730 ldm = ldm + ldm.T - np.diag(np.diag(ldm)) 731 732 susie_obj = self.susieR.susie_suff_stat( 733 bhat=bhat.reshape((m,1)), 734 shat=np.ones((m,1)), 735 R=ldm, 736 n=self.n, 737 L=num_causal_snps, 738 scaled_prior_variance=(0.0001 if (prior_var is None) else prior_var), 739 estimate_prior_variance=(prior_var is None), 740 residual_variance=(self.R_null if (residual_var is None) else residual_var), 741 estimate_residual_variance=(residual_var is None), 742 verbose=verbose, 743 prior_weights=(prior_weights.reshape((m,1)) if use_prior_causal_prob else self.R_null) 744 )

omerwe commented 4 years ago

Hi,

Thanks for the suggestion. This fix indeed perturbs the matrix so that it becomes symmetric, but it should be symmetric to begin with... Can you please say what was the input that caused the original LD matrix to be non-symmetric?

On Tue, Aug 4, 2020 at 11:22 AM trb8 notifications@github.com wrote:

While running finemapper.py, susieR sometimes crashes with the error: rpy2.rinterface_lib.embedded.RRuntimeError: Error in (function (bhat, shat, R, n, var_y = 1, XtX, Xty, yty, maf = NULL, : XtX is not a symmetric matrix.

I was able to fix it with the code below. It will be good to have a fix for this.

727 try: 728 #make LD matrix symmetric 729 ldm = np.triu(self.df_ld.values) 730 ldm = ldm + ldm.T - np.diag(np.diag(ldm)) 731 732 susie_obj = self.susieR.susie_suff_stat( 733 bhat=bhat.reshape((m,1)), 734 shat=np.ones((m,1)), 735 R=ldm, 736 n=self.n, 737 L=num_causal_snps, 738 scaled_prior_variance=(0.0001 if (prior_var is None) else prior_var), 739 estimate_prior_variance=(prior_var is None), 740 residual_variance=(self.R_null if (residual_var is None) else residual_var), 741 estimate_residual_variance=(residual_var is None), 742 verbose=verbose, 743 prior_weights=(prior_weights.reshape((m,1)) if use_prior_causal_prob else self.R_null) 744 )

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/omerwe/polyfun/issues/31, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNCB45HSRFY66YVUVDK5R3R7ARUNANCNFSM4PUQQHOA .

trb8 commented 4 years ago

Thanks. This happens while running using ukbb 380k samples plink files on a few intervals below. BTW the solution I suggested also fails in some cases with: finemapper.py:735: RuntimeWarning: overflow encountered in add ldm = ldm + ldm.T - np.diag(np.diag(ldm)) ... rpy2.rinterface_lib.embedded.RRuntimeError: Error in (function (bhat, shat, R, n, var_y = 1, XtX, Xty, yty, maf = NULL, : XtX matrix contains NA.

So, maybe something like this may solve it(?) i_lower = np.tril_indices(n, -1) matrix[i_lower] = matrix.T[i_lower]

chr1.121000001_124000001.sh.err: XtX is not a symmetric matrix. chr14.17000001_20000001.sh.err: XtX is not a symmetric matrix. chr21.48000001_51000001.sh.err: XtX is not a symmetric matrix. chr21.7000001_10000001.sh.err: XtX is not a symmetric matrix. chr2.93000001_96000001.sh.err: XtX is not a symmetric matrix.

trb8 commented 4 years ago

PS here is the debugging data for an examples https://www.dropbox.com/s/coz66fcinfusk8m/debug.tar.gz?dl=0

The upper and lower tri differ at some extreme values

omerwe commented 4 years ago

Thanks. It looks like the data has lots of monomorphic SNPs (40 out of 153). I updated the code a few days ago to better handle such SNPs. Can you please try to git pull the latest version and see if the problem persists?

Thanks,

Omer

On Wed, Aug 5, 2020 at 11:33 AM trb8 notifications@github.com wrote:

PS here is the debugging data for an examples https://www.dropbox.com/s/coz66fcinfusk8m/debug.tar.gz?dl=0

The upper and lower tri differ at some extreme values

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/omerwe/polyfun/issues/31#issuecomment-669262862, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNCB4YTA6UH2NX4XMIFT4TR7F3UTANCNFSM4PUQQHOA .

trb8 commented 4 years ago

I just did a git pull and re-ran. Unfortunately still fails with the same error: rpy2.rinterface_lib.embedded.RRuntimeError: Error in (function (bhat, shat, R, n, var_y = 1, XtX, Xty, yty, maf = NULL, : XtX is not a symmetric matrix.

omerwe commented 4 years ago

Thanks. Did you re-estimate the LD matrix or use a pre-cached one (i.e., using the --cache-dir flag)? If you re-estimated it, can you please send the debug data again?

Thanks,

Omer

On Wed, Aug 5, 2020 at 2:21 PM trb8 notifications@github.com wrote:

I just did a git pull and re-ran. Unfortunately still fails with the same error: rpy2.rinterface_lib.embedded.RRuntimeError: Error in (function (bhat, shat, R, n, var_y = 1, XtX, Xty, yty, maf = NULL, : XtX is not a symmetric matrix.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/omerwe/polyfun/issues/31#issuecomment-669359270, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNCB444K4F3LSJ3MVUSJLDR7GPKDANCNFSM4PUQQHOA .

trb8 commented 4 years ago

I re-estimated using plink files. Here is the debug data https://www.dropbox.com/s/7olz3q54n0o9g82/dbg_omer.tar.gz?dl=0

omerwe commented 4 years ago

Thanks. I looked into my code and found a bug in the LD computation... Can you please git pull and retry running again? Sorry for this... (LDStore used to compute LD matrices from plink but they stopped supporting this in the latest version so I had to write my own code from scratch...).

On Wed, Aug 5, 2020 at 2:42 PM trb8 notifications@github.com wrote:

I re-estimated using plink files. Here is the debug data https://www.dropbox.com/s/7olz3q54n0o9g82/dbg_omer.tar.gz?dl=0

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/omerwe/polyfun/issues/31#issuecomment-669390806, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNCB42D63DH673YNA2CY4TR7GRX7ANCNFSM4PUQQHOA .

trb8 commented 4 years ago

Great! Now it worked. Thank you so much Tushar

trb8 commented 4 years ago

One more (easy) request. Will it be possible to provide an option to write the full susie output as an R object. Some of the values (e.g. alpha matrix) are useful for downstream work. Thanks.

omerwe commented 4 years ago

Good idea, I'll try to do that over the next few days.

On Sat, Aug 8, 2020 at 5:30 PM trb8 notifications@github.com wrote:

One more (easy) request. Will it be possible to provide an option to write the full susie output as an R object. Some of the values (e.g. alpha matrix) are useful for downstream work. Thanks.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/omerwe/polyfun/issues/31#issuecomment-670935631, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNCB42SEKYXEMN6UNRRNZTR7VOQ7ANCNFSM4PUQQHOA .

trb8 commented 4 years ago

Thanks. I got it working with an additional argument susie_outfile and

    self.base = importr('base', robject_translations = {'print.me': 'print_dot_me', 'print_me': 'print_uscore_me'})

    if susie_outfile is not None:
        self.base.saveRDS(susie_obj, file=susie_outfile)
omerwe commented 4 years ago

Great idea, I just committed this (use with option --susie-outfile). Thanks for the great tip!