bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

LinAlgError: Singular matrix #158

Open xtmgah opened 5 years ago

xtmgah commented 5 years ago

Hello, I tried to use LDSC for partitioned heritability from continuous annotations, but got the following error. Can you help me work out what is causing it? Thanks.


Beginning analysis at Mon Jul 8 09:36:20 2019
Reading summary statistics from /data/zhangt8/NSLC/GWAS/Meta/NSLC.sumstats.gz ...
Read summary statistics for 1173749 SNPs.
Reading reference panel LD Score from /data/zhangt8/NSLC/GWAS/LDSC/LD/MELA.[1-22] ...
Read reference panel LD Scores for 1108605 SNPs.
Removing partitioned LD Scores with zero variance.
Reading regression weight LD Score from /data/zhangt8/NF_eQTL_ALL/LDSC/software/1000G_Phase3_EAS_weights_hm3_no_MHC/weights.EAS.hm3_noMHC.[1-22] ...
Read regression weight LD Scores for 1071448 SNPs.
After merging with reference panel LD, 983810 SNPs remain.
After merging with regression SNP LD, 976517 SNPs remain.
Removed 9 SNPs with chi^2 > 80 (976508 SNPs remain)
Traceback (most recent call last):
  File "/data/zhangt8/NF_eQTL_ALL/LDSC/software/ldsc/ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "/gpfs/gsfs7/users/zhangt8/NF_eQTL_ALL/LDSC/software/ldsc/ldscore/sumstats.py", line 359, in estimate_h2
    twostep=args.two_step, old_weights=old_weights)
  File "/gpfs/gsfs7/users/zhangt8/NF_eQTL_ALL/LDSC/software/ldsc/ldscore/regressions.py", line 346, in __init__
    slow=slow, step1_ii=step1_ii, old_weights=old_weights)
  File "/gpfs/gsfs7/users/zhangt8/NF_eQTL_ALL/LDSC/software/ldsc/ldscore/regressions.py", line 208, in __init__
    jknife = jk.LstsqJackknifeFast(x, y, n_blocks)
  File "/gpfs/gsfs7/users/zhangt8/NF_eQTL_ALL/LDSC/software/ldsc/ldscore/jackknife.py", line 309, in __init__
    self.est = self.block_values_to_est(xty, xtx)
  File "/gpfs/gsfs7/users/zhangt8/NF_eQTL_ALL/LDSC/software/ldsc/ldscore/jackknife.py", line 386, in block_values_to_est
    return np.linalg.solve(xtx, xty).reshape((1, p))
  File "/home/zhangt8/.local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 384, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "/home/zhangt8/.local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 90, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
LinAlgError: Singular matrix

Analysis finished at Mon Jul 8 09:36:54 2019
Total time elapsed: 34.32s

xztrr commented 5 years ago

Hey, I have encountered the same error but I have not found a solution yet.

lamdan2 commented 4 years ago

I am getting this too. I am trying to do cell type specific partitioned heritability. Using the tutorial (https://github.com/bulik/ldsc/wiki/Cell-type-specific-analyses) with the Cahoy annotations from the tutorial and an Autism GWAS worked. When I tried to use annotations from Cusanovich (http://krishna.gs.washington.edu/content/members/ajh24/mouse_atlas_data_release/gwas_h2_enrichments/ld_score_regression_models.tar.gz) I get the singular matrix error. Please help!

choishingwan commented 4 years ago

I managed to bypass this error by using --n-blocks 1000 (changed from the default value of 200). I think this changes the size of the matrix used in the jackknifing, which helps to avoid the matrix being singular. I am not sure how this affects the results, as there doesn't seem to be enough information about the effect of this parameter on the analysis.

leedobbyn commented 3 years ago

in my experience, this has been solved when I deal with annotation collinearity
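To make the collinearity point concrete, here is a minimal sketch (Python 3 with NumPy, not LDSC code) of how two identical annotation columns make X^T X singular, so that the np.linalg.solve call seen in the tracebacks fails. The toy matrix and the helper names xtx_rank and try_solve are made up for illustration. In this toy setup the matrix being solved is p x p, where p is the number of columns, so its shape does not depend on how the rows are grouped.

```python
import numpy as np


def xtx_rank(x):
    """Return (rank of X^T X, number of columns p) for a design matrix x."""
    xtx = x.T @ x
    return int(np.linalg.matrix_rank(xtx)), xtx.shape[0]


def try_solve(x, y):
    """Solve the normal equations; return None if X^T X is singular."""
    try:
        return np.linalg.solve(x.T @ x, x.T @ y)
    except np.linalg.LinAlgError:
        return None


# Toy design matrix (hypothetical values): columns 0 and 2 are identical,
# mimicking two annotations that yield the same LD scores.
x = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
y = np.array([1.0, 2.0, 1.5, 2.5])

rank, p = xtx_rank(x)        # rank is smaller than p: X^T X is singular
estimate = try_solve(x, y)   # None, because np.linalg.solve raised

# Dropping the redundant column restores full rank.
x_fixed = x[:, :2]
rank_fixed, p_fixed = xtx_rank(x_fixed)
estimate_fixed = try_solve(x_fixed, y)

print(rank, p, estimate)
print(rank_fixed, p_fixed, estimate_fixed)
```

Comparing the rank with the number of columns before running the regression is one way to spot the problem early. With real LD scores the dependence may be approximate rather than exact, in which case the rank check is more reliable than waiting for an exception.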

LisaSikkema commented 3 years ago

Yes same for me, I accidentally set the --annot argument for the LD score calculation to the control annot file for all cell types:

python ldsc.py \
--l2 \
--annot ./LCA_v1_annots/control.${chr}.annot.gz \

etc., so that output was the same for my control (all genes) and my cell types.

haoyang-insitro commented 3 years ago

For me, the cause was that both the ref-w-ld-chr LD scores and the LD scores defined in *.ldcts have the "Base" annotation that is all 1s. While this is needed for calculating partitioned heritability as discussed here, it causes collinearity between annotations (because there are two that are exactly the same).
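A quick way to look for this situation is to compare the LD score columns from all sources before running the regression. The following is a rough sketch in plain Python, not part of LDSC: the function find_duplicate_columns, the column names, and the values are all hypothetical, and it assumes the columns have already been loaded and aligned on the same SNPs.

```python
from itertools import combinations


def find_duplicate_columns(columns):
    """Return pairs of column names whose values are exactly identical.

    `columns` maps a column name to its list of per-SNP values.
    """
    names = sorted(columns)
    return [(a, b) for a, b in combinations(names, 2)
            if columns[a] == columns[b]]


# Made-up LD scores: the same "base" column arrives once from each source.
ld_scores = {
    "baseline.baseL2": [12.5, 30.1, 8.7],
    "celltype.baseL2": [12.5, 30.1, 8.7],
    "celltype.L2": [0.4, 2.2, 0.0],
}

duplicates = find_duplicate_columns(ld_scores)
print(duplicates)
```

This only catches columns that are exactly equal. A column that is a linear combination of several others would need a rank check on the full matrix instead.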

MenglinC commented 1 year ago

Hi everyone,

I encountered the same problem. However, I tried adding the parameter --n-blocks=1000, and it also does not work for me. Can anyone help me?

(ldsc) [xxzhang@mu02 Brain4Region]$ ldsc.py --h2-cts /home/xxzhang/data/Epigenome/gwas/Brain4Region/disease/AD.sumstats.gz --ref-ld-chr ./1000G_EUR_Phase3_baseline/baseline.  --out AD.Astro --ref-ld-chr-cts Astro.ldcts --w-ld-chr ./weights_hm3_no_hla/weights. --n-blocks 2000
*********************************************************************
* LD Score Regression (LDSC)
* Version 1.0.1
* (C) 2014-2019 Brendan Bulik-Sullivan and Hilary Finucane
* Broad Institute of MIT and Harvard / MIT Department of Mathematics
* GNU General Public License v3
*********************************************************************
Call:
./ldsc.py \
--h2-cts /home/xxzhang/data/Epigenome/gwas/Brain4Region/disease/AD.sumstats.gz \
--ref-ld-chr ./1000G_EUR_Phase3_baseline/baseline. \
--out AD.Astro \
--ref-ld-chr-cts Astro.ldcts \
--n-blocks 2000 \
--w-ld-chr ./weights_hm3_no_hla/weights.

Beginning analysis at Sun Dec 18 14:47:40 2022
Reading summary statistics from /home/xxzhang/data/Epigenome/gwas/Brain4Region/disease/AD.sumstats.gz ...
Read summary statistics for 1146109 SNPs.
Reading reference panel LD Score from ./1000G_EUR_Phase3_baseline/baseline.[1-22] ... (ldscore_fromlist)
Read reference panel LD Scores for 1190321 SNPs.
Removing partitioned LD Scores with zero variance.
Reading regression weight LD Score from ./weights_hm3_no_hla/weights.[1-22] ... (ldscore_fromlist)
Read regression weight LD Scores for 1242190 SNPs.
After merging with reference panel LD, 1138628 SNPs remain.
After merging with regression SNP LD, 1033318 SNPs remain.
Removed 12 SNPs with chi^2 > 444.006 (1033306 SNPs remain)
Reading cts reference panel LD Score from ./subcelltype/F-As/F-As.conserved.,./baseline/F-As.background.[1-22] ... (ldscore_fromlist)
Performing regression.
Reading cts reference panel LD Score from ./subcelltype/F-As/F-As.DFC.,./baseline/F-As.background.[1-22] ... (ldscore_fromlist)
Performing regression.
Reading cts reference panel LD Score from ./subcelltype/F-As/F-As.MFC.,./baseline/F-As.background.[1-22] ... (ldscore_fromlist)
Performing regression.
Reading cts reference panel LD Score from ./subcelltype/F-As/F-As.M1C.,./baseline/F-As.background.[1-22] ... (ldscore_fromlist)
Performing regression.
Traceback (most recent call last):
  File "/home/xxzhang/workplace/software/ldsc/ldsc.py", line 646, in <module>
    sumstats.cell_type_specific(args, log)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/sumstats.py", line 300, in cell_type_specific
    twostep=None, old_weights=True)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/regressions.py", line 346, in __init__
    slow=slow, step1_ii=step1_ii, old_weights=old_weights)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/regressions.py", line 208, in __init__
    jknife = jk.LstsqJackknifeFast(x, y, n_blocks)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/jackknife.py", line 309, in __init__
    self.est = self.block_values_to_est(xty, xtx)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/jackknife.py", line 386, in block_values_to_est
    return np.linalg.solve(xtx, xty).reshape((1, p))
  File "/home/xxzhang/miniconda3/envs/ldsc/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 403, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "/home/xxzhang/miniconda3/envs/ldsc/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 97, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
LinAlgError: Singular matrix

Analysis finished at Sun Dec 18 14:48:40 2022
Total time elapsed: 59.89s
Traceback (most recent call last):
  File "/home/xxzhang/workplace/software/ldsc/ldsc.py", line 646, in <module>
    sumstats.cell_type_specific(args, log)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/sumstats.py", line 300, in cell_type_specific
    twostep=None, old_weights=True)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/regressions.py", line 346, in __init__
    slow=slow, step1_ii=step1_ii, old_weights=old_weights)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/regressions.py", line 208, in __init__
    jknife = jk.LstsqJackknifeFast(x, y, n_blocks)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/jackknife.py", line 309, in __init__
    self.est = self.block_values_to_est(xty, xtx)
  File "/home/xxzhang/workplace/software/ldsc/ldscore/jackknife.py", line 386, in block_values_to_est
    return np.linalg.solve(xtx, xty).reshape((1, p))
  File "/home/xxzhang/miniconda3/envs/ldsc/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 403, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "/home/xxzhang/miniconda3/envs/ldsc/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 97, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix

MenglinC commented 1 year ago

> in my experience, this has been solved when I deal with annotation collinearity

Could you please explain how to "deal with annotation collinearity"? This problem confuses me too.

MenglinC commented 1 year ago

> Yes same for me, I accidentally set the --annot argument for the LD score calculation to the control annot file for all cell types:
>
> python ldsc.py \
> --l2 \
> --annot ./LCA_v1_annots/control.${chr}.annot.gz \
>
> etc., so that output was the same for my control (all genes) and my cell types.

Does this mean that you mixed up the cell-type-specific annot.gz files with the control annot.gz, so that the output was the same for your control and your cell types?

MenglinC commented 1 year ago

Hi everyone, I found the reason for the error! The L2 values in my ldscore.gz files were all zero, and the underlying problem was that the BED file I used to make the annotation was empty. So check your BED files if you run into the same problem. Hope this helps!
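For anyone who wants to automate this check, here is a small sketch using only the Python standard library. It assumes the LD score file is a gzipped, tab-separated table with a header row whose LD score column names end in "L2"; the helper name all_zero_l2_columns and the demo file contents are made up for illustration.

```python
import csv
import gzip
import os
import tempfile


def all_zero_l2_columns(path):
    """Return names of columns ending in 'L2' that contain no nonzero value."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        l2_cols = [c for c in (reader.fieldnames or []) if c.endswith("L2")]
        has_nonzero = {c: False for c in l2_cols}
        for row in reader:
            for c in l2_cols:
                if float(row[c]) != 0.0:
                    has_nonzero[c] = True
    return [c for c in l2_cols if not has_nonzero[c]]


# Demo on a tiny made-up file whose L2 column is all zero.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.1.l2.ldscore.gz")
with gzip.open(demo_path, "wt", newline="") as handle:
    handle.write("CHR\tSNP\tBP\tL2\n")
    handle.write("1\trs1\t100\t0.0\n")
    handle.write("1\trs2\t200\t0.0\n")

zero_cols = all_zero_l2_columns(demo_path)
print(zero_cols)

# An empty input file has size zero; the same test applies to a BED file.
empty_path = os.path.join(tempfile.mkdtemp(), "empty.bed")
open(empty_path, "w").close()
print(os.path.getsize(empty_path) == 0)
```

A file with a header but no data rows is also reported, since none of its columns contains a nonzero value.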