hakyimlab / MetaXcan

MetaXcan software and manuscript

numpy.linalg.LinAlgError: 0-dimensional array given. Array must be at least two-dimensional #150

Open RuochengDong opened 2 years ago

RuochengDong commented 2 years ago

Hello,

I was running S-PrediXcan to conduct a TWAS using GWAS summary statistics. I tried all 49 GTEx tissues and got results for 32 of them, but for the remaining tissues, such as ovary and prostate, the run failed with the error message below.

Does anyone know what causes this error and what I should do to fix it?

INFO - 90 % of model's snps found so far in the gwas study
Level 9 - Processing gene 13092:ENSG00000258301.3
Level 9 - Processing gene 13093:ENSG00000258308.5
Level 9 - Processing gene 13094:ENSG00000258315.5
Level 9 - Processing gene 13095:ENSG00000258354.1
Level 9 - Processing gene 13096:ENSG00000258366.7
Level 9 - Processing gene 13097:ENSG00000258405.9
Level 9 - Processing gene 13098:ENSG00000258429.1
Level 9 - Processing gene 13099:ENSG00000258472.8
Level 9 - Processing gene 13100:ENSG00000258476.5
Level 9 - Processing gene 13101:ENSG00000258479.5
Level 9 - Processing gene 13102:ENSG00000258484.3

Traceback (most recent call last):
  File "/xxxSoftware/MetaXcan/software/SPrediXcan.py", line 65, in <module>
    run(args)
  File "/xxx/Software/MetaXcan/software/SPrediXcan.py", line 31, in run
    M04_zscores.run(args, g)
  File "/xxx/Software/MetaXcan/software/M04_zscores.py", line 70, in run
    results = run_metaxcan(args, context)
  File "/xxx/Software/MetaXcan/software/M04_zscores.py", line 34, in run_metaxcan
    r, snps = AssociationCalculation.association(gene, context, returnsnps=True)
  File "/xxx/Software/MetaXcan/software/metax/metaxcan/AssociationCalculation.py", line 60, in association
    d = numpy.linalg.eig(cov)[0]
  File "<__array_function__ internals>", line 6, in eig
  File "/home/xxx/.local/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 1316, in eig
    _assert_stacked_2d(a)
  File "/home/xxx/.local/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 198, in _assert_stacked_2d
    'at least two-dimensional' % a.ndim)
numpy.linalg.LinAlgError: 0-dimensional array given. Array must be at least two-dimensional

Thank you for your help.

RuochengDong commented 2 years ago

I was able to figure out why this error happens.

I downloaded the model data and covariance data from PredictDB. The gene ENSG00000258484.3 appears in the model file for prostate (mashr_Prostate.db) and in my GWAS file, but the covariance file (mashr_Prostate.txt.gz) has no entries for it, so no covariance matrix was available for that gene. As a workaround, I manually removed all the SNPs that caused this problem.
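In case it helps others, here is a rough sketch of how the mismatch could be detected up front instead of one crash at a time. It assumes the model .db is an SQLite file with a `weights` table that has a `gene` column, and that the covariance file is a gzipped, whitespace-separated text file whose header includes a `GENE` column. Both are assumptions about the PredictDB layout, and `genes_missing_covariance` is a name made up for this example, not part of MetaXcan.

```python
import gzip
import sqlite3
from contextlib import closing


def genes_missing_covariance(db_path, cov_path):
    """List genes present in the model db but absent from the covariance file.

    Assumed layout (not confirmed by this thread): a `weights` table with a
    `gene` column in the db, and a `GENE` header column in the covariance file.
    """
    # Collect every gene that has at least one weight in the model.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute("SELECT DISTINCT gene FROM weights")
        model_genes = {row[0] for row in rows}

    # Collect every gene that has at least one covariance entry.
    with gzip.open(cov_path, "rt") as handle:
        header = handle.readline().split()
        gene_col = header.index("GENE")
        cov_genes = {line.split()[gene_col] for line in handle if line.strip()}

    return sorted(model_genes - cov_genes)
```

Calling it as `genes_missing_covariance("mashr_Prostate.db", "mashr_Prostate.txt.gz")` should return the genes to look at for that tissue, if the files follow the layout assumed above.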

It would be great if you could add handling for this case in the code. I was running S-PrediXcan for 49 tissues, and the problematic SNPs and genes are different for each tissue, so removing them by hand is tedious.
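To illustrate the kind of handling I mean, here is a minimal sketch that checks the covariance before the eigen-decomposition and skips the gene instead of crashing. This is not MetaXcan's actual code: `safe_eigenvalues`, `run_genes` and `get_covariance` are hypothetical names, and I am only assuming that the lookup yields something that is not a 2-D square matrix when a gene has no covariance data.

```python
import numpy as np


def safe_eigenvalues(cov):
    """Return the eigenvalues of cov, or None if cov is not a square matrix."""
    if cov is None:
        return None
    cov = np.asarray(cov, dtype=float)
    # numpy.linalg.eig rejects anything below two dimensions, so check first.
    if cov.ndim != 2 or cov.size == 0 or cov.shape[0] != cov.shape[1]:
        return None
    return np.linalg.eig(cov)[0]


def run_genes(genes, get_covariance):
    """Process each gene, collecting the ones whose covariance is unusable.

    get_covariance is a hypothetical callable mapping a gene id to its
    covariance matrix (or to None / a scalar when nothing is available).
    """
    results, skipped = {}, []
    for gene in genes:
        eigenvalues = safe_eigenvalues(get_covariance(gene))
        if eigenvalues is None:
            skipped.append(gene)  # report later instead of aborting the run
            continue
        results[gene] = eigenvalues
    return results, skipped
```

With something like this, a tissue run would finish and the skipped genes could be logged at the end, rather than the whole run stopping at the first gene without covariance.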

Fnyasimi commented 2 years ago

Hi @RuochengDong do you have a list of those SNPs that were lacking covariance?

RuochengDong commented 2 years ago

Yes. According to the errors I got:

Adrenal_Gland: ENSG00000188487.11
Brain_Amygdala: ENSG00000130881.13
Brain_Cerebellum: ENSG00000258479.5
Brain_Cortex: ENSG00000162148.10
Brain_Frontal_Cortex_BA9: ENSG00000258484.3
Brain_Hypothalamus: ENSG00000258484.3
Brain_Spinal_cord_cervical_c-1: ENSG00000258484.3
Esophagus_Mucosa: ENSG00000258484.3
Kidney_Cortex: ENSG00000130881.13
Liver: ENSG00000258484.3
Lung: ENSG00000100599.15
Muscle_Skeletal: ENSG00000258484.3
Ovary: ENSG00000130881.13
Pancreas: ENSG00000258484.3
Prostate: ENSG00000258484.3
Spleen: ENSG00000258484.3
Testis: ENSG00000183662.10

jhchung commented 9 months ago

Sorry for bringing up this old thread, but I am running into the same issue as @RuochengDong. Have there been any updates, or is the solution still to remove the variants that are missing from the covariance data?