AlexTISYoung / snipar

Imputation of parental genotypes, inference of sibling IBD segments, family based GWAS, and polygenic score analyses.
MIT License

Singular matrix Error when running fGWAS.py #4

Closed: yystat closed this issue 4 years ago

yystat commented 4 years ago

Hello,

I had the following error when running GWAS using the observed and imputed data:

Estimating SNP effects
Traceback (most recent call last):
  File "/Software/SNIPar/fGWAS.py", line 257, in <module>
    alpha = np.linalg.solve(XTX,XTY)
  File "/Software/SNIPar/.eggs/numpy-1.16.6-py3.8-linux-x86_64.egg/numpy/linalg/linalg.py", line 403, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "/Software/SNIPar/.eggs/numpy-1.16.6-py3.8-linux-x86_64.egg/numpy/linalg/linalg.py", line 97, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix

The command that I'm using is essentially identical to the one in your tutorial (https://sibreg.readthedocs.io/en/master/tutorial.html):

python fGWAS.py test_data/sample1 test_data/sample1.hdf5 test_data/h2_quad_0.8.ped test_data/h2_quad_0.8

My "observed" genotype data consist only of full siblings (no parents are present). I first imputed their parental genotypes using impute_runner.py (which I think gives the average of the expected maternal and paternal genotypes). Then I ran the GWAS using the command above and encountered the Singular matrix error.

Am I using the correct script to run the GWAS? It seems fGWAS.py requires separate maternal and paternal genotypes. For my data, it seems that fGWAS.py makes the 2nd and 3rd columns of the design matrix (variable G in the code) identical, both equal to the imputed average parental genotype, which leads to the singularity error when solving the system with the matrix XTX.

Thank you very much for your help! Josh
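
The collinearity described above can be reproduced in isolation. The following is a minimal sketch with made-up data, not the fGWAS.py code itself: the variable names (other than G, XTX and XTY, which appear in the post and traceback) and the simulated genotypes are assumptions for illustration. It builds a design matrix whose 2nd and 3rd columns are identical, checks its rank, and attempts the same np.linalg.solve call that appears in the traceback.

```python
import numpy as np

# Hypothetical toy data; integer-coded genotypes (0, 1, 2) stored as floats.
rng = np.random.default_rng(0)
n = 200
proband = rng.integers(0, 3, size=n).astype(float)
imputed_parental = rng.integers(0, 3, size=n).astype(float)  # stand-in for the imputed value
phenotype = rng.normal(size=n)

# Design matrix with the imputed parental genotype used for BOTH parental columns.
G = np.column_stack([proband, imputed_parental, imputed_parental])
XTX = G.T @ G
XTY = G.T @ phenotype

# Three columns but only two are linearly independent.
rank = int(np.linalg.matrix_rank(G))
print("columns:", G.shape[1], "rank:", rank)

# With duplicated columns this solve is expected to fail with "Singular matrix".
try:
    alpha = np.linalg.solve(XTX, XTY)
    solve_failed = False
except np.linalg.LinAlgError as err:
    solve_failed = True
    print("solve failed:", err)
```

Because the rank (2) is smaller than the number of columns (3), XTX has no inverse and the separate paternal and maternal coefficients are not identifiable from these data.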

AlexTISYoung commented 4 years ago

Hi Josh,

Thanks for your email. This is a bug I introduced when upgrading the code to be faster. I'll add an option to regress on the individual's genotype and the sum of the imputed parental genotypes, which should fix this.

Alex.

AlexTISYoung commented 4 years ago

Hi Josh, I have updated the fGWAS.py script with the option --parsum, which regresses the proband phenotype onto the proband genotype and the sum of the (imputed) maternal and paternal genotypes. With this option, the script outputs estimates of the direct and average parental (average of maternal and paternal) effects. Please let me know if you have any further issues.
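
The regression described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the implementation behind --parsum: the data are simulated, all variable names and effect sizes are hypothetical, and the phenotype is generated without noise so that the two coefficients are recovered exactly. Replacing the two parental columns by their sum gives a two-column design with full column rank, so the normal equations can be solved.

```python
import numpy as np

# Hypothetical simulated trios at one SNP with allele frequency 0.5.
rng = np.random.default_rng(1)
n = 500
g_pat = rng.binomial(2, 0.5, size=n).astype(float)
g_mat = rng.binomial(2, 0.5, size=n).astype(float)
# Each parent transmits one allele; the chance it is the coded allele is genotype / 2.
proband = (rng.binomial(1, g_pat / 2) + rng.binomial(1, g_mat / 2)).astype(float)
par_sum = g_pat + g_mat

# Assumed effect sizes, chosen only for illustration.
direct_effect = 0.5
avg_parental_effect = 0.2
phenotype = direct_effect * proband + avg_parental_effect * par_sum  # noise-free

# Two-column design: proband genotype and the sum of the parental genotypes.
G = np.column_stack([proband, par_sum])
design_rank = int(np.linalg.matrix_rank(G))
estimates = np.linalg.solve(G.T @ G, G.T @ phenotype)
print("rank:", design_rank, "estimates:", estimates)
```

When the paternal and maternal effects are equal, the coefficient on the parental sum is that common effect, which is why a single coefficient on the sum can be read as the average parental effect.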

yystat commented 4 years ago

It works! Thank you!