precimed / mixer

Causal Mixture Model for GWAS summary statistics
GNU General Public License v3.0
65 stars 18 forks source link

mixer_figures.py combine: numpy.linalg.LinAlgError: singular matrix #58

Open Huifang-Xu opened 2 years ago

Huifang-Xu commented 2 years ago

Hi,

I tried to run bivariate analysis for multiple traits with MiXeR. One of the bivariate analyses fialed to produce results with the error meaasge "numpy.linalg.LinAlgError: singular matrix". No error message was generated in the two steps of fitting and applying model. All json files exist. image

The codes I used:

# Step 1: Fit the model for bivariate analysis
python /mixer/precimed/mixer.py fit2 \
        --trait1-file AN_31308545.a1effect.munge.rmInDels.uniq.noMHC.csv.gz \
        --trait2-file DHA.a1effect.munge.rmInDels.uniq.noMHC.csv.gz \
        --trait1-params-file AN_31308545_noMHC.fit.rep${SLURM_ARRAY_TASK_ID}.json \
        --trait2-params-file DHA_noMHC.fit.rep${SLURM_ARRAY_TASK_ID}.json \
        --out AN_31308545_vs_DHA_noMHC.fit.rep${SLURM_ARRAY_TASK_ID} \
        --extract /mixer/data/1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.rep${SLURM_ARRAY_TASK_ID}.snps \
        --bim-file /mixer/data/1000G.EUR.QC.@.bim \
        --ld-file  /mixer/data/1000G.EUR.QC.@.run4.ld \
        --lib /mixer/src/build/lib/libbgmg.so       

# Step 2: Apply the model to the entire set of SNPs, without constraining to LDSR/w_hm3.justrs
python mixer/precimed/mixer.py test2 \
        --trait1-file AN_31308545.a1effect.munge.rmInDels.uniq.noMHC.csv.gz \
        --trait2-file DHA.a1effect.munge.rmInDels.uniq.noMHC.csv.gz \
        --load-params-file  AN_31308545_vs_DHA_noMHC.fit.rep${SLURM_ARRAY_TASK_ID}.json \
        --out  AN_31308545_vs_DHA_noMHC.fit.apply.rep${SLURM_ARRAY_TASK_ID} \
        --bim-file /mixer/data/1000G.EUR.QC.@.bim \
        --ld-file  /mixer/data/1000G.EUR.QC.@.run4.ld \
        --lib /mixer/src/build/lib/libbgmg.so

# Step 3: Visualize the results
python /mixer/precimed/mixer_figures.py combine \
        --json AN_31308545_vs_DHA_noMHC.fit.rep@.json \
        --out AN_31308545_vs_DHA_noMHC.fit.combined
python /mixer/precimed/mixer_figures.py combine \
        --json AN_31308545_vs_DHA_noMHC.fit.apply.rep@.json \
        --out AN_31308545_vs_DHA_noMHC.fit.apply.combined
python /mixer/precimed/mixer_figures.py two \
        --json-fit AN_31308545_vs_DHA_noMHC.fit.combined.json \
        --json-test AN_31308545_vs_DHA_noMHC.fit.apply.combined.json \
        --out AN_31308545_vs_DHA_noMHC \
        --statistic mean std

I would like to know why this error message occured and how to fix it. Any help is appreciated. Thank you!

Huifang

Huifang-Xu commented 2 years ago

Hi,

The following links contain the results, scripts, and log files for running bivariate analysis. https://github.com/Huifang-Xu/Practice/blob/main/mixer_bivariate_test_part1.tar.gz https://github.com/Huifang-Xu/Practice/blob/main/mixer_bivariate_test_part2.tar.gz https://github.com/Huifang-Xu/Practice/blob/main/mixer_bivariate_test_part3.tar.gz

Any help is appreciated. Thank you!

Best, Huifang

humanpaingeneticslab commented 2 years ago

Hi Huifang-Xu,

I get that error too. What I do is that I call 'mixer_figures.py combine' in a loop, each time removing one of the rep file; e.g.:

for repID in `seq  1  20`; do
echo "============================"
# try merging without that rep file
mv -v fit.A_vs_B.rep${repID}.json xxx
python3 mixer_figures.py \
  combine \
  --json fit.A_vs_B.rep@.json \
  --out  fit.A_vs_B
mv xxx fit.A_vs_B.rep${repID}.json
done

Now suppose that without rep7 the combine function didn't yield the 'singular matrix' error, then remove it entirely, e.g. mv fit.A_vs_B.rep7.json Xfit.A_vs_B.rep7.json

Then relaunch the combine like usual.

I wish I had a better solution. Kind of amazing to get a singular matrix with all that number crunching :-)

Huifang-Xu commented 2 years ago

Hi,

Thank you for your reply! I found another solution. When I change the parameter allow_singular=False in line 557 of the script precimed/mixer/figures.py to allow_singular=True, the error message does not appear again. But I don't know if changing this parameter will affect the results.

image

Thanks, Huifang

humanpaingeneticslab commented 2 years ago

Ah, yes, much better answer!

ermia1313 commented 1 year ago

I suppose you should change allow_singular=False to allow_singular=True in the following script:

python3.8/site-packages/scipy/stats/_multivariate.py

Huifang-Xu commented 1 year ago

Got it, thank you!