weizhouUMICH / SAIGE

GNU Lesser General Public License v3.0

Abnormal output of SAIGE step 2 with "idstoIncludeFile" parameter #393

Closed XiangyuYe closed 2 years ago

XiangyuYe commented 2 years ago

Hi Wei,

I tried to use SAIGE to run a single-variant association test. However, when I used the "idstoIncludeFile" parameter to restrict the analysis to a subset of variants (10 variants, for example), the output was confusing: a single variant repeated 10 times.

I ran this analysis with R-SAIGE on bioconda, as introduced in #272. The version of SAIGE is "0.44.6.4".

My code is:

step2_SPAtests.R \
    --bgenFile=${BGENFILE} \
    --bgenFileIndex=${BGENFILEINDEX} \
    --idstoIncludeFile=${IDSTOINCLUDEFILE} \
    --chrom=${chr} \
    --minMAF=0.0001 \
    --minMAC=1 \
    --LOCO=FALSE \
    --sampleFile=${sampleIDindosage} \
    --GMMATmodelFile=${output1}.rda \
    --varianceRatioFile=${output1}.varianceRatio.txt \
    --SAIGEOutputFile=${output2}.SAIGE.bgen.genotype.txt \
    --numLinesOutput=1 \
    --IsOutputAFinCaseCtrl=TRUE

My idstoIncludeFile input looks like:

rs9617528
rs715549
rs9617160
rs2186521
rs2027649
rs4911642
rs6010418
rs140378
rs131560
rs7287144
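Before looking at the software itself, it can help to rule out formatting problems in the ID list. The following is a minimal sketch, assuming the file is meant to hold one variant ID per line (an assumption based on how the file is shown here, not something confirmed in this thread). The helper name `normalize_ids` is hypothetical and not part of SAIGE.

```shell
# Hypothetical helper, not part of SAIGE.
# Print the IDs in FILE one per line: spaces, tabs and carriage returns are
# treated as separators, blank lines are dropped, and repeated IDs are
# collapsed (first occurrence kept, original order preserved).
normalize_ids() {
    tr -s ' \t\r' '\n' < "$1" | awk 'NF && !seen[$0]++'
}
```

For example, `normalize_ids "${IDSTOINCLUDEFILE}" > ids.clean.txt` writes a cleaned copy that can be passed to `--idstoIncludeFile`. This only checks the input and is not a fix for the behaviour reported in this issue.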

The output file looks like this (the remaining lines are all the same as this one):

CHR POS SNPID Allele1 Allele2 AC_Allele2 AF_Allele2 imputationInfo N BETA SE Tstat p.value p.value.NA Is.SPA.converge varT varTstar AF.Cases AF.Controls
22 16061016 22:16061016_T_C T C 158035.882353002 0.254201221100026 0.366613502966512 310848 -0.0079956036716851 0.0756549705372079 -1.39693540890738 0.915832196899293 0.915832196899293 1 174.712938042984 178.070185525266 0.253752956435493 0.254203081852324
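The symptom can be quantified by comparing the number of data rows with the number of distinct SNPID values. The sketch below assumes a whitespace-delimited file with one header line and SNPID in the third column, as in the header shown above. The function names are hypothetical helpers, not part of SAIGE.

```shell
# Hypothetical helpers; the SNPID column position (3) is taken from the
# output header shown in this issue.

# Print the number of data rows (every line after the header).
count_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Print the number of distinct values in the SNPID column.
count_distinct_snpids() {
    awk 'NR > 1 && !seen[$3]++ { n++ } END { print n + 0 }' "$1"
}

# Print each SNPID that occurs more than once, followed by its count.
list_repeated_snpids() {
    awk 'NR > 1 { c[$3]++ } END { for (id in c) if (c[id] > 1) print id, c[id] }' "$1"
}
```

Calling `count_rows` and `count_distinct_snpids` on `${output2}.SAIGE.bgen.genotype.txt` should give equal numbers for a healthy run. For the output described here, the row count would be 10 while the distinct count would be 1.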

I would appreciate it if you could help me fix this. Thank you!

Best regards, Xiangyu Ye

weizhouUMICH commented 2 years ago

Hi Xiangyu,

Thanks for reporting the issue. I think the current bioconda version has a problem with variant queries. Could you please try installing SAIGE locally?

Thanks, Wei

matuskosut commented 2 years ago

@XiangyuYe could you provide the exact package versions, with build numbers, from conda? I need a little more than 0.44.6.4. I noticed that they have been rebuilding packages on their own with different R versions, which could have caused some issues.

Here is an example of how to list them:

conda list | grep -i saige
conda list | grep -i r-base

XiangyuYe commented 2 years ago

The commands return the following:

$ conda list | grep -i saige
# packages in environment at /home/miniconda3/envs/saige:
r-saige                   0.44.6.5          r40h6d4de14_0    bioconda
$ conda list | grep -i base
libwebp-base              1.2.1                h7f98852_0    conda-forge
r-base                    4.0.5                h9e01966_1    conda-forge
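Because `grep -i base` is a substring match, it also picks up `libwebp-base`. A minimal sketch of an exact-name filter over `conda list` output follows. It assumes the column layout shown above (name, version, build, channel), and `pkg_builds` is a hypothetical helper name.

```shell
# Hypothetical helper: read `conda list` output on stdin and print
# "name version build" for packages whose name exactly matches one of the
# comma-separated names given as the first argument.
pkg_builds() {
    awk -v names="$1" '
        BEGIN {
            n = split(names, a, ",")
            for (i = 1; i <= n; i++) want[a[i]] = 1
        }
        # Skip comment lines such as "# packages in environment at ..."
        !/^#/ && ($1 in want) { print $1, $2, $3 }
    '
}
```

Usage would be `conda list | pkg_builds r-saige,r-base`, which prints only those two packages with their build strings.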

weizhouUMICH commented 2 years ago

Hi all! We have just released a new version, 1.0.0. It includes computational efficiency improvements in both Step 1 and Step 2, for single-variant and set-based tests. We have created a new GitHub page for the program at https://github.com/saigegit/SAIGE, with documentation at https://saigegit.github.io/SAIGE-doc/

The program will be maintained there by multiple SAIGE developers. Please feel free to try version 1.0.0 and report any issues.

Thanks! Wei