weizhouUMICH / SAIGE

"Error in X %*% Z : non-conformable arguments" error for Continuous Trait Analysis #318

Closed: bensesbg closed this issue 2 years ago

bensesbg commented 3 years ago

Hello! I am looking for help working out what could be going wrong in a continuous trait analysis. The circumstances are very similar to an issue I recently submitted here: https://github.com/weizhouUMICH/SAIGE/issues/316

However, this run fails with a "non-conformable arguments" error rather than producing the empty output file reported in that issue, so we are filing it separately to make tracking easier.

As before, this is a large-scale population (200k+) WES VCF for a single chromosome. The error occurs on a different chromosome VCF than in the other GitHub issue, but the files were pre-processed in the same manner: splitting a WES VCF into chromosome chunks with bcftools and PLINK2, masking genotypes (set to missing, "./.") based on the GQ and DP FORMAT field values, stripping all FORMAT fields other than GT, splitting and realigning multi-allelic sites, removing some variants based on GT missingness/HWE cutoffs, and recalculating the INFO fields. Of the 13 duplicate variants created across the chunks during multi-allelic splitting/realignment, none occurred in the VCF chunk used for this run.
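Because the VCF is rewritten several times during this kind of pre-processing, one thing that may be worth ruling out is a stale tabix index: the step 2 log in this post prints an htslib warning that chr10.vcf.gz.tbi is older than chr10.vcf.gz. The following is a minimal sketch of such a check, not a confirmed cause of the error. The helper name `index_is_stale` is made up for illustration, and `tabix` is assumed to be on the PATH.

```shell
# Return 0 (true) when the index ($2) is missing or older than the data file ($1).
index_is_stale() {
    [ ! -e "$2" ] || [ "$2" -ot "$1" ]
}

vcf="chr10.vcf.gz"   # file name taken from the step 2 log
if [ -e "$vcf" ] && index_is_stale "$vcf" "$vcf.tbi"; then
    echo "index for $vcf is stale; rebuilding"
    tabix -f -p vcf "$vcf"   # -f overwrites the existing .tbi
fi
```

Rebuilding the index only silences the warning if the VCF itself is intact, so it is a sanity check rather than a fix.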

The output for step 2 in this run is as follows:

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] SAIGE_0.43.3

loaded via a namespace (and not attached):
[1] compiler_3.6.3 Matrix_1.2-18 Rcpp_1.0.5 grid_3.6.3
[5] RcppParallel_5.0.2 lattice_0.20-40

$vcfFile [1] "chr10.vcf.gz"

$vcfFileIndex [1] "chr10.vcf.gz.tbi"

$vcfField [1] "GT"

$bgenFile [1] ""

$bgenFileIndex [1] ""

$savFile [1] ""

$savFileIndex [1] ""

$idstoExcludeFile [1] ""

$idstoIncludeFile [1] ""

$rangestoExcludeFile [1] ""

$rangestoIncludeFile [1] ""

$chrom [1] "0"

$start [1] 1

$end [1] 2.5e+08

$IsDropMissingDosages [1] FALSE

$minMAF [1] 0

$minMAC [1] 0

$maxMAFforGroupTest [1] 0.01

$minInfo [1] 0

$sampleFile [1] ""

$GMMATmodelFile [1] "chr10.rda"

$varianceRatioFile [1] "chr10.varianceRatio.txt"

$SAIGEOutputFile [1] "chr10_saige_step2.txt"

$numLinesOutput [1] 10000

$IsSparse [1] TRUE

$SPAcutoff [1] 2

$IsOutputAFinCaseCtrl [1] FALSE

$IsOutputNinCaseCtrl [1] FALSE

$IsOutputHetHomCountsinCaseCtrl [1] FALSE

$LOCO [1] FALSE

$condition [1] ""

$sparseSigmaFile [1] ""

$groupFile [1] "chr10.gF.PTV"

$kernel [1] "linear.weighted"

$method [1] "optimal.adj"

$weights.beta.rare [1] "1,25"

$weights.beta.common [1] "1,25"

$weightMAFcutoff [1] 0.01

$r.corr [1] "0"

$IsSingleVarinGroupTest [1] TRUE

$cateVarRatioMinMACVecExclude [1] "0.5,1.5,2.5,3.5,4.5,5.5,10.5,20.5"

$cateVarRatioMaxMACVecInclude [1] "1.5,2.5,3.5,4.5,5.5,10.5,20.5"

$dosageZerodCutoff [1] 0.2

$IsOutputPvalueNAinGroupTestforBinary [1] FALSE

$IsAccountforCasecontrolImbalanceinGroupTest [1] TRUE

$weightsIncludeinGroupFile [1] FALSE

$IsOutputBETASEinBurdenTest [1] TRUE

$sampleFile_male [1] ""

$X_PARregion [1] ""

$is_rewrite_XnonPAR_forMales [1] FALSE

$help [1] FALSE

weights.beta.rare is 1 25 weights.beta.common is 1 25 cateVarRatioMinMACVecExclude is 0.5 1.5 2.5 3.5 4.5 5.5 10.5 20.5 cateVarRatioMaxMACVecInclude is 1.5 2.5 3.5 4.5 5.5 10.5 20.5 group-based test will be performed Any dosages <= 0.2 for genetic variants with MAC <= 10 are set to be 0 in group tests Garbage collection 18 = 8+2+8 (level 2) ... 87.8 Mbytes of cons cells used (60%) 216.9 Mbytes of vectors used (65%) 160537 samples have been used to fit the glmm null model [1] "Leave-one-chromosome-out is not applied" variance Ratio is 1 1 1 1 1 1 1 1 isCondition is FALSE [W::hts_idx_load2] The index file is older than the data file: chr10.vcf.gz.tbi [W::hts_idx_load2] The index file is older than the data file: chr10.vcf.gz.tbi Open VCF done To read the field GT Number of meta lines in the vcf file (lines starting with ##): 46 Number of samples in the vcf file: 200643 200643 sample IDs are found in the vcf file [1] 200643 4 [1] "IID" "IndexInModel" "IndexDose.x" "IndexDose.y" 160537 samples were used in fitting the NULL glmm model and are found in sample file sparse kinship matrix is not used Missing dosages will be mean imputed for the analysis Analysis started at 1612484988 Seconds As minMAC is set to be 0, minMAC = 0.5 will be used minMAC: 0.5 minMAF: 0 Minimum MAF of markers to be tested is 1.557273e-06 It is a quantitative trait isCondition is FALSE Analysis started at 1612484988 Seconds genetic variants with 1.557273e-06 <= MAF <= 0.01 are included for gene-based tests It is a quantitative trait isCondition is FALSE geneID: HPSE2 [W::hts_idx_load2] The index file is older than the data file: chr10.vcf.gz.tbi [W::hts_idx_load2] The index file is older than the data file: chr10.vcf.gz.tbi std::size_t sample_size = marker_file.samples().size();200643 missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! 
missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! missing_cnt > 0! cntMarker: 22 isCondition is FALSE weights: 24.99439 24.99813 24.99439 24.64733 24.98692 24.99813 24.99439 24.94567 24.99813 24.99813 24.99626 24.99813 24.99439 24.99813 24.99813 24.99066 24.99813 24.99626 24.99813 24.99253 24.99813 24.99439 $p.value [1] 0.03045393

$param
$param$p.val.each [1] 0.11100658 0.10795153 0.09938891 0.08693413 0.05882423 0.03660336 0.02180885

$param$q.val.each [1] 234641.9 238123.5 248568.3 265976.2 321681.6 408721.2 582452.3

$param$rho [1] 0.00 0.01 0.04 0.09 0.25 0.50 1.00

$param$minp [1] 0.02180885

$param$rho_est [1] 1

$p.value.resampling NULL

$Phi_sum [1] 221508.8

$Score_sum [1] -1079.63

$IsMeta [1] TRUE

$markerNumbyMAC [1] 10 2 5 1 1 1 0 2

$m [1] 22

$indexNeg integer(0)

time for SAIGE_SKAT_withRatioVec
user system elapsed
1.204 0.748 0.848
[1] "OK1"
Error in X %*% Z : non-conformable arguments
Calls: SPAGMMATtest ... groupTest -> scoreTest_SAIGE_quantitativeTrait_sparseSigma

Execution halted
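In the log above, the run reaches geneID HPSE2, the set-based test returns a p-value, and the failure only appears afterwards in scoreTest_SAIGE_quantitativeTrait_sparseSigma. For longer runs it can help to pull out which gene set was active when the error was raised. The sketch below assumes the step 2 output was captured to a file with one message per line; the file name `step2.log` and the helper `last_gene_before_error` are hypothetical.

```shell
# Print the gene set that was being tested when the first R error appeared.
# Assumes "geneID: <name>" and "Error in ..." each start their own log line.
last_gene_before_error() {
    awk '/^geneID:/  { gene = $2 }
         /^Error in/ { print gene; exit }' "$1"
}

log="step2.log"   # hypothetical file holding the captured step 2 output
if [ -f "$log" ]; then
    last_gene_before_error "$log"
fi
```

Restricting the group file to that single gene is one way to get a small reproducible case to attach to a report.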

The commands for generating the GLMM model were the same as those used in the other GitHub issue:

Rscript step1_fitNULLGLMM.R \
    --plinkFile=chr10 \
    --phenoFile=pheno.csv \
    --phenoCol=pheno \
    --traitType=quantitative \
    --invNormalize=TRUE \
    --covarColList=sex,age,PC1-10 \
    --sampleIDColinphenoFile=EID \
    --nThreads=64 \
    --LOCO=FALSE \
    --outputPrefix=./chr10 \
    --IsSparseKin=TRUE \
    --sparseGRMFile=relatednessCutoff_0.125_2000_randomMarkersUsed.sparseGRM.mtx \
    --sparseGRMSampleIDFile=relatednessCutoff_0.125_2000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt \
    --isCateVarianceRatio=TRUE

Thank you very much for any assistance you might be able to provide regarding this.

chenming9453 commented 3 years ago

Hi, I actually got the same error. I think it is because they are changing the code right now. Hope to hear from them soon about the solution to this problem!

weizhouUMICH commented 3 years ago

Hi @chenming9453 and @bensesbg,

This issue has been fixed as of version 0.44.1 (Feb 16). Please feel free to try it.

Thanks, Wei
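The session info in the original report shows SAIGE_0.43.3, which predates the fixed release. A quick way to confirm which version a given environment actually loads is sketched below. It assumes `Rscript` is on the PATH and GNU `sort -V` is available; the helper name `version_ge` is made up for illustration.

```shell
# True when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed_in="0.44.1"   # release named by the maintainer as containing the fix
# Ask R for the installed SAIGE version; empty if R or SAIGE is unavailable.
installed="$(Rscript -e 'cat(as.character(packageVersion("SAIGE")))' 2>/dev/null || true)"

if [ -z "$installed" ]; then
    echo "could not determine the installed SAIGE version"
elif version_ge "$installed" "$fixed_in"; then
    echo "SAIGE $installed includes the fix"
else
    echo "SAIGE $installed predates $fixed_in; upgrade before rerunning step 2"
fi
```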

weizhouUMICH commented 2 years ago

We have just released a new version, 1.0.0. It brings substantial computational efficiency improvements to both Step 1 and Step 2, for single-variant and set-based tests, along with clearer log output. We have created a new GitHub page for the program, https://github.com/saigegit/SAIGE, with documentation at https://saigegit.github.io/SAIGE-doc/. The program will be maintained there by multiple SAIGE developers. The docker image has been updated. Please feel free to try version 1.0.0 and report any issues.

Thanks! Wei