weizhouUMICH / SAIGE

GNU Lesser General Public License v3.0

Step 2 got stuck with no error in log #100

Status: Closed (hanyi003 closed 4 years ago)

hanyi003 commented 5 years ago

Hi,

I am trying to run a SNP association analysis on UKBB for chr22. Step 1 was very fast, but step 2 has already been running for 3 days. The log below looks OK to me, but the job seems to be stuck; it is currently using only 1.5% CPU. Is there anything wrong?

Many thanks, Yi

$dosageFile [1] ""

$dosageFileNrowSkip [1] 0

$dosageFileNcolSkip [1] 5

$dosageFilecolnamesSkip [1] ""

$vcfFile [1] ""

$vcfFileIndex [1] ""

$vcfField [1] "DS"

$bgenFile [1] "/staging/UKBB/ukb_imp_chr22_v3.bgen"

$bgenFileIndex [1] "/staging/UKBB/ukb_imp_chr22_v3.bgen.bgi"

$savFile [1] ""

$savFileIndex [1] ""

$chrom [1] ""

$start [1] 1

$end [1] 2.5e+08

$minMAF [1] 1e-04

$minMAC [1] 1

$sampleFile [1] "/staging/UKBB/ukb_imp_chr22_v3_Yi.sample"

$GMMATmodelFile [1] "./output/chr22_aa_all_yi_test.rda"

$varianceRatioFile [1] "./output/chr22_aa_all_yi_test.varianceRatio.txt"

$SAIGEOutputFile [1] "./output/UKBB_aa_all_yi_chr22.SAIGE.bgen.txt"

$numLinesOutput [1] 2

$IsOutputAFinCaseCtrl [1] TRUE

$LOCO [1] FALSE

$verbose [1] FALSE

$help [1] FALSE

471746 samples have been used to fit the glmm null model
obj.glmm.null$LOCO: FALSE
Leave-one-chromosome-out option is not applied
variance Ratio is 1
487409 sample IDs are found in sample file
[1] 487409 10
[1] "IID" "IndexInModel" "V2.x" "V3.x" "V4.x"
[6] "IndexDose.x" "V2.y" "V3.y" "V4.y" "IndexDose.y"
471746 samples were used in fitting the NULL glmm model and are found in sample file
minMAC: 1
minMAF: 1e-04
Minimum MAF of markers to be testd is 1e-04
Analysis started at 1.56e+09 Seconds
no query list is provided
487409 samples are found in the bgen file
1255683 markers are found in the bgen file

weizhouUMICH commented 5 years ago

Is there any other information in the log file? --numLinesOutput = 2, so it is supposed to write out the results after every two markers. Also, which version of SAIGE are you using? Thanks!
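One way to check whether step 2 is actually writing results is to count the rows in the output file and compare that with the marker count reported in the log. The following is only a rough sketch in Python, not part of SAIGE: the helper names `count_result_rows` and `progress_fraction` are hypothetical, and it assumes the output file has a single header line followed by one row per tested marker. Markers filtered out by minMAF/minMAC would not appear in the output, so the fraction is at best a lower bound on progress.

```python
from pathlib import Path


def count_result_rows(path, header_lines=1):
    """Count non-empty data rows written so far; return 0 if the file is missing.

    Assumes (hypothetically) one header line and one row per tested marker.
    """
    p = Path(path)
    if not p.exists():
        return 0
    with p.open() as fh:
        rows = sum(1 for line in fh if line.strip())
    return max(rows - header_lines, 0)


def progress_fraction(rows_written, total_markers):
    """Return rows_written / total_markers, capped at 1.0."""
    if total_markers <= 0:
        raise ValueError("total_markers must be positive")
    return min(rows_written / total_markers, 1.0)


if __name__ == "__main__":
    # Path and marker count are taken from the step 2 log in this thread.
    output_file = "./output/UKBB_aa_all_yi_chr22.SAIGE.bgen.txt"
    total_markers = 1255683
    rows = count_result_rows(output_file)
    print(f"{rows} rows written, "
          f"about {100 * progress_fraction(rows, total_markers):.2f}% of markers")
```

If the row count stays at zero, or stops changing between two checks some minutes apart, the job is probably stalled rather than merely slow.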

hanyi003 commented 5 years ago

No, this is all the info in the log. What value would you suggest for numLinesOutput? I am using SAIGE-0.29.5. I can change the parameter and resubmit the job.

Many thanks Yi

hanyi003 commented 5 years ago

Hi Zhou Wei,

I still cannot solve the problem. Could there be an error in step 1? And what would be a good value for "numLinesOutput" in step 2? The step 1 log is below:

Thanks, Yi

Loading required package: optparse

$plinkFile [1] "/staging/ukb_cal_chr22_v2_remove"

$phenoFile [1] "/staging/ukb.txt"

$phenoCol [1] "aa_all_yi"

$traitType [1] "binary"

$invNormalize [1] FALSE

$covarColList [1] "age,male,genotyping_array"

$sampleIDColinphenoFile [1] "IID"

$numMarkers [1] 30

$nThreads [1] 4

$skipModelFitting [1] FALSE

$traceCVcutoff [1] 1

$ratioCVcutoff [1] 1

$LOCO [1] FALSE

$outputPrefix [1] "./output/chr22_aa_all_yi_test"

$help [1] FALSE

4 threads are set to be used
487409 samples have genotypes
formula is aa_all_yi~age+male+genotyping_array
472681 samples have non-missing phenotypes
15663 samples in geno file do not have phenotypes
471746 samples will be used for analysis
colnames(data.new) is Y minus1 age male genotyping_array
out.transform$Param.transform$qrr: 4 4
aa_all_yi is a binary trait

Call: glm(formula = formula.new, family = binomial, data = data.new)

Coefficients:
 minus1      age     male  genotyping_array
3.72048  0.74633  0.71776           0.06788

Degrees of Freedom: 471746 Total (i.e. Null); 471742 Residual
Null Deviance: 654000
Residual Deviance: 134800 AIC: 134800

[1] "Start reading genotype plink file here"
nbyte: 121853
nbyte: 117937
reserve: 1529432960

M: 12968, N: 487409
size of genoVecofPointers: 1
here
setgeno mark1
setgeno mark2
setgeno mark5
setgeno mark6
time: 75640
[1] "Genotype reading is done"
inital tau is 1 0.5
iGet_Coef: 1
iter from getPCG1ofSigmaAndVector 41
iter from getPCG1ofSigmaAndVector 37

warning: inv_sympd(): given matrix is not symmetric
Tau:
[1] 1.0 0.5
Fixed-effect coefficients:
           [,1]
[1,] 3.72009110
[2,] 0.75179225
[3,] 0.71415800
[4,] 0.02394382
iGet_Coef: 2
iter from getPCG1ofSigmaAndVector 51
iter from getPCG1ofSigmaAndVector 42
iter from getPCG1ofSigmaAndVector 44

warning: inv_sympd(): given matrix is not symmetric
Tau:
[1] 1.0 0.5
Fixed-effect coefficients:
           [,1]
[1,] 3.77249861
[2,] 0.75585616
[3,] 0.71737510
[4,] 0.02595981
iter from getPCG1ofSigmaAndVector 41
iter from getPCG1ofSigmaAndVector 42
Variance component estimates:
[1] 1.000000 0.498443

Iteration 1 1 0.498443 :
iGet_Coef: 1
iter from getPCG1ofSigmaAndVector 41
iter from getPCG1ofSigmaAndVector 36

warning: inv_sympd(): given matrix is not symmetric
Tau:
[1] 1.000000 0.498443
Fixed-effect coefficients:
           [,1]
[1,] 3.72009420
[2,] 0.75179338
[3,] 0.71415997
[4,] 0.02398654
iter from getPCG1ofSigmaAndVector 42
iter from getPCG1ofSigmaAndVector 43

Final 1 0 :
iGet_Coef: 1
iter from getPCG1ofSigmaAndVector 1
iter from getPCG1ofSigmaAndVector 1

warning: inv_sympd(): given matrix is not symmetric
Tau:
[1] 1 0
Fixed-effect coefficients:
          [,1]
[1,] 3.7204731
[2,] 0.7463344
[3,] 0.7177603
[4,] 0.0678791
iGet_Coef: 2
iter from getPCG1ofSigmaAndVector 1
iter from getPCG1ofSigmaAndVector 1

warning: inv_sympd(): given matrix is not symmetric
Tau:
[1] 1 0
Fixed-effect coefficients:
           [,1]
[1,] 3.72047400
[2,] 0.74633265
[3,] 0.71776235
[4,] 0.06787937

Family: binomial
Link function: logit

 [1] "theta" "coefficients" "linear.predictors"
 [4] "fitted.values" "Y" "residuals"
 [7] "cov" "converged" "sampleID"
[10] "obj.noK" "obj.glm.null" "traitType"
[13] "LOCO"
4 471746x1 2x1
iter from getPCG1ofSigmaAndVector 1
iter from getPCG1ofSigmaAndVector 1
i is 6956
iter from getPCG1ofSigmaAndVector 1
CV for variance ratio estimate using 30 markers is 2.002744e-09 < 1
varRatio 1
[1] 1
closed the plinkFile!

weizhouUMICH commented 5 years ago

Hi Yi,

Step 1 seems fine. Have you tried the example data in the extdata folder? Also, would you mind trying the new version of SAIGE, 0.35.5.3?

Thanks, Wei

weizhouUMICH commented 4 years ago

I'm closing this issue because it has been inactive for a few months. Please reopen it if you still encounter this problem with the latest version, 0.36.3.1. Thank you! Wei