jianyangqt / gcta

GCTA software
GNU General Public License v3.0

Can't converge – hit the upper limit #44

Closed jhylwq123 closed 1 year ago

jhylwq123 commented 1 year ago

Thanks for developing this fast and nice tool! I am using fastGWA-mlm to run a GWAS, and the log file is below.

--bfile /public/home/jiahanying/project/dkd_gwas/ASA_heathy_control/stage2_ctr_impute/ctr_dn_impute/QC/ASA_hc_dn_imputated_4 --grm-sparse ASA_hc_dn_sp_grm --pheno pheo_dn.txt --covar c_covar_dn.txt --qcovar q_covar_dn.txt --thread-num 16 --fastGWA-mlm --out mlm_dn_hc_s2

The program will be running with up to 16 threads.
Reading PLINK FAM file from [/public/home/jiahanying/project/dkd_gwas/ASA_heathy_control/stage2_ctr_impute/ctr_dn_impute/QC/ASA_hc_dn_imputated_4.fam]...
5263 individuals to be included from FAM file.
Reading phenotype data from [pheo_dn.txt]...
5263 overlapping individuals with non-missing data to be included from the phenotype file.
5263 individuals to be included. 2363 males, 2900 females, 0 unknown.
Reading PLINK BIM file from [/public/home/jiahanying/project/dkd_gwas/ASA_heathy_control/stage2_ctr_impute/ctr_dn_impute/QC/ASA_hc_dn_imputated_4.bim]...
5203691 SNPs to be included from BIM file(s).
Reading quantitative covariates from [q_covar_dn.txt].
2 covariates of 5263 samples to be included.
Reading discrete covariates from [c_covar_dn.txt].
1 covariates of 5263 samples to be included.
2 qcovar, 1 covar and 0 rcovar to be included.
5263 common individuals among the covariate files to be included.
5263 overlapping individuals with non-missing data to be included from the covariate file(s).
Reading the sparse GRM file from [ASA_hc_dn_sp_grm]...
After matching all the files, 5263 individuals to be included in the analysis.
Estimating the genetic variance (Vg) by fastGWA-REML (grid search)...
Iteration 1, step size: 0.00245827, logL: 2498.92. Vg: 0.245827, searching range: 0.243369 to 0.245827
Iteration 2, step size: 0.000163885, logL: 2498.92. Vg: 0.245827, searching range: 0.245663 to 0.245827
Iteration 3, step size: 1.09256e-05, logL: 2498.92. Vg: 0.245827, searching range: 0.245816 to 0.245827
Iteration 4, step size: 7.28376e-07, logL: 2498.92. Vg: 0.245827, searching range: 0.245826 to 0.245827
Iteration 5, step size: 4.85584e-08, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 6, step size: 3.23723e-09, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 7, step size: 2.15815e-10, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 8, step size: 1.43877e-11, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 9, step size: 9.59179e-13, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 10, step size: 6.39451e-14, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 11, step size: 4.26326e-15, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 12, step size: 2.84957e-16, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Iteration 13, step size: 3.70074e-17, logL: 2498.92. Vg: 0.245827, searching range: 0.245827 to 0.245827
Best guess Vg range: 0.245827016062601 to 0.245827016062601, Vp: 0.153641885039126
Error: fastGWA-REML can't converge – hit the upper limit!
An error occurs, please check the options or data

Could you tell me what this error means?

longmanz commented 1 year ago

Hi, thank you for using GCTA-fastGWA.

Based on your log file, the estimated genetic variance of your phenotype (Vg ≈ 0.246) is larger than the phenotypic variance (Vp ≈ 0.154), so the implied "heritability (h2)" is greater than 1. This usually happens when the cohort contains only a small number of related individuals, which makes the estimate of h2 unstable.
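As a quick sanity check on the numbers in the log, the ratio Vg/Vp can be computed directly from the "Best guess Vg range" line. The snippet below is only a sketch: the function name `vg_vp_ratio` is made up for illustration, and the regular expression assumes the log line has exactly the layout shown in the log above.

```python
import re

# The "Best guess" line copied from the log in this issue.
LOG_LINE = ("Best guess Vg range: 0.245827016062601 to 0.245827016062601, "
            "Vp: 0.153641885039126")

NUMBER = r"([0-9.eE+-]+)"
PATTERN = re.compile(rf"Vg range: {NUMBER} to {NUMBER}, Vp: {NUMBER}")


def vg_vp_ratio(line):
    """Parse a 'Best guess Vg range' line and return (vg_low, vg_high, vp, ratio).

    `ratio` is vg_high / vp, i.e. the heritability implied by the upper end
    of the reported Vg range. The line layout is an assumption based on the
    log shown in this thread.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a 'Best guess Vg range' line")
    vg_low, vg_high, vp = (float(x) for x in match.groups())
    return vg_low, vg_high, vp, vg_high / vp


if __name__ == "__main__":
    vg_low, vg_high, vp, ratio = vg_vp_ratio(LOG_LINE)
    print(f"Vg = {vg_high:.6f}, Vp = {vp:.6f}, Vg/Vp = {ratio:.3f}")
```

For this log the ratio comes out at about 1.6, well above 1, which is consistent with the grid search stopping at its upper bound rather than at an interior maximum.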

Can you do a quick check on your sparse GRM by counting its rows (e.g., in Linux, "wc -l combine_hc_s2_sp_grm.grm.sp")? The number of rows should equal "sample_size + number_of_related_pairs". If you find this number is:

  1. very close to 5263 (your sample size), then you do not have enough related pairs for fastGWA. In that case, I recommend trying GCTA-MLMa or -MLMe (https://yanglab.westlake.edu.cn/software/gcta/index.html#MLMA), which use the full dense GRM. These two methods do not suffer from this issue.

  2. much larger than 5263 (e.g., several times larger), then your sparse GRM has a problem. This usually happens when the GRM was calculated using (1) rare variants or (2) individuals from multiple ethnicities. You will need to redo the quality control of your genotypes before recalculating the GRM.
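The row-count check described above can also be scripted. The following is a minimal sketch that applies the "sample_size + number_of_related_pairs" relation stated in this reply; the function name `count_related_pairs` and the example path are hypothetical, and the only assumption about the file is that it holds one entry per non-empty line.

```python
def count_related_pairs(sparse_grm_path, sample_size):
    """Return (n_rows, n_related_pairs) for a sparse GRM text file.

    Uses the relation from the reply above:
        n_rows = sample_size + number_of_related_pairs
    Blank lines are ignored. Nothing else about the file format is assumed.
    """
    with open(sparse_grm_path) as handle:
        n_rows = sum(1 for line in handle if line.strip())
    return n_rows, n_rows - sample_size


if __name__ == "__main__":
    import sys

    # Example invocation (path and sample size are placeholders):
    #   python count_pairs.py my_sparse_grm.grm.sp 5263
    path, n_samples = sys.argv[1], int(sys.argv[2])
    rows, pairs = count_related_pairs(path, n_samples)
    print(f"{rows} rows, {pairs} related pairs for {n_samples} samples")
```

A related-pair count near zero points to case 1, while a count several times the sample size points to case 2.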

Let me know if you still have issues with your GWAS analysis. If so, I will re-open this issue.