weizhou0 / qtl

GNU General Public License v3.0
22 stars 2 forks source link

GRM question #18

Open liuxiaZzz opened 2 months ago

liuxiaZzz commented 2 months ago

Hi, I applied this tool to a big public dataset and successfully got outcomes among several cell types. My successful step1 command is like Rscript /home/users/scratch/qtl/extdata/step1_fitNULLGLMM_qtl.R \ --useSparseGRMtoFitNULL=FALSE \ --useGRMtoFitNULL=FALSE \ --phenoFile=/home/users/scratch/eqtl/input/${cell_type}/${cell_type}.txt \ --phenoCol=$gene_id \ --covarColList=Age,Sex \ --sampleCovarColList=Age,Sex \ --sampleIDColinphenoFile=Sample_ID \ --traitType=count \ --outputPrefix=/home/users/scratch/eqtl/step1-3/cis/step1/${cell_type}/output/${celltype}$gene_id \ --skipVarianceRatioEstimation=FALSE \ --isRemoveZerosinPheno=FALSE \ --isCovariateOffset=FALSE \ --isCovariateTransform=TRUE \ --skipModelFitting=FALSE \ --tol=0.00001 \ --plinkFile=/home/users/scratch/eqtl/input/${celltype}/chr/ATGC${cell_type}_Asian_sle \ --nThreads=1 \ --IsOverwriteVarianceRatioFile=TRUE

But when I tried to use GRM to fit the null model in step1 with command like this, Rscript /home/users/scratch/qtl/extdata/step1_fitNULLGLMM_qtl.R \ --useGRMtoFitNULL=TRUE \ --phenoFile=/home/users/scratch/eqtl/input/${cell_type}/${cell_type}.txt \ --phenoCol=$gene_id \ --covarColList=Age,Sex \ --sampleCovarColList=Age,Sex \ --sampleIDColinphenoFile=Sample_ID \ --traitType=count \ --outputPrefix=/home/users/scratch/eqtl/step1-3/cis/step1/${cell_type}/output/${celltype}$gene_id \ --skipVarianceRatioEstimation=FALSE \ --isRemoveZerosinPheno=FALSE \ --isCovariateOffset=FALSE \ --isCovariateTransform=TRUE \ --skipModelFitting=FALSE \ --tol=0.00001 \ --plinkFile=/home/users/scratch/eqtl/input/${celltype}/chr/ATGC${cell_type}_Asian_sle \ --nThreads=1 \ --IsOverwriteVarianceRatioFile=TRUE

I got a warning like "WARNING: Genetic variants needs to be ordered by chromosome and position in the Plink file" and error at the end like

[1] "Genotype reading is done" WARNING: Genetic variants needs to be ordered by chromosome and position in the Plink file chromosomeStartIndexVec: NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA chromosomeEndIndexVec: NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA WARNING: The number of autosomal chromosomes is less than 2 and leave-one-chromosome-out can't be conducted! startIndex_vec -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 tauInit[1] 0 0 fixtau 1 0 0 tauInit 1 0

I also tried the example of step1 in your document but changed --useGRMtoFitNULL=TRUE Rscript step1_fitNULLGLMM_qtl.R \ --useSparseGRMtoFitNULL=FALSE \ --useGRMtoFitNULL=TRUE \ --phenoFile=./input/seed_1_100_nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_Poisson.txt \ --phenoCol=gene_1 \ --covarColList=X1,X2,pf1,pf2 \ --sampleCovarColList=X1,X2 \ --sampleIDColinphenoFile=IND_ID \ --traitType=count \ --outputPrefix=./output/nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_gene_1 \ --skipVarianceRatioEstimation=FALSE \ --isRemoveZerosinPheno=FALSE \ --isCovariateOffset=FALSE \ --isCovariateTransform=TRUE \ --skipModelFitting=FALSE \ --tol=0.00001 \ --plinkFile=./input/n.indep_100_n.cell_1_01.step1 \ --IsOverwriteVarianceRatioFile=TRUE

Similar warning information was output like

WARNING: Genetic variants needs to be ordered by chromosome and position in the Plink file chromosomeStartIndexVec: 0 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA chromosomeEndIndexVec: 768 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

And similar error at the end

0 markers with MAF >= 0.01 and missing rate <= 0.15 time: 23.299 [1] "Genotype reading is done" tauInit[1] 0 0 fixtau 1 0 0 tauInit 1 0 Error in if (sum(tauInit[idxtau]) == 0) { : missing value where TRUE/FALSE needed Calls: fitNULLGLMM_multiV -> system.time -> glmmkin.ai_PCG_Rcpp_multiV Timing stopped at: 0.239 0.001 0.257 Execution halted

In summary, I am a little confused about the use of GRM. Can SAIGE-QTL compute GRM automatically? When should I use GRM? If I want to call eQTLs from a specific cell type, how should I compute the pre-compute GRM?

Thanks for your patience and look forward to your reply!

liuxiaZzz commented 2 months ago

Update a solution. I created a new environment to install SAIGE-GENE then use createSparseGRM.R to generate sparse GRM. After that, I return to the SAIGE-QTL and uncommented line 988 in SAIGE_fitGLMM_fast_multiV.R, which made the length of tauInit consistent with the length of fixtau. Now it runs well.