error in running "STAARpipeline_Null_Model.r"

xihaoli / STAARpipeline-Tutorial

The tutorial for performing single-/multi-trait association analysis of whole-genome/whole-exome sequencing (WGS/WES) studies using FAVORannotator, STAARpipeline and STAARpipelineSummary

GNU General Public License v3.0

21 stars 17 forks source link

error in running "STAARpipeline_Null_Model.r" #43

Closed Jerryliu0024 closed 4 months ago

Jerryliu0024 commented 7 months ago

Hi Xihao,a very impressive job!!but when I run the program to this point--“STAARpipeline_Null_Model.r”，I get this error Iteration 20 : Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'chol2inv': error in evaluating the argument 'x' in selecting a method for function 't': leading principal minor of order 1 is not positive Calls: fit_nullmodel ... .Call -> .handleSimpleError -> h -> .handleSimpleError -> h In addition: Warning message: In .local(A, ...) : CHOLMOD warning 'not positive definite' at file '../Cholesky/t_cholmod_rowfac.c', line 430 Execution halted

When generating the GRM matrix, I input the bed file that I completed pruning on Plink myself, and then follow your process to get a sparse matrix,I don't know how to fix it

Jerryliu0024 commented 7 months ago

I doubt if my GRM matrix is causing this problem，so i make a new one by tools:gcta(I think users who asked questions before did this)，but i still get this error in Iteration 9 Iteration 9 : Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'chol2inv': error in evaluating the argument 'x' in selecting a method for function 't': leading principal minor of order 2430 is not positive Calls: fit_nullmodel ... .Call -> .handleSimpleError -> h -> .handleSimpleError -> h In addition: Warning message: In .local(A, ...) : Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 430 Execution halted

Jerryliu0024 commented 7 months ago

Xihao,and i have an another question,Can phenotypes be logical? If so, what changes will the model make？Can we directly set phenoadj.norm to 0 or 1？

xihaoli commented 7 months ago

Hi @Jerryliu0024,

Thanks for reaching out. Let's address your questions one by one. Regarding your question on the null model, this post is most relevant. Can you check whether your GRM contains negative values? What are the minimum and maximum values of your generated GRM?

Best, Xihao

Jerryliu0024 commented 7 months ago

Hi Xihao,thanks for your reply!! the minimum values of your generated GRM is 0，and maximum one is 0.556215,my GRM doesn't contain negative values.However,when i set the group.var as NULL,the STAARpipeline_Null_Model_GENESIS.r run well and obj_nullmodel_GENESIS.Rdata Successfully generated(but the running time is very short and there are no iterations)，Is it related to my phenotype being a 0/1 variable? maybe it conflicts with group.var

xihaoli commented 7 months ago

Hi @Jerryliu0024,

Yes, if your phenotype is dichotomous (0/1), then the group variable in STAARpipeline_Null_Model.r and group.var variable in STAARpipeline_Null_Model_GENESIS.r should be left empty. Could you please try it again?

Best, Xihao

Jerryliu0024 commented 7 months ago

Hi Xihao,I think this is indeed feasible！However ,I have questions about some parameters. For example,what's the meaning of input array id from batch file (Harvard FAS RC cluster),and when i running STAARpipeline_Gene_Centric_Coding.r,It reminds meError in STAAR(Geno, obj_nullmodel, Anno.Int.PHRED.sub.category, rare_maf_cutoff = rare_maf_cutoff, : genotype is not a matrix!

xihaoli commented 7 months ago

Hi @Jerryliu0024,

Thanks for letting me know. It's great to hear you got the null model fitting step correctly. With your new questions,

(1) The example scripts in the tutorial indicate that you would need to submit a list of array jobs to perform the association analysis (and here we use Harvard FAS RC cluster as an example, which is based on slurm workload manager). Specifically, for STAARpipeline_Gene_Centric_Coding.r, you would need to submit a total of 379 jobs, where an integer between 1 and 379 would be passed to arrayid in each job. We also provide an accompanying shell script to facilitate this batch job submission process.

(2) I can confirm that this message is not an error, but rather indicates that the specific gene category is not a valid variant set (i.e. there is not enough variant annotated as plof for a specific gene in your dataset). You may ignore this message safely.

Hope this helps.

Best, Xihao