Closed daniel-hui closed 1 year ago
Sorry, my last post may have been a little premature: I am now able to run STAARpipeline_Null_Model_GENESIS.r without error (using only individuals with both genetic and phenotype data for the sGRM, and after changing the ID column in the phenotype file to "sample.id"). However, if you are aware of any workaround that avoids making a separate sGRM for each set of individuals, it would be convenient and appreciated. Thanks.
Hi Daniel,
No problem. I have two general comments for you to consider:
The common strategy is to generate a sparse GRM (sGRM) for all subjects in the study. Then, when you fit the null model for different phenotypes, it is OK to subset the sGRM to phenotype-specific sub-matrices. This way, you only need to generate the sGRM once.
You mentioned that you could run STAARpipeline_Null_Model_GENESIS.r without error. You may now consider running STAARpipeline_Null_Model.r for your null model fitting, as these two scripts share the same statistical framework and you don't need to convert the null model object using genesis2staar_nullmodel.R.
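The subsetting described in the first comment can be sketched in R as follows. This is a minimal illustration, not code from the pipeline scripts: the sample IDs and matrix values are made up, and it assumes the sGRM is a sparse matrix from the Matrix package whose row and column names are the sample IDs. The column names IID and BMI_IRNT are taken from the call quoted in this thread.

```r
library(Matrix)

# Toy sparse GRM for all study subjects (hypothetical IDs and values).
ids <- c("s1", "s2", "s3", "s4")
sgrm <- Matrix(c(1.0, 0.5, 0.0, 0.0,
                 0.5, 1.0, 0.0, 0.0,
                 0.0, 0.0, 1.0, 0.0,
                 0.0, 0.0, 0.0, 1.0),
               nrow = 4, ncol = 4,
               dimnames = list(ids, ids), sparse = TRUE)

# Toy phenotype file: s2 has no phenotype for this trait.
phenotype <- data.frame(IID = c("s4", "s1", "s3"),
                        BMI_IRNT = c(1.2, 0.1, -0.3),
                        stringsAsFactors = FALSE)

# Keep only samples present in both, in the sGRM's order.
keep <- intersect(rownames(sgrm), phenotype$IID)

# Phenotype-specific sub-matrix of the sGRM.
sgrm_sub <- sgrm[keep, keep]

# Reorder the phenotype rows so they line up with the sub-matrix.
phenotype_sub <- phenotype[match(keep, phenotype$IID), ]
```

Repeating the last three steps for each phenotype reuses the single study-wide sGRM, so only the intersection of IDs changes from one trait to the next.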
Best, Xihao
Hi Xihao,
Thanks for helping me out earlier. I had a couple of questions/issues when running "Step 1: Fit STAAR null model":
When I run STAARpipeline_Null_Model.r, I get an error. The error seems to be in the line:
obj_nullmodel <- fit_nullmodel(BMI_IRNT~Age+AgeSq+Subject_Information.Sex..str+PC1+PC2,data=phenotype,kins=sgrm,use_sparse=TRUE,kins_cutoff=0.022,id="IID",family=gaussian(link="identity"),verbose=TRUE)
specifically in the argument id="IID". However, IID (without quotes) is the name of the column that holds the sample IDs. I also tried running the script without quotes around IID, and that doesn't work either. Do you know what the issue might be?
Running STAARpipeline_Null_Model_GENESIS.r also gave an error. I saw in another issue that this was caused by the IDs being converted to something like "ID_ID". I tried the fix from that issue but still had the problem. However, I do have samples in the sparse GRM that are not in the phenotype file: we will probably run ~20 phenotypes, each with a different number of individuals with available phenotypes. I suppose it would be preferable to make just one sparse GRM for all phenotypes, but it may not be much more effort to make a new sparse GRM for each phenotype. Do you have any recommendation here? Thanks.
Daniel