weizhouUMICH / SAIGE

GNU Lesser General Public License v3.0

possible issue with categorical variables used as covariates? #435

Open brettva opened 11 months ago

brettva commented 11 months ago

I am trying to run the following step1 using SAIGE_1.2.0:

fitNULLGLMM(plinkFile = plinkFile,
            phenoFile = phenoFile,
            phenoCol = phenotype,
            traitType = traitType,
            invNormalize = invNormalize,
            covarColList = covarCols,
            sampleIDColinphenoFile = 'IID',
            nThreads = as.integer(cpus),
            outputPrefix = results_dir,
            IsOverwriteVarianceRatioFile = TRUE)

where covarCols is

c("array","age","PC1_EUR","PC2_EUR","PC3_EUR","PC4_EUR","SNPSEX")

The array column is a categorical variable with the string values "MGI", "MGI_1_1", "MGI_1_2", "MGI_1_3".

When running step 1, I can see that it is expanded into (number of levels - 1) indicator columns, arrayMGI_1_1, arrayMGI_1_2 and arrayMGI_1_3, but I get this error:

Error in model.frame.default(object, data, xlev = xlev) :
  invalid type (closure) for variable 'array'

This seems to be specific to the categorical variable: if I recode the array column numerically, the issue goes away. I have also tried recoding other columns as categorical, and the same error appears for them. Am I doing something silly on my end, or is this a bug? Thanks!
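
For reference, here is a minimal base-R sketch (not SAIGE code) of one way this exact message can arise. It assumes, without having confirmed it in the SAIGE source, that the categorical column is replaced by its indicator columns while a formula still refers to the original name. In that case R cannot find `array` in the data and falls back to the base function `array()`, which is a closure. The toy data frame `pheno` and its values are made up for illustration.

```r
# Toy data; column names mirror the issue, values are invented.
pheno <- data.frame(
  y     = c(1.2, 0.7, 1.9, 1.1, 0.4, 1.6),
  age   = c(40, 52, 61, 35, 47, 58),
  array = c("MGI", "MGI_1_1", "MGI_1_2", "MGI_1_3", "MGI", "MGI_1_1"),
  stringsAsFactors = FALSE
)

# Expand the categorical column into (levels - 1) indicator columns,
# dropping the intercept column that model.matrix() adds.
dummies  <- model.matrix(~ array, data = pheno)[, -1, drop = FALSE]
expanded <- cbind(pheno[, c("y", "age")], dummies)

# The formula still names 'array', which is no longer a column of
# 'expanded', so R resolves it to the base function array() and
# model.frame() fails.
err <- tryCatch(
  model.frame(y ~ age + array, data = expanded),
  error = function(e) conditionMessage(e)
)
print(err)

# Possible workaround (an assumption, untested against SAIGE): list the
# numeric indicator columns as covariates instead of the original column.
covarCols <- c("age", colnames(dummies))
print(covarCols)
```

If this is what happens inside step 1, writing the indicator columns into the phenotype file and listing them in covarCols would be consistent with the observation that a numeric recoding avoids the error.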