saigegit / SAIGE

Development for SAIGE and SAIGE-GENE(+)
GNU General Public License v3.0

Parsing errors when covariate names include symbols (gram+, gram-) #112

Open evatosco opened 1 year ago

evatosco commented 1 year ago

Hello!

After a few weeks of running SAIGE-GENE+, I've encountered a detail that might be of interest. I had two binary covariates called "Gram+" and "Gram-". When I tried to run SAIGE adjusting for either of those covariates, a parsing error came up:

[...]
chromosomeStartIndexVec:  0 118 416 540 563 803 843 865 921 NA NA 1134 1447 1637 1677 NA 1717 1890 NA NA 2072 2171 
chromosomeEndIndexVec:  117 415 539 562 802 842 864 920 1133 NA NA 1446 1636 1676 1716 NA 1889 2071 NA NA 2170 2288 
827  samples have genotypes
Gram+ are categorical covariates
formula is  ARDS~Gram+ 
Error in parse(text = x, keep.source = FALSE) : 
  <text>:2:0: unexpected end of input
1: ARDS~Gram+
   ^
Calls: fitNULLGLMM ... formula -> formula.character -> formula -> eval -> parse
Execution halted
[...]

Even though it seemed obvious to me that the main problem was the + and - symbols being read as operators in the formula (R treats the trailing + in ARDS~Gram+ as an operator that is missing its right-hand term), I thought I would mention it here so that anyone who runs into the same situation knows that the workaround is to rename those variables everywhere in the input files and the covar file. I renamed them to Gram_pos and Gram_neg, and that solved the problem.
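For anyone who needs to apply the same renaming to many columns, here is a minimal sketch of how the header of a covariate file could be rewritten before running SAIGE. It is not part of SAIGE: the function names `safe_name` and `rename_header`, the tab separator, and the example column names other than Gram+ and Gram- are assumptions made for illustration.

```python
import re

# Hypothetical suffixes for a trailing "+" or "-"; adjust to taste.
TRAILING_SYMBOL = {"+": "_pos", "-": "_neg"}


def safe_name(name: str) -> str:
    """Return a version of `name` that R can parse as a plain variable name."""
    if name and name[-1] in TRAILING_SYMBOL:
        # "Gram+" -> "Gram_pos", "Gram-" -> "Gram_neg"
        name = name[:-1] + TRAILING_SYMBOL[name[-1]]
    # Any other character that is not a letter, digit, or underscore
    # becomes an underscore.
    name = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Make sure the name starts with a letter.
    if not re.match(r"[A-Za-z]", name):
        name = "X" + name
    return name


def rename_header(header_line: str, sep: str = "\t") -> str:
    """Rewrite every column name in a header line (assumed tab-separated)."""
    columns = header_line.rstrip("\n").split(sep)
    renamed = [safe_name(c) for c in columns]
    if len(set(renamed)) != len(renamed):
        raise ValueError("renaming produced duplicate column names")
    return sep.join(renamed)


if __name__ == "__main__":
    # Assumed example header; only Gram+ and Gram- come from this issue.
    print(rename_header("IID\tARDS\tGram+\tGram-"))
```

The same new names then have to be used wherever the covariates are listed when calling SAIGE, so that the file and the command agree.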

If the developers could eventually delimit the variable names more clearly, so that the R interpreter reads them as literal names rather than as operators, that would be cool too. But I don't think it is a priority right now.
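To illustrate what "delimiting" could look like: in R, a name wrapped in backticks is read literally, so a formula such as ARDS ~ `Gram+` parses without error. The sketch below only builds such a formula string and is written in Python for illustration. It is not how SAIGE constructs its formula, and `quote_r_name` and `build_formula` are hypothetical helpers. It also assumes that the column in the data keeps its original name.

```python
import re

# Names R accepts without quoting: start with a letter, or with a dot that is
# not followed by a digit, then letters, digits, dots, or underscores.
# (R's reserved words are ignored here for brevity.)
_SYNTACTIC = re.compile(r"(?:[A-Za-z]|\.(?![0-9]))[A-Za-z0-9._]*")


def quote_r_name(name: str) -> str:
    """Wrap `name` in backticks unless R would already read it as a name."""
    if not name:
        raise ValueError("empty variable name")
    if _SYNTACTIC.fullmatch(name):
        return name
    if "`" in name or "\\" in name:
        # Escaping these is out of scope for this sketch.
        raise ValueError(f"cannot quote {name!r}")
    return "`" + name + "`"


def build_formula(response: str, covariates: list[str]) -> str:
    """Build an R formula string with every term safely delimited."""
    rhs = " + ".join(quote_r_name(c) for c in covariates) or "1"
    return f"{quote_r_name(response)} ~ {rhs}"


if __name__ == "__main__":
    # "Age" is an assumed extra covariate, not one from this issue.
    print(build_formula("ARDS", ["Gram+", "Age"]))
```

With this kind of quoting the + inside the name is no longer seen as an operator, while the + between terms still is.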

Hope it helps! Eva