pitakakariki / simr

Power Analysis of Generalised Linear Mixed Models by Simulation

Errors in powerSim when passing a fit object whose formula was not explicit (i.e., with as.formula) #233

Open ericopolo opened 2 years ago

ericopolo commented 2 years ago

The title pretty much summarizes the problem, so I'm going to give a reproducible example below.

library(simr)  # attaches powerSim() and fixed()

data("iris")
myFormula <- as.formula("Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width")

# Same model, fitted once with an inline formula and once with a formula stored in a variable
iris.glm.explicit <- glm(formula = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
iris.glm.asFormula <- glm(formula = myFormula, data = iris)

# No errors:
iris.glm.explicit.powerSim <- simr::powerSim(fit = iris.glm.explicit, test = fixed(xname = "Sepal.Width", method = "t"))

# ERRORS:
iris.glm.asFormula.powerSim <- simr::powerSim(fit = iris.glm.asFormula, test = fixed(xname = "Sepal.Width", method = "t"))

# Print the errors:
head(iris.glm.asFormula.powerSim$errors)

Please ignore the fact that these analyses make no sense. I'm having the very same problem with my own data and used "iris" only to make the example reproducible.

Thanks!

pitakakariki commented 2 years ago

Thanks for reporting this and for finding a workaround.

Are you able to complete your analyses using explicit formulas?

pitakakariki commented 2 years ago

Note to self: bug is in doFit.default. Need to create a complete new formula to insert into newCall, since newCall[[formula]] will contain a reference rather than an actual formula that can be edited.
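The difference described in the note above can be seen without simr by inspecting the stored call of each fit. This is only a minimal sketch using base R and the iris example from this thread; the object names `fit.explicit` and `fit.asFormula` are made up for illustration, and it does not reproduce the simr internals.

```r
data("iris")
myFormula <- as.formula("Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width")

fit.explicit  <- glm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
fit.asFormula <- glm(myFormula, data = iris)

# With an inline formula, the call stores the formula expression itself
class(fit.explicit$call$formula)

# With a variable, the call stores only the symbol `myFormula`,
# i.e. a reference to the formula rather than something editable
class(fit.asFormula$call$formula)
is.name(fit.asFormula$call$formula)

# The fitted models themselves are the same
all.equal(coef(fit.explicit), coef(fit.asFormula))
```

The resolved formula is still available through `formula(fit.asFormula)`, so only the call differs between the two objects.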

ericopolo commented 2 years ago

> Thanks for reporting this and for finding a workaround.
>
> Are you able to complete your analyses using explicit formulas?

Hi, you're welcome! Yes, with explicit formulas everything works fine, but I have to do it "manually". I'm running multiple different tests on the same data.frame, and I was planning to generate the formulas dynamically: looping over different columns (predictors) and passing their names to glm() via as.formula. Since that didn't work, I'm now building "sub" data.frames inside my loop, each containing only the variables used in that iteration, and renaming them to fixed names that I pass to glm explicitly.

I suspected it had something to do with the glm object's "call", because I noticed it was the only difference between glm objects generated with and without explicit formulas. I should have added that thought to the topic, but it just slipped my mind. Good thing you already figured it out!

pitakakariki commented 2 years ago

You might be able to do something similar by dynamically generating the entire call, e.g.

iris.glm <- eval(parse(text="glm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)"))
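For the looping use case described earlier in the thread, the same idea might be extended roughly as follows. This is a sketch under assumptions: the `predictors` vector and the one-predictor models are hypothetical, chosen only to show the pattern, and whether `powerSim` then runs cleanly on each fit is inferred from the report above that explicit formulas work, not verified here.

```r
data("iris")

# Hypothetical set of predictors to loop over
predictors <- c("Sepal.Width", "Petal.Length", "Petal.Width")

fits <- lapply(predictors, function(p) {
  # Build the whole glm() call as text so the formula is written out inline
  call_text <- sprintf("glm(Sepal.Length ~ %s, data = iris)", p)
  eval(parse(text = call_text))
})
names(fits) <- predictors

# Each stored call now contains an explicit formula, not a variable name
fits[["Petal.Length"]]$call
```

Because the predictor name is pasted into the call text, each fit's stored call looks as if the formula had been typed by hand, which avoids the renamed "sub" data.frames.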

ericopolo commented 2 years ago

> You might be able to do something similar by dynamically generating the entire call, e.g.
>
> iris.glm <- eval(parse(text="glm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)"))

Great idea! Thank you very much! =D