Closed tsbaguley closed 3 years ago
Interesting. Can you provide code that produces this error, so we can investigate?
This is the code I was using. It is a made-up example but typical of what someone might want to do in a factorial survey/vignette experiment. The aim is to show how to select a useable number of trials to estimate main effects and 2way interactions efficiently. The first example runs but I think D is implausible (close to 100) and the second example generated D = Inf. We did have some issues with server apparently so its possible that's causing a problem. I'm going to try re-running now with more repeats.
large_universe_example = expand.grid(a = as.factor(c("male", "female", "nonbinary")), b = as.factor(c("student", "professor")), c = as.factor(c("single", "married", "divorced")), d = as.factor(c("18-25", "26-30", "31-35", "36-40", "41-45", "46-50", "51-55", "56-60", "61-65", "65+")), e = as.factor(c("unemployed", "employed", "retired")), f = as.factor(c("banter", "bullying", "harassment")), g = as.factor(c("morning", "afternoon", "evening", "night")), h = as.factor(c("High", "Med", "Low")) )
head(large_universe_example) tail(large_universe_example) dim(large_universe_example)
design_lue <- gen_design(candidateset = large_universe_example, model = ~., trials = 1200, parallel = FALSE, repeats = 20) design_lue
get_attribute(design_lue, "alias.matrix") get_attribute(design_lue, "correlation.matrix") get_optimality(design_lue)
optFederovC(design_lue, nTrials=1200)
design_lue2 <- gen_design(candidateset = large_universe_example, model = ~(a+b+c+d+e+f+g+h)^2, trials = 1200, parallel = FALSE, repeats=20) design_lue2
get_attribute(design_lue2, "alias.matrix") get_attribute(design_lue2, "correlation.matrix") get_optimality(design_lue2)
I can replicate. For the first model, I think the high D efficiency is not too surprising - you have 1200 runs to estimate a 24 term model. We shouldn't be getting Inf in the second case, though, I'll look more into that.
Yes - that makes sense. Thank you!
@tylermorganwall I just initiated a pull request to fix the infinity, just involved moving exp() from DOptimalityLog to the caller.
We might have another problem, though. Continuing with this example:
> eval_design(design_lue2)
Error in eval_design(design_lue2) :
Wrong number of anticipated coefficients
Sorry, I should have clarified, I'm getting that error because design_lue2 is a nonsensical design - it only has a single design point repeated 1200 times. It's strange because if I make a 400 run design for the same model, it works ok.
That makes sense - with small runs I can pass the design to AlgDesign and get the same D efficiency but something weird happens in the couple of large runs I've tried. It could matter because these sorts of designs do come in factorial survey experiments.
I think I fixed the issue now in my pull request. The issue is that genOptimalDesign() was getting Inf when it calculated the D-optimality of the initial (random) design, and then the search algorithm couldn't proceed.
Thank you - I wanted to include an example in a manuscript I'm preparing for submission. When I next time to work on the MS I'll work up a short example and see if it works. I wanted to use skpr because it is more user-friendly than other alternatives.
This was indeed fixed by @GeorgeMKhoury's commit, although this particular design will take a while to compute since it's such a massive model.
Yes - I'll probably scale down my example in the ms I'm preparing.
I'm using get_optimality to try and find the D-efficiency of a fraction from a large factorial universe. It is returning "Inf" and I'm not sure how to interpret that. I've increased repeats but that hasn't shifted anything.