Closed christophergandrud closed 7 years ago
The main issue seems to come from here: https://github.com/IQSS/Zelig/blob/master/R/plots.R#L615-L619
if (is.null(var)) {
each.var <- apply(xmatrix,2,sd)
flag <- each.var > 0
min.var <- min(each.var[flag])
var.seq <- 1:ncol(xmatrix)
position <- var.seq[each.var==min.var]
}
Basically, the x-axis variable is chosen as being the one with fitted values with the lowest standard deviation.
Wouldn't it make more sense to choose the x-axis variable as the one with the number of unique fitted values equaling the number of scenarios? E.g. in the example above, Petal.Length
has 10 set values. This is also the number of scenarios.
Another odd thing I just noticed. Why does the fitted values for the factor levels change across scenarios? E.g.:
## setx_value Petal.Length Speciesversicolor Speciesvirginica qi_ci_min qi_ci_median qi_ci_max
## 1 x 1 0 0 0.07927395 0.1403430 0.2013641
## 2 x 2 1 0 0.64081014 0.8089075 0.9724328
## 3 x 3 1 0 0.93372904 1.0362871 1.1392268
## 4 x 4 0 0 0.65836258 0.8286814 1.0097060
## 5 x 5 0 1 1.83924381 1.9002084 1.9617810
## 6 x 6 1 0 1.59865187 1.7239358 1.8563640
## 7 x 7 0 0 1.14220600 1.5122383 1.9106976
## 8 x 8 1 0 1.93809134 2.1804360 2.4438411
## 9 x 9 0 0 1.46916870 1.9698641 2.5084195
## 10 x 10 0 0 1.63340341 2.1990871 2.8072804
The latter issue seems to be because Mode randomly chooses a value if there is no mode: https://github.com/IQSS/Zelig/blob/master/R/utils.R#L12-L13
Should we use a different heuristic? Maybe just the first value?
Related to this: maybe setx
should provide some messages about how it is choosing fitted values when they aren't provided by users. I know you can find this out with summary
, but an automatic message might be more intuitive.
If everyone is ok with these changes, I'll implement them today.
Issues resolved in 861482dfd66c55b2dc06efc1a90f19fc91dc5e97
E.g.:
library(Zelig)
z.out <- zelig(Petal.Width ~ Petal.Length + Species, data = iris,
model = "ls", cite = FALSE)
x.out1 <- setx(z.out, Petal.Length = 1:10)
summary(x.out1)
## range:
## (Intercept) Petal.Length Speciesversicolor Speciesvirginica
## 1 1 1 0 1
## 2 1 2 0 1
## 3 1 3 0 1
## 4 1 4 0 1
## 5 1 5 0 1
## 6 1 6 0 1
## 7 1 7 0 1
## 8 1 8 0 1
## 9 1 9 0 1
## 10 1 10 0 1
sims1 <- sim(z.out, x.out1)
plot(sims1)
Produces this plot:
The simulations are done correctly for the range of
Petal.Length
as we can see from: