Open christophergandrud opened 7 years ago
I can't replicate:
library(Zelig)
z5 <- zls$new()
z5$zelig(Fertility ~ Education, data = swiss)
z5$setrange(Education = 5:10)
z5$sim()
ci.plot(z5)
> z5$setx.out
$range
$range[[1]]
Source: local data frame [1 x 2]
Groups: <by row>
# A tibble: 1 × 2
by mm
* <dbl> <list>
1 1 <dbl [1 × 2]>
$range[[2]]
Source: local data frame [1 x 2]
Groups: <by row>
# A tibble: 1 × 2
by mm
* <dbl> <list>
1 1 <dbl [1 × 2]>
$range[[3]]
Source: local data frame [1 x 2]
Groups: <by row>
# A tibble: 1 × 2
by mm
* <dbl> <list>
1 1 <dbl [1 × 2]>
$range[[4]]
Source: local data frame [1 x 2]
Groups: <by row>
# A tibble: 1 × 2
by mm
* <dbl> <list>
1 1 <dbl [1 × 2]>
$range[[5]]
Source: local data frame [1 x 2]
Groups: <by row>
# A tibble: 1 × 2
by mm
* <dbl> <list>
1 1 <dbl [1 × 2]>
$range[[6]]
Source: local data frame [1 x 2]
Groups: <by row>
# A tibble: 1 × 2
by mm
* <dbl> <list>
1 1 <dbl [1 × 2]>
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Zelig_5.0-13
loaded via a namespace (and not attached):
[1] Rcpp_0.12.9 nloptr_1.0.4 plyr_1.8.4 miscTools_0.6-22 tools_3.3.2
[6] lme4_1.1-13 MatchIt_2.4-21 jsonlite_1.2 tibble_1.2 nlme_3.1-128
[11] lattice_0.20-34 mgcv_1.8-16 Matrix_1.2-7.1 DBI_0.5-1 maxLik_1.3-4
[16] parallel_3.3.2 SparseM_1.74 coda_0.19-1 AER_1.2-5 dplyr_0.5.0
[21] MatrixModels_0.4-1 stats4_3.3.2 lmtest_0.9-34 grid_3.3.2 nnet_7.3-12
[26] R6_2.2.0 geepack_1.2-1 survival_2.40-1 VGAM_1.0-3 foreign_0.8-67
[31] Amelia_1.7.4 minqa_1.2.4 Formula_1.2-1 car_2.1-4 magrittr_1.5
[36] mcmc_0.9-4 MASS_7.3-45 splines_3.3.2 assertthat_0.1 pbkrtest_0.4-6
[41] quantreg_5.29 sandwich_2.3-4 survey_3.31-5 MCMCpack_1.3-8 lazyeval_0.2.0
[46] zoo_1.7-14
Ah, I see that object needs to go into the ci.plot
function.
I was confused because the Zelig5 examples use the graph
method rather than the ci.plot
function.
I'm wondering if if this could be improved from a UX perspective in two ways:
setrange
is not really documented and would be non-obvious for Zelig 4 users. More documentation would be an easy update.
setx
method work like the setx
wrapper (I'll look into this, but if anyone knows off the top of their head it would speed things up). Make graph
informatively fail if used with a setrange
(right now nothing happens).
ci.plot
functionality into graph
or creating a ci.plot
method? Added a Zelig5 setrange
/ci.plot
example to the README: 0b719a9bc4fd3fda85a5537091323b117806e909
The graph
method now can tell if a range of x have been provided and call ci.plot
: 77ebffc7bd0f76e297527c6e1514bb808d1135d6
setrange
now.Here are my thoughts about this. They may not be coherent, but perhaps useful nonetheless.
There is as fundamental tension between setting a range of x values and providing first differences. If we use setx
and setx1
then first differences make sense. If we specify a range of x values than first differences really don't make sense. Providing first differences is one of the "selling points" of Zelig, so setx
and setx1
are emphasized.
Resolving this tension requires recognizing that we can't have our cake and eat it too. There are two different things that we want to do, and we need a clear distinction between them. I would argue that the only advantage of setx
/setx1
over setxrange
is that they allow for first differences. In all other cases setxrange
is the way to go. If we agree that this is true (it may not be, these are just my ramblings) then I propose that we a) get rid of setxrange
and setx1
, b) use setx
to mean what setxrange
currently means, and c) write a new function firstdiff
that does what setx/setx1
currently does.
I have more thoughts along these lines, but I'll stop here to see how this line of thought is received before proposing even more radical changes.
@izahn that is a pretty intriguing idea. Making the substantive uses of setx
and setrange
functionality more explicit could definitely help users.
I guess one big thing to think through is backwards compatibility and minimising software complexity.
Two points: [1] About method and wrapper calls.
setrange() isn't necessary when using the wrappers, and this is one place where the wrapper and method behaviour could be brought closer together. Here's a working code example using just setx wrapper:
[2] About Ista's point.
The range argument is most commonly used for showing predicted or expected values over a range of some variable of interest. The concept for this goes back all the way to figure 1 of King Tomz Wittenburg, and in my mind is pretty central to what Zelig should do. It didn't always work in Zelig 3.5 and when I first saw the Zelig 4 code had been removed.
While I agree that first differences are a great way to interpret models (especially nonlinear ones), predicted and expected values have their place. Without some adaptation of the range approach, one is left with tables of predicted and expected values at very static points in the space of the covariates. If I have a paper with a number of variables, but one of key interest, I'm probably going to tell illustrate a story of how things change as I move across my key variable. Knowing how to build such a graph yourself has always been a key assignment of any grad methods class I've taught, but it's a lot of leg work. Showing it can be done easily in Zelig is normally a moment when it clicks that Zelig can be useful for abstracting away a lot of the effort of model interpretation.
Anyway, in the way they have evolved in Zelig, setx has always been about giving predicted and expected values, and setx1 (which is a strange name, but seems stuck with us) has been the additional function if one also wanted first differences.
(Final point is, that it is also possible to use the ci.plot for creating first differences across two sets of ranges. What that means is very specific to how it is constructed, but is potentially a useful illustrative technique also.)
Completely agree making setx
method work like the wrapper. I'll work on implementing this.
I also find the setx1
terminology non-obvious. I guess it's been around for awhile, but maybe it is worth thinking about changing this.
The workflow for setx
with interactions also needs improvement.
For example, this runs fine:
states <- as.data.frame(state.x77)
z <- zelig(Murder ~ Income * Population, data = states, model = 'ls')
s1 <- setx(z, Population = 1000:2000, Income = 3098)
s2 <- setx(z, Population = 1000:2000, Income = 6315)
z <- sim(z, x = s1, x1 = s2)
z$graph()
But this doesn't:
states <- as.data.frame(state.x77)
z <- zelig(Murder ~ Income + Population, data = states, model = 'ls')
z <- z$setrange(Population = 1000:2000, Income = 3098)
z <- z$setrange1(Population = 1000:2000, Income = 6315)
## Error: attempt to apply non-function
More details:
states <- as.data.frame(state.x77)
z <- zelig(Murder ~ Income + Population, data = states, model = 'ls')
z <- z$setrange(Population = 1000:2000, Income = 3098)
summary(z)
## Length Class Mode
## 0 NULL NULL
Related to: https://github.com/IQSS/Zelig/issues/75
Before I go digging through the code, I was wondering if I'm missing something obvious.
I would like to
setx
for a range of values of a variable. This works fine using the Zelig4 wrappers, e.g.:I've tried to do something similar with:
but get the error:
and with
This doesn't return any errors, but the
summary
is:What am I missing here?