Open mattansb opened 1 week ago
Are there any plans for automatic df extraction from the model?
Isn't this sufficient?
avg_predictions(A, df = insight::get_df(A))
Unfortunately, the joint test features are currently very limited. I will look into this in the future, and probably delegate some of the computation to the ghlt
package, which already works well with marginaleffects
.
One tip: Whenever possible, avoid the hypotheses()
function and use the hypothesis
argument directly in the main function call.
For your interaction, I would say that this requires adopting a radically different point of view.
IMHO, the only people who can reasonably be expected to make sense of your code are extremely advanced emmeans
experts.
In contrast, the hyp
function I show doesn’t require specifying a weird w
contrast matrix, understanding the specific concept of “interaction” at play, or what “pairwise” means in this context.
The code below is obviously more verbose than emmeans
, but I would argue that it is also more transparent. Moreover, specifying contrasts by reference to the actual levels feels safer than building a contrast matrix that assumes a certain order in the data.
I guess what I’m saying is that emmeans
experts like you probably won’t see much value here. And honestly, they should probably keep using emmeans
if it works for them. But for some new users who want to make complex comparisons, I think it may easier to specify contrasts using a language —simple dplyr
or base R
transformations—that they already know.
Of course, YMMV!
library(afex)
library(emmeans)
library(marginaleffects)
data(md_12.1)
A <- aov_ez("id", "rt", md_12.1, within = c("angle", "noise"))
# emmeans
ems <- emmeans(A, ~ angle + noise)
w <- data.frame(
"None vs Some" = c(-2, 1, 1) / 2,
"High vs Med" = c(0, 1, -1)
)
contrast(ems, interaction = list(angle = w, noise = "pairwise"))
> angle_custom noise_pairwise estimate SE df t.ratio p.value
> None.vs.Some absent - present -162 16.2 9 -9.970 <.0001
> High.vs.Med absent - present 84 24.0 9 3.500 0.0067
# marginaleffects
hyp <- function(x) {
dplyr::reframe(x,
term = c("None vs Some", "High vs Med"),
estimate = c(estimate[angle == "X0"] - mean(estimate[angle != "X0"]),
estimate[angle == "X8"] - estimate[angle == "X4"]),
.by = "noise")[, 2:3] |>
dplyr::summarize(estimate = diff(estimate), .by = "term")
}
predictions(A,
newdata = datagrid(angle = unique, noise = unique),
hypothesis = hyp)
>
> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
> None vs Some -162 16.2 -9.97 <0.001 75.4 -194 -130
> High vs Med 84 24.0 3.50 <0.001 11.1 37 131
>
> Type: response
> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Isn't this sufficient?
avg_predictions(A, df = insight::get_df(A))
Yes, for fixed effects models. rmANOVA of mixed effect models require something other than the residual df.
Whenever possible, avoid the
hypotheses()
function and use the hypothesis argument directly in the main function call.
I didn't see an option for joint test outside hypotheses()
Re:
extremely advanced
emmeans
experts
Thanks 😊
Also thanks for the example. I think I might be able to build a function generating function that can bridge the emmeans way and the marginal effects way to some extent. Would that be something you would consider adding?
I think I might be able to build a function generating function that can bridge the emmeans way and the marginal effects way to some extent. Would that be something you would consider adding?
That's really intriguing!
Yes, in principle, that's something that I'd consider. But fair warning: I'm annoyingly picky about user interface. One thing I don't want to do is replicate the emmeans
user interface. And another is that I want to keep the "spirit" of the user-interface consistent.
What does "spirit" mean? Not sure...
I guess what I'm saying is that it would be important to discuss the ultimate user interface before writing any code, because there's a likelihood I won't merge if we can't find an interface that makes sense and is transparent.
I'll start super ambitious and we can work our way back.
Broadly defining a contrast as any comparisons between groups of estimates, possibly conditionally. So we would need a function that can define these groups that are to be compared, how to compare them, and what other variable to condition on.
library(marginaleffects)
data("mtcars")
mtcars$cyl <- factor(mtcars$cyl)
mod <- lm(mpg ~ cyl * am, data = mtcars)
contr <- function(..., comparison = "difference", by = NULL, transform = identity, aggregate = mean) {
# magic goes here
}
# generates a function that can be fed to `hypothesis=`
hyp <- contr(
c(4, 8) %vs% 6,
4 %vs% 8,
comparison = "ratio",
by = "am"
)
avg_predictions(mod, variables = c("cyl", "am"),
hypothesis = hyp)
The %vs%
operand is a short hand which can be cool to implement instead of list(a,b)
to have a %vs% b
.
This definition excludes other types of contrasts, such as polynomial contrasts. These can be defined by passing a vector/matrix of contrast weights (in which case the comparison
argument is ignored).
Interesting.
I think it’s useful to spell something out explicitly. In your examples, there are two very different operations:
Philosophically, marginaleffects
wants to keep these two things separate:
by
argumenthypothesis
argumentConsider a simple example. First, we fit a model and make predictions for each row of the original data (no aggregation). Since mtcars
has 32 rows, we get 32 predictions:
library(marginaleffects)
library(tibble)
mod <- lm(mpg ~ factor(cyl) * am, data = mtcars)
predictions(mod) |> nrow()
> [1] 32
Let’s aggregate/average unit-level predictions for each cyl
value:
avg_predictions(mod, by = "cyl")
>
> cyl Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
> 6 19.7 1.146 17.2 <0.001 218.5 17.5 22.0
> 4 26.7 0.914 29.2 <0.001 618.7 24.9 28.5
> 8 15.1 0.810 18.6 <0.001 255.0 13.5 16.7
>
> Type: response
> Columns: cyl, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Now, say you want to “collapse” the 4/6 and 8 cyl
groups into two buckets. Fundamentally, this is just another aggregation/averaging operation, which we can do with the by
argument:
bydf = tribble(
~ cyl, ~ by,
4, "small",
6, "small",
8, "large"
)
avg_predictions(mod, by = bydf)
>
> By Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
> small 24.0 0.715 33.5 <0.001 816.9 22.6 25.4
> large 15.1 0.810 18.6 <0.001 255.0 13.5 16.7
>
> Type: response
> Columns: estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, by
As noted above, the philosophy is to do contrasts using the hypothesis
argument. So one idiomatic way to do what you wanted is:
avg_predictions(mod, by = bydf, hypothesis = ~ sequential)
>
> Hypothesis Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
> (large) - (small) -8.87 1.08 -8.21 <0.001 52.0 -11 -6.75
>
> Type: response
> Columns: hypothesis, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Finally, it is true that hypothesis
can do everything at once. It’s a neat part of the design that this argument accepts arbitrary functions.
But I think it’s useful to think about the operations separately, because any implementation of a new contrasting interface that goes solely through hypothesis
will still have to combine the two steps: “aggregation” and “contrasting”. And I think that one of the big strengths of marginaleffects
is that it clearly separates these conceptually distinct operations. This makes the user interface very composable and flexible.
What I want to avoid is having two completely parallel way to approach the same two-step problem.
This is interesting - for a single contrast that really gets us 90% there - we would just need a way to add interaction contrasts and conditional contrasts.
But - contrasts often appear as families of contrasts. E.g., pairwise, consecutive, orthogonal... Which the by=bydf
method (as far as I can tell) does not allow, right?
Which the
by=bydf
method (as far as I can tell) does not allow, right?
This feature would be trivial to add. For example, this one-liner PR passes almost all existing tests, and allows this:
https://github.com/vincentarelbundock/marginaleffects/pull/1235
> mod <- lm(mpg ~ factor(cyl) * am, data = mtcars)
> bydf = tribble(
+ ~ cyl, ~ by,
+ 4, "a",
+ 6, "a",
+ 8, "b",
+ 4, "x",
+ 6, "y",
+ 8, "y"
+ )
> avg_predictions(mod, by = bydf)
By Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
a 24.0 0.715 33.5 <0.001 816.9 22.6 25.4
x 26.7 0.914 29.2 <0.001 618.7 24.9 28.5
y 16.6 0.662 25.2 <0.001 461.6 15.4 17.9
b 15.1 0.810 18.6 <0.001 255.0 13.5 16.7
Type: response
Columns: estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, by
Yes, but that would require typing out that bydf
for each contrast in the family. And would be much more complex for interaction (crossed) contrasts that have multiple family-by-family weighting.
The docs do say that more complex aggregation -
...you can use the FUN argument of the
hypotheses()
function.
So I'm not sure that allowing for this overlap is inconsistent.
Yes, but that would require typing out that
bydf
for each contrast in the family. And would be much more complex for interaction (crossed) contrasts that have multiple family-by-family weighting.
I'd certainly be interested in more ergonomic ways to specify the by
argument.
BTW, I’m not sure if you were aware that we can use matrices in the hypothesis
argument directly.
I could easily imagine someone creating a utility function to leverage that. Of course, one may need access to the data frame itself to know the correct indices, so perhaps you’re right and that has to happen inside the custom hypothesis
function.
library(afex)
library(emmeans)
library(marginaleffects)
data(md_12.1)
A <- aov_ez("id", "rt", md_12.1, within = c("angle", "noise"))
h = matrix(c(
-1,1,.5,-.5,.5,-.5,
-1,2,.1,-.5,.8,-.5),
ncol = 2
)
colnames(h) <- c("None vs. Some", "Something else")
predictions(A,
hyp = h,
newdata = datagrid(angle = unique, noise = unique)
)
>
> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
> None vs. Some -162 16.2 -9.97 <0.001 75.4 -194 -130
> Something else 284 38.1 7.47 <0.001 43.5 210 359
>
> Type: response
> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Here are some more thoughts:
comparisons(variables = list(term = <thing>))
? This would essentially solve both the conditional contrasts (by setting by=
) and the interaction contrasts (by setting cross=TRUE
).Really these is some overlap between the comparisons()
function and some of the functionality offered by hypothesis=
, maybe aligning those two can resolve this to some extent?
library(marginaleffects)
library(emmeans)
#> Welcome to emmeans.
#> Caution: You lose important information if you filter this package's results.
#> See '? untidy'
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am)
mod <- lm(mpg ~ cyl * am, mtcars)
(p <- avg_predictions(mod, variables = c("cyl", "am")))
#>
#> am cyl Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> 1 6 20.6 1.751 11.75 <0.001 103.4 17.1 24.0
#> 1 4 28.1 1.072 26.19 <0.001 499.7 26.0 30.2
#> 1 8 15.4 2.144 7.18 <0.001 40.4 11.2 19.6
#> 0 6 19.1 1.516 12.61 <0.001 118.8 16.2 22.1
#> 0 4 22.9 1.751 13.08 <0.001 127.5 19.5 26.3
#> 0 8 15.1 0.875 17.19 <0.001 217.7 13.3 16.8
#>
#> Type: response
#> Columns: cyl, am, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
emmobj(coef(p), vcov(p), lapply(insight::get_datagrid(p), unique))
#> cyl am estimate SE df asymp.LCL asymp.UCL
#> 6 1 20.6 1.751 NA 17.1 24.0
#> 4 1 28.1 1.072 NA 26.0 30.2
#> 8 1 15.4 2.144 NA 11.2 19.6
#> 6 0 19.1 1.516 NA 16.2 22.1
#> 4 0 22.9 1.751 NA 19.5 26.3
#> 8 0 15.1 0.875 NA 13.3 16.8
#>
#> Confidence level used: 0.95
(This can be expanded to also support estimates created from Bayesian models.)
This could be a feature for power users. You would then have a few tiers of contrast complexity:
comparisons()
hypothesis=<matrix>
as.emmGrid()
but limited to differences contrasts
hypothesis=<function>
that really can do whatever - any type of contrast (ratio, lift, ...) with what ever transformation. Really this is the most advanced option, for super power users, as it allows any arbitrary anything.Can we support a function / matrix passed in comparisons(variables = list(term =
))?
That’s an interesting idea, but I don't think it is possible to implement without fundamental (and enormous) changes to the current code architecture. My guess is that your mental model for comparisons()
is something like:
predictions()
But behind the scenes, that's not at all what is going on. What comparisons()
does is create two entirely different data frames, with different values of the focal variable. Then, it calls predict()
on both data frames, and compares the two:
library(marginaleffects)
mod <- lm(mpg ~ hp + am, mtcars)
d1 <- data.frame(hp = 100, am = 0)
d2 <- data.frame(hp = 101, am = 1)
predict(mod, d2) - predict(mod, d1)
> 1
> 5.218198
comparisons(mod,
variables = c("hp", "am"),
cross = TRUE,
newdata = datagrid(hp = 100, am = 0))
>
> C: am C: hp hp am Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
> 1 - 0 +1 100 0 5.22 1.08 4.83 <0.001 19.4 3.1 7.34
>
> Type: response
> Columns: rowid, term, contrast_am, contrast_hp, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, hp, am, predicted_lo, predicted_hi, predicted, mpg
We are always doing a “counterfactual comparison” between two specific values for each focal predictor. Two specific values of am
: 0 and 1. Two specific values of hp
: 100 and 101.
We have to fill actual data frames with those specific values, and feed it to predict()
to get the output. We cannot insert 100 and 101
into cell, because predict()
won't give us a prediction.
comparisons()
is designed for G-computation, that is, to estimate the effect of changes between observed levels of focal predictors. It cannot do weighted contrasts between marginal means. For that, you really need to go through predictions()
and hypothesis()
An alternative would be to convert marginaleffects objects to emmgrids - this is surprisingly easy
This is super cool! I had no idea.
But what is the value proposition here? If someone is deep enough into contrasts to know those more advanced emmeans
commands, why wouldn’t they use emmeans
from the start? What are the new features that they couldn’t get before?
My mental model for comparisons()
is exactly what you wrote - I still think it could work, though not sure it would be worth the effort?
mtcars$cyl <- factor(mtcars$cyl)
mod <- glm(am ~ cyl + hp,
family = binomial(),
data = mtcars)
d1 <- data.frame(hp = 100:101, cyl = "4")
d2 <- data.frame(hp = 100:101, cyl = "6")
d3 <- data.frame(hp = 100:101, cyl = "8")
w <- c(-2, 1, 1)/2
cbind(predict(mod, d1, type = "response"),
predict(mod, d2, type = "response"),
predict(mod, d3, type = "response")) %*% w
#> [,1]
#> 1 -0.7293757
#> 2 -0.7304380
But what is the value proposition here?
Well, {marginaleffects} supports more models (afaik), and produces the estimates (adjusted predictions, AMEs...) using a completely different approach.
Aaah, I see, thanks for clarifying!
To summarize, here are the proposals I've seen in this thread, and my current assessment:
comparisons()
accepts 3 values and a function in `variablesThe idea of allowing more than two values in the counterfactual comparisons is interesting, but I'm afraid it is not very realistic. The two counterfactuals assumption is deeply embeded in the architecture design, and changing that would be a massive endeavour. Think: 100+ hours of work.
Unfortunately, I don't have the bandwidth for that, and I don't think anyone else currently knows the code base well-enough to jump into that.
(I'd love to get more contributors! But starting on smaller projects makes more sense.)
by=
ergonomicsMake it more ergonomic to collapse categories, without having to write data frames by hand.
I like this idea a lot and would love to see that. I don't currently have an idea for how to make this easier, but happy to consider alternatives to data frames. List of lists feel even more complicated, but maybe there's a solution somewhere.
I think that's a very common use-case, so it would be worth it to invest some time in finding a solution here.
hypothesis=
helper functionThat's your original proposal. hypothesis
accepts arbitrary functions, which is powerful but cumbersome. We could implement a helper function to make this easier.
Once upon a time, I tried to do exactly that. See here.
At first, this felt like a good idea, but I quickly hit a wall. There are so many challenges: (1) people's needs are infinite, so the helper needs to be super flexible; (2) automatically labelling contrasts is challengingl; (3) finding a simple but ultra-flexible syntax is hard.
In the end, all the helper functions I wrote had tons of weird arguments. Learning all those arguments and domain-specific syntax just felt more complicated than just defining a custom contrast with dplyr
or base R
functions.
So I gave up.
I'm not closed to the idea. But your contr()
already has 4 arguments, and nothing to handle labels, weights, or cross-contrasts. We'll need to at least double the number of arguments, or really constrain the possibilities.
Maybe we can make that work, but I worry that we'll end up with either (a) a helper with so many arguments that it isn't more convenient than just writing a custom function, or (b) some magic-looking function call that can't be understood by anyone who reads the code, unless they spent hours reading our documentation first.
To be clear, I'm not saying this is impossible. I was just unable to overcome the challenges when I tried.
emmeans
objects to marginaleffects
Frankly, I'm not yet convinced that there's that much added value here, because emmeans
heroes should probably just keep using it (I don't want to convince anyone to shift!).
But if it's that easy, then maybe we can document this in a vignette. There's not much point in exporting a function that is just a one-liner, but showing this solution to users in documentation might be helpful.
df
The current document solution for this is to use the df
argument:
library(marginaleffects)
mod <- lm(mpg ~ cyl * am, mtcars)
avg_predictions(mod, df = df.residual(mod))
I actually kind of like this. It makes clear exactly where the degrees of freedom come from. It does not require users to type a lot of characters.
I'm once again trying to switch to {marginaleffects}, but have found two cases that are buggy or still very hard to execute (sorry for mixing a bug report with a "discussion"):
1. Simple effects (conditional joint tests)
2. Interaction contrasts
Created on 2024-10-14 with reprex v2.1.1
3. Automatic dfs?
Currently all tests are, by default, wald-z tests, which can be counter-conservative (especially in small samples). Are there any plans for automatic df extraction from the model?