Closed singmann closed 5 years ago
True -- emtrends
computes just linear trends. It's possible to get a test of the quadtratic trend like this:
> emm = emmeans(m1, ~ amf * hp, at = list(hp = -1:1))
> contrast(emm, "poly", by = "amf")
amf = 0:
contrast estimate SE df t.ratio p.value
linear -0.261818 0.146995 26 -1.781 0.0866
quadratic 0.000449 0.000454 26 0.990 0.3315
amf = 1:
contrast estimate SE df t.ratio p.value
linear -0.357789 0.097427 26 -3.672 0.0011
quadratic 0.000637 0.000253 26 2.513 0.0185
Hope that helps
Russ
Hmm, I am not sure I fully understand the scaling or unit of this output. It does not match the output above which seems to be on the original hp
scale, This can be seen by implementing this model by hand without poly
:
library("tidyverse")
mtcars <- mtcars %>%
mutate(hp_c = hp - mean(hp)) %>%
mutate(hp_c2 = hp_c^2)
m2 <- lm(mpg ~ amf*(hp_c + hp_c2), mtcars)
emtrends(m2, "amf", var = "hp_c")
# amf hp_c.trend SE df lower.CL upper.CL
# 0 -0.0651 0.0133 26 -0.0924 -0.0377
# 1 -0.0855 0.0142 26 -0.1146 -0.0564
#
# Confidence level used: 0.95
This matches exactly the output from emtrends
on m1
if an appropriately small value of delta.var
is chosen (also suggesting that the default value of delta.var
might be a tad too big in this case):
emtrends(m1, "amf", var = "hp", delta.var = 0.001)
# amf hp.trend SE df lower.CL upper.CL
# 0 -0.0651 0.0133 26 -0.0924 -0.0377
# 1 -0.0855 0.0142 26 -0.1146 -0.0564
#
# Confidence level used: 0.95
Likewise, the quadratic trends also do not match this new output.
emtrends(m2, "amf", var = "hp_c2")
# amf hp_c2.trend SE df lower.CL upper.CL
# 0 0.000224 0.000227 26 -0.000242 0.000691
# 1 0.000318 0.000127 26 0.000058 0.000579
#
# Confidence level used: 0.95
Note that I said, "It's possible to get a test of ..." -- never claimed it'd estimate them on the right scale. What you have is estimates of contrasts of the means, which you can see via:
> emmeans:::poly.emmc(1:3)
linear quadratic
1 -1 1
2 0 -2
3 1 1
If you scale the same contrasts differently, you get the results you just showed:
contrast(emm, list(lin = c(-1,0,1)/2, quad = c(1,-2,1)/2), by = "amf")
amf = 0:
contrast estimate SE df t.ratio p.value
lin -0.065045 0.013288 26 -4.895 <.0001
quad 0.000224 0.000227 26 0.990 0.3315
amf = 1:
contrast estimate SE df t.ratio p.value
lin -0.085512 0.014158 26 -6.040 <.0001
quad 0.000318 0.000127 26 2.513 0.0185
I'll consider what might be done to expand emtrends
beyond just linear trends, but it doesn't seem like the most compelling unmet need at this time.
Thanks for the clarification and the new code. This is exactly what I was looking for. So this issue could be closed.
But for what it is worth, there have been several times in the last year where I would have needed this feature. So either I am performing somewhat unusual analyses or other people who would like that feature are just not vocal about it. Anyway, easier access to this functionality would be much appreciated.
Henrik,
I'll leave it open, as this is something to work toward. May I ask...
degree
factor), or to obtain the different degrees in separate calls? Hi Russ,
Thanks for considering that further.
emtrends(m1, "amf", var = "hp", degree = 1)
and emtrends(m1, "amf", var = "hp", degree = 2)
seems like a great idea. The only issue I can see with this separation is if I want simultaneous p-value adjustment across both degrees.Following this thread with interest, as indeed this feature could be very useful.
The only issue I can see with this separation is if I want simultaneous p-value adjustment across both degrees.
Adding my two cents here; maybe the degree arg could accept a vector: emtrends(m1, "amf", var = "hp", degree = c(1, 2))
FWIW, I realized there's a straightforward way to get the scaling right for the quadratic trend: Just do a difference quotient on the slopes at two slightly different values of the covariate:
> emt <- emtrends(m1, ~ hp | amf, "hp", at = list(hp = c(99.999, 100)))
> contrast(emt, list(quad.trend = c(-1,1)*500))
amf = 0:
contrast estimate SE df t.ratio p.value
quad.trend 0.000224 0.000227 26 0.990 0.3315
amf = 1:
contrast estimate SE df t.ratio p.value
quad.trend 0.000318 0.000127 26 2.513 0.0185
OK, I have something that seems to work. More testing needed before I push it to github; you'll get a notice when I do.
> emtrends(m1, "amf", "hp", max.order = 2)
amf hp order hp.trend SE df lower.CL upper.CL
0 147 first -0.064987 0.013262 26 -9.22e-02 -0.037727
1 147 first -0.085430 0.014134 26 -1.14e-01 -0.056377
0 147 second 0.000224 0.000227 26 -2.42e-04 0.000691
1 147 second 0.000318 0.000127 26 5.79e-05 0.000579
Confidence level used: 0.95
I think I have this working correctly, and added a patch to make it work for multivariate models as well. So am closing this issue. But feel free to comment if anything comes up.
(PS -- I might change "order" to "degree"; my thinking for "order" was the order of the derivative being computed; but "degree" makes sense in terms of linear, quadratic, cubic trends.)
I know this is an old thread, I'll give it a go at reporting what seems to be a bug.
The above code works fine using poly
within an lm
framework, but fails in a gls
framework:
library(nlme)
mtcars$amf <- factor(mtcars$am)
m1 <- lm(mpg ~ amf*poly(hp, 2), mtcars)
m2 <- gls(mpg ~ amf*poly(hp, 2),
weights = varIdent(form = ~ 1|amf), data = mtcars)
> emtrends(m1, "amf", "hp", max.order = 2)
amf hp.trend SE df lower.CL upper.CL
0 -0.0650 0.0133 26 -0.0922 -0.0377
1 -0.0854 0.0141 26 -0.1145 -0.0564
> emtrends(m2, "amf", "hp", max.order = 2)
Error in poly(hp, 2) : 'degree' must be less than number of unique points
Any chance someone could help here?
Thanks
Thanks - I'll look at this when I can. Lots of other stuff going on right now... Russ
The difficulty is this. With m1
,
> terms(m1)
mpg ~ amf * poly(hp, 2)
attr(,"variables")
list(mpg, amf, poly(hp, 2))
attr(,"factors")
amf poly(hp, 2) amf:poly(hp, 2)
mpg 0 0 0
amf 1 0 1
poly(hp, 2) 0 1 1
attr(,"term.labels")
[1] "amf" "poly(hp, 2)" "amf:poly(hp, 2)"
attr(,"order")
[1] 1 1 2
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(mpg, amf, poly(hp, 2, coefs = list(alpha = c(146.6875, 198.071514938048
), norm2 = c(1, 32, 145726.875, 977168457.086594))))
attr(,"dataClasses")
mpg amf poly(hp, 2)
"numeric" "factor" "nmatrix.2"
while with m2
,
> terms(m2)
mpg ~ amf * poly(hp, 2)
attr(,"variables")
list(mpg, amf, poly(hp, 2))
attr(,"factors")
amf poly(hp, 2) amf:poly(hp, 2)
mpg 0 0 0
amf 1 0 1
poly(hp, 2) 0 1 1
attr(,"term.labels")
[1] "amf" "poly(hp, 2)" "amf:poly(hp, 2)"
attr(,"order")
[1] 1 1 2
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: 0x000001b2108b2f50>
Note that the terms
component of m1
has an attr(,"predvars")
component that gives information from the Gram-Schmidt stuff needed to recreate the polynomial, and that is lacking in the terms for m2
. Without that, we can't do much, and in particular, the model.matrix
function doesn't even know that a polynomial was even fitted previously, which is why you get the error message.
The hopelessnessw of this is accentuated by the fact that nlme itself can't handle this model:
> newdata = ref_grid(m1)@grid # just a quick way to create a new dataset
> newdata
amf hp .wgt.
1 0 146.6875 19
2 1 146.6875 13
> predict(m1, new = newdata)
1 2
17.37105 20.49409
> predict(m2, new = newdata)
Error in poly(hp, 2) : 'degree' must be less than number of unique points
So I'm afraid that's [pretty much the end of the road, unless I put in some sort of messy hack that looks for poly()
terms and gets the needed special sauce from lm()
or somewhere.
PS - this is really a bug in nlme. I can report it, but my experience is that nobody ever responds to bug reports for that package.
Thank you so much for looking into this!
I hadn't noticed the bug with nlme
, as I was extracting fitted values only. This is worrying! I will definately report it, even if it falls on deaf ears...
I ended up trying something like this, but am unsure whether this is statistically correct.
> mtcars$hp_poly1 <- poly(mtcars$hp, 2)[, 1]
> mtcars$hp_poly2 <- poly(mtcars$hp, 2)[, 2]
> m3 <- gls(mpg ~ amf*(hp_poly1 + hp_poly2),
+ weights = varIdent(form = ~ 1|amf), data = mtcars)
> identical(predict(m2), predict(m3))
[1] TRUE
> newdata <- ref_grid(m3)@grid # just a quick way to create a new dataset
> newdata
amf hp_poly1 hp_poly2 .wgt.
1 0 0 7.024021e-18 30.16073
2 1 0 7.024021e-18 13.00000
> predict(m3, new = newdata) ## now able to predict
[1] 18.39336 21.94351
attr(,"label")
[1] "Predicted values"
> emtrends(m3, specs = "amf", var = "hp_poly1", mode = "df.error")
amf hp_poly1.trend SE df lower.CL upper.CL
0 -20.4 3.99 24 -28.7 -12.2
1 -26.4 4.91 24 -36.5 -16.3
Degrees-of-freedom method: df.error
Confidence level used: 0.95
> emtrends(m3, specs = "amf", var = "hp_poly2", mode = "df.error")
amf hp_poly2.trend SE df lower.CL upper.CL
0 7.02 5.63 24 -4.610 18.6
1 9.95 4.99 24 -0.354 20.3
Degrees-of-freedom method: df.error
Confidence level used: 0.95
I think this will NOT work because the increments in the difference quotients are different for the two degrees, and neither of them right w.r.t. hp. However, you can specify a model like amf*(hp + I(hp^2)) and that will be OK (though not as well-conditioned).
Thanks for your help.
The issue with fitting a model as amf*(hp + I(hp^2))
is that hp
and I(hp^2)
are not orthogonal (which is ensured by poly(..., raw = FALSE)
) -- maybe that is what you meant by "not as well conditioned".
I'll have to think about this more carefully.
Centering or nearly centering will improve the numerical conditioning a lot -- e.g., ~ amf*(I(hp - 150) + I((hp - 150)^2))
. But the really important thing is you actually have to have the variable hp
in both terms if you want the estimated trends to be correct.
R uses pretty high-quality code for fitting linear models, so numerical conditioning is not too big a deal. If you were fitting up to hp^4
or beyond, that'd be a different story.
This is great help, thanks!
@rvlenth, I'm reactivating this thread because our previous discussion (and subsequent stackoverflow post) resulted in nlme
maintainers fixing the nlme
bug (🎉), where attr(, "prevars")
was missing from a gls
model object's terms
.
However, it still seems like there's something amiss when I try to run emtrends
on a gls
with a polynomial term:
I'll post the same reprex as before:
> library(nlme)
> library(emmeans)
Warning message:
package ‘emmeans’ was built under R version 4.2.1
> mtcars$amf <- factor(mtcars$am)
> m1 <- lm(mpg ~ amf*poly(hp, 2), mtcars)
> m2 <- gls(mpg ~ amf*poly(hp, 2),
+ weights = varIdent(form = ~ 1|amf), data = mtcars)
> attr(terms(m1), "predvars")
list(mpg, amf, poly(hp, 2, coefs = list(alpha = c(146.6875, 198.071514938048
), norm2 = c(1, 32, 145726.875, 977168457.086594))))
> attr(terms(m2), "predvars")
list(mpg, amf, poly(hp, 2, coefs = list(alpha = c(146.6875, 198.071514938048
), norm2 = c(1, 32, 145726.875, 977168457.086594))))
> emtrends(m1, "amf", "hp", max.order = 2)
amf hp.trend SE df lower.CL upper.CL
0 -0.0650 0.0133 26 -0.0922 -0.0377
1 -0.0854 0.0141 26 -0.1145 -0.0564
Confidence level used: 0.95
> emtrends(m2, "amf", "hp", max.order = 2)
Error in poly(hp, 2) : 'degree' must be less than number of unique points
Here's my session info as well:
> sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.utf8 LC_CTYPE=English_Canada.utf8 LC_MONETARY=English_Canada.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Canada.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] emmeans_1.8.1-1 nlme_3.1-157
loaded via a namespace (and not attached):
[1] compiler_4.2.0 estimability_1.4.1 tools_4.2.0 mvtnorm_1.1-3 grid_4.2.0 xtable_1.8-4
[7] lattice_0.20-45
Well, this puts me between a rock and a hard place. I can fix this, but it creates another error:
> emtrends(m2, "amf", "hp", max.order = 2)
Error in model.matrix.default(eval(call$model), data = data) :
model frame and formula mismatch in model.matrix()
Error: Can't estimate Satterthwaite parameters.
Try adding the argument 'mode = "appx-satterthwaite"'
If I do what it says...
> emtrends(m2, "amf", "hp", max.order = 2, mode = "appx")
Error: Can't estimate Satterthwaite parameters.
Try adding the argument 'mode = "df.error"'
If I do what that says,
> emtrends(m2, "amf", "hp", max.order = 2, mode = "df.error")
amf hp.trend SE df lower.CL upper.CL
0 -0.0650 0.0105 25 -0.0867 -0.0433
1 -0.0854 0.0178 25 -0.1221 -0.0487
Degrees-of-freedom method: df.error
Confidence level used: 0.95
So fixing this problem with poly
terms creates errors for all gls models when we use Satterthwaite or approximate Satterthwaite.
That is way too high a price, unless I can think of some other workaround.
I appreciate you taking the time to look at this again, @rvlenth (and I wish I could move the rock away ;) !). My knowledge of stats is at its limits wrt to this, am I to understand that the mode = "df.error"
solution is not resulting in good estimates?
This seems to be the best I can do so far...
> emtrends(m2, "amf", "hp", max.order = 2)
Note: Satterthwaite d.f. unavailable when model r.h.s. involves functions
amf hp.trend SE df lower.CL upper.CL
0 -0.0650 0.0105 25 -0.0867 -0.0433
1 -0.0854 0.0178 25 -0.1221 -0.0487
Degrees-of-freedom method: df.error
Confidence level used: 0.95
(That is, we force away from Satterthwaite methods.)
You may use a specified non-Satterthwaite mode, or specify the df manually, e.g.
> emtrends(m2, "amf", "hp", max.order = 2, df = 12)
amf hp.trend SE df lower.CL upper.CL
0 -0.0650 0.0105 12 -0.0879 -0.0420
1 -0.0854 0.0178 12 -0.1243 -0.0466
Degrees-of-freedom method: user-specified
Confidence level used: 0.95
There is nothing wrong with using mode = "df.error" as far as estimation is concerned. The estimates are not "bad"; we just may be off in estimating the associated degrees of freedom.
Hmmmm, well I am back to square one. It turns out that this is something emtrends()
-specific. After disabling the trap I just implemented, I do fine with orinary emmeans with Satterthwaite:
> emmeans(m2, ~amf|hp, at=list(hp=c(142,144)))
hp = 142:
amf emmean SE df lower.CL upper.CL
0 17.7 0.683 16 16.2 19.1
1 20.9 1.624 10 17.3 24.5
hp = 144:
amf emmean SE df lower.CL upper.CL
0 17.5 0.687 16 16.1 19.0
1 20.7 1.650 10 17.1 24.4
Degrees-of-freedom method: satterthwaite
Confidence level used: 0.95
So now I really don't see what the problem is.
By the way, the correct argument is max.degree
, not max.order
, and it doesn't make sense to use it if you don't include degree
as a variable of interest. With model m1
, we have
> emtrends(m1, ~ amf | degree, "hp", max.degree = 2)
degree = linear:
amf hp.trend SE df lower.CL upper.CL
0 -0.065114 0.013319 26 -9.25e-02 -0.037736
1 -0.085610 0.014188 26 -1.15e-01 -0.056446
degree = quadratic:
amf hp.trend SE df lower.CL upper.CL
0 0.000224 0.000227 26 -2.42e-04 0.000691
1 0.000318 0.000127 26 5.79e-05 0.000579
Confidence level used: 0.95
OK, maybe I have it now...
> emtrends(m2, ~ amf | degree, "hp", max.degree = 2)
degree = linear:
amf hp.trend SE df lower.CL upper.CL
0 -0.065114 0.01058 16 -8.75e-02 -0.042683
1 -0.085610 0.01789 10 -1.25e-01 -0.045745
degree = quadratic:
amf hp.trend SE df lower.CL upper.CL
0 0.000224 0.00018 16 -1.58e-04 0.000607
1 0.000318 0.00016 10 -3.76e-05 0.000674
Degrees-of-freedom method: satterthwaite
Confidence level used: 0.95
> emtrends(m2, ~ amf | degree, "hp", max.degree = 2, mode = "appx")
degree = linear:
amf hp.trend SE df lower.CL upper.CL
0 -0.065114 0.01058 15.89 -8.76e-02 -0.042671
1 -0.085610 0.01789 9.96 -1.25e-01 -0.045725
degree = quadratic:
amf hp.trend SE df lower.CL upper.CL
0 0.000224 0.00018 15.89 -1.58e-04 0.000607
1 0.000318 0.00016 9.96 -3.78e-05 0.000674
Degrees-of-freedom method: appx-satterthwaite
Confidence level used: 0.95
It was a very subtle issue with the way data were passed from emtrends()
that messed-up the Satterthwaite setup.
Wonderful! Thanks so much @rvlenth ! Glad that in the end the issue didn't require a huge change :)
When fitting a model with
poly()
term, one can still useemtrends
to get the trends of the variable. However, it seems this only returns the linear trend (i.e., polynomial of degree 1). I would like to get the trends for the higher degree polynomial terms. Example follows:This looks to me like the linear trend of
hp
. However, I would like to get the quadratic trends as well.PS: Sorry for submitting an unfinished issue at first.