Open gavinsimpson opened 4 years ago
Hello! Are there any plans to incorporate this option into smooth_estimates() any time soon? That would be super useful...
@barbmuhling the draw()
method for smooth_estimates()
has this (at least in the development version here on GH). If you mean to modify the values returned by smooth_estimates()
, I think I won't be adding support for that as you can already do what you want using transform_fun()
and a bit of dplyr fu:
library("mgcv")
library("gratia")
library("dplyr")
dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2)
m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")
my_const <- coef(m1)[1]
smooth_estimates(m1) |>
add_confint() |>
mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
.fns = \(x) x + my_const)) |>
transform_fun(fun = \(x) x^2)
to apply any function you want to the estimate and credible interval. (This is what is used internally by the draw()
methods.)
I could add something to transform_fun()
to allow you to pass the constant in as well, to avoid the mutate()
step?
Ah, this will work perfectly, thanks so much!
Hello Gavin, I came across this thread looking for help plotting by factors on top of each other. Apologies for the clunky code, but would this be the appropriate method?
Please let me know if this should go on stack overflow instread!
library("mgcv")
library("gratia")
library("dplyr")
dat <- data_sim("eg4", seed = 42)
m1 <- gam(y ~ fac + s(x2, by = fac), data = dat, method = "REML")
my_const <- coef(m1)[1]
my_constFac2 <- coef(m1)[2]
my_constFac3 <- coef(m1)[3]
parametric_effects(m1)
s2fac1 = smooth_estimates(m1) |>
filter(smooth == "s(x2):fac1") |>
add_confint() |>
mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
.fns = \(x) x + my_const))
s2fac2 = smooth_estimates(m1) |>
filter(smooth == "s(x2):fac2") |>
add_confint() |>
mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
.fns = \(x) x + my_const + my_constFac2))
s2fac3 =smooth_estimates(m1) |>
filter(smooth == "s(x2):fac3") |>
add_confint() |>
mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
.fns = \(x) x + my_const + my_constFac3))
ggplot() +
# Fac 1
geom_line(data = s2fac1, aes(x = x2, est), color = "red") +
geom_ribbon(data = s2fac1, aes(ymin = lower_ci, ymax = upper_ci, x = x2),
alpha = 0.3, fill = "red") +
# Fac 2
geom_line(data = s2fac2, aes(x = x2, est), color = "black") +
geom_ribbon(data = s2fac2, aes(ymin = lower_ci, ymax = upper_ci, x = x2),
alpha = 0.3, fill = "black") +
# Fac 3
geom_line(data = s2fac3, aes(x = x2, est), color = "blue") +
geom_ribbon(data = s2fac3, aes(ymin = lower_ci, ymax = upper_ci, x = x2),
alpha = 0.3, fill = "blue")
@tkiff That works, but only for simple examples like this with a single factor.
I think you're better off using data_slice()
to create the data for covariates that you want to evaluate the model at, including the values of other covariates in the model, and then use fitted_values()
to get the data you need to plot:
library("mgcv")
library("gratia")
library("dplyr")
library("ggplot2")
dat <- data_sim("eg4", seed = 42)
m1 <- gam(y ~ fac + s(x2, by = fac) + s(x0) + s(x1), data = dat, method = "REML")
ds <- data_slice(m1, x2 = evenly(x2), fac = evenly(fac))
fv <- fitted_values(m1, data = ds)
fv |>
ggplot(aes(x = x2, y = .fitted, group = fac)) +
geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, fill = fac), alpha = 0.2) +
geom_line(aes(colour = fac))
Notice that data_slice()
provides values for any covariates not specified but which are in the model. This forces you to be explicit about what you are conditioning on.
In this case, you can exclude the effects of the other smooths when generating fitted values
fv2 <- fitted_values(m1, data = ds, exclude = c("s(x0)", "s(x1)"))
fv2 |>
ggplot(aes(x = x2, y = .fitted, group = fac)) +
geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, fill = fac), alpha = 0.2) +
geom_line(aes(colour = fac))
There's no difference here because of how data_slice()
finds typical values for the unmentioned coavriates.
@tkiff
Please let me know if this should go on stack overflow instread!
If you have questions about gratia you could ask them on SO but better might be to use the Discussion here on GitHub: https://github.com/gavinsimpson/gratia/discussions
This is essentially complete except for the two stretch goals
Implement the equivalent of the
trans
andshift
arguments thatmgcv:::plot.gam()
has so that users can pass in a function (e.g. the inverse link function) and a constant (e.g. the intercept), to get plots on the response scale.Should these be arguments to the
evaluate_smooth()
methods or somethingdraw()
does internally?Either way, the confidence bands aren't going to include the uncertainty in the constant, soThis is done inshift
is purely to move the origin (y-axis) around.draw()
and the various methods.Want to have a simpler way for the user to express that they want the plots on the response scale. So need a new argument (
response_scale
?) that if set toTRUE
, automatically setconstant
andfun
to the correct values/functions such that we get values and intervals on the response scale. This will override anything passed toconstant
andfun
obviously.Should we include the overall uncertainty as we're bringing in the model constant here?
What about factor
by
smooths? These should really have the constant plus the parametric term for their level added as a constant.Checklist: