I fit a GAM with a spline term defined by s() with a k argument. When k is a literal number, find_predictors() works fine. When k is a variable (storing a number), find_predictors() erroneously returns that variable's name as a predictor.
I anticipate that this might be pretty tricky to disambiguate using regular expressions. In the example below, the correct answer is: x, z. We do not want to get w, which is what the second call returns.
Is there a relatively clean way to fix this?
library(mgcv)
set.seed(123)
n = 500
xn = rep(c(1, 2, 3), n)
levels = sort(unique(xn))
labels = c("low", "med", "high")
x = factor(xn, levels = levels, labels = labels)
z = sample(c(1, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 7, 7), size = length(x), replace = TRUE)
y.raw = xn * z
e = rnorm(length(x), sd = sd(y.raw))
y = y.raw + e
data = data.frame(x, y, z)
w = 3
m1 = try(gam(y ~ s(z, by = x, k = 3) + x, data = data), silent = TRUE)
m2 = try(gam(y ~ s(z, by = x, k = w) + x, data = data), silent = TRUE)
insight::find_predictors(m1)
# $conditional
# [1] "z" "x"
insight::find_predictors(m2)
# $conditional
# [1] "z" "x" "w"
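One possible direction, sketched here with base R only and not taken from insight's actual implementation: instead of regular expressions, parse the formula with all.vars() and keep only the symbols that are columns of the model data. The formula and column names below are copied from the example above; the assumption is that the data frame passed to gam() is available when predictors are looked up.

```r
# Formula from the second model above; `w` is a plain variable holding k.
f <- y ~ s(z, by = x, k = w) + x

# Column names of the data frame passed to gam() in the example.
data_cols <- c("x", "y", "z")

# Every symbol on the right-hand side of the formula, including `w`.
rhs_symbols <- all.vars(f[[3]])

# Keep only the symbols that are actual columns of the data.
predictors <- intersect(rhs_symbols, data_cols)
print(predictors)
# [1] "z" "x"
```

This would misfire if a non-data variable such as w shared its name with a column, or if the data cannot be recovered from the model. For mgcv models specifically, the fitted object also stores a pred.formula element listing the variables needed for prediction, which may be a more direct source; whether it fits insight's internals is something I have not checked.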
Initially reported by @urisohn here: https://github.com/vincentarelbundock/marginaleffects/issues/1031