Closed hfrick closed 11 months ago
I tried to understand what should happen when parsnip does the translation from formula to matrix interface so I looked at a glmnet model without censored being involved.
While everything runs, it does add an intercept to the terms which I think isn't right. It does not happen when fitting the same model through parsnip without workflows (or even just pure glmnet).
It doesn't error at prediction time (like in the censored reprex) because .convert_form_to_xy()
removes only the intercept column it added itself, not the one added by worflows via mold().
So I'm going to park this while investigating further in workflows.
Note for coming back: coxnet_prepare_x()
may well still need to mimick parsnip:::prepare_data()
more closely.
library(parsnip)
library(workflows)
wflow_fit <-
workflow() %>%
add_formula(mpg ~ .) %>%
add_model(linear_reg(penalty = 0.1, engine = "glmnet"), formula = mpg ~ .) %>%
fit(data = mtcars)
# this has a coefficient for an intercept column
rownames(wflow_fit$fit$fit$fit$beta)
#> [1] "`(Intercept)`" "cyl" "disp" "hp"
#> [5] "drat" "wt" "qsec" "vs"
#> [9] "am" "gear" "carb"
# when we fit the same model through parsnip alone, it does not
parsnip_fit <- linear_reg(penalty = 0.1, engine = "glmnet") %>% fit(mpg ~ ., data = mtcars)
rownames(parsnip_fit$fit$beta)
#> [1] "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"
# nor does the pure glmnet fit
glmnet_fit <- glmnet::glmnet(x = as.matrix(mtcars[, -1]), y = mtcars$mpg)
rownames(glmnet_fit$beta)
#> [1] "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"
Created on 2023-11-24 with reprex v2.0.2
This gets solved by https://github.com/tidymodels/parsnip/pull/1033, I'll bump the version requirement for parsnip after that one is merged.
Follow-up re: Do we still need changes to coxnet_prepare_x()
? I don't think we need changes right now. parsnip::prepare_data()
does 3 things after to potential call to a conversion function:
Because new_data
passes through parsnip::prepare_data()
before it gets to censored:::coxnet_prepare_x()
, any intercept coming from workflows gets removed in parsnip. Thus no change needed here in censored.
Sparse matrices are their own issue: #276
coxnet_prepare_x()
already converts to a matrix for glmnet, no change needed.
An intercept coming from workflows gets removed in parsnip's prepare_data()
before we arrive at coxnet_prepare_x()
https://github.com/tidymodels/parsnip/pull/1033 is merged - I'm going to close this. I decided to not bum the version requirement because this is "simply" a bug in parsnip and doesn't have anything (directly) to do with censored so it would be "off-pattern" to bump censored's version requirement.
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
This might be a workflows issue but I suspect we need to tweak censored to make the glment engine play nicely with workflows.
Created on 2023-11-16 with reprex v2.0.2