Open hfrick opened 6 months ago
Do you anticipate any downsides to instead fully switching to registering kernlab::ksvm() via its XY interface? The same question would go for coxnet_train() as well, I guess.
I'm not familiar with kernlab so can't give a qualified answer on that right now :)
For glmnet/coxnet: 😬
There is a fundamental design clash with respect to stratification. glmnet expects the response to be stratified, which would mean that we would not have stratification information available at prediction time with tidymodels. To get around that, coxnet_train() handles the translation of stratification.
Having multiple interfaces might be nice once we have sparse tibble support. All sparsity should be done using _xy, but a given model might perform better non-sparse using a formula interface.
+1 on the sparsity comment - one example (of probably a few more?) is https://github.com/tidymodels/censored/issues/276
With the work I'm doing in https://github.com/tidymodels/parsnip/pull/1165 and https://github.com/tidymodels/parsnip/issues/1125, I can make it work, but having a dgCMatrix as an interface would make some of the code clearer, as right now I'm forced to make changes in other places to make things work.
We currently only allow one interface of an engine, set by set_fit(). Some engines have multiple interfaces themselves but we don't leverage that. This SO post runs into troubles with the formula interface of kernlab::ksvm() which could be resolved by using the matrix interface of the kernlab function. The workflow does use the tidymodels matrix interface but eventually translates it to the formula interface of kernlab because that's how it's registered in parsnip. This single translation point from parsnip to engine is also a challenge for https://github.com/tidymodels/censored/issues/311
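For illustration, here is a minimal sketch of the two kernlab interfaces mentioned above, called directly rather than through parsnip. The data set (iris), the kernel, and the fixed sigma value are assumptions chosen for the example and are not taken from the SO post.

```r
# Sketch only: kernlab::ksvm() called via its formula and its matrix interface.
# iris, the RBF kernel, and sigma = 0.1 are illustrative assumptions.
library(kernlab)

set.seed(1)

# Formula interface: the interface parsnip currently registers for this engine.
fit_formula <- ksvm(
  Species ~ .,
  data = iris,
  kernel = "rbfdot",
  kpar = list(sigma = 0.1)
)

# Matrix (x/y) interface: predictors as a numeric matrix, outcome as a factor.
x <- as.matrix(iris[, 1:4])
y <- iris$Species
fit_matrix <- ksvm(
  x = x,
  y = y,
  kernel = "rbfdot",
  kpar = list(sigma = 0.1)
)

# Predictions take new data in the same shape that was used for fitting.
pred_formula <- predict(fit_formula, iris[, 1:4])
pred_matrix <- predict(fit_matrix, x)
```

With only one registered interface per engine, parsnip would always end up at the first call, even when the user supplies data via the matrix interface on the tidymodels side.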
Created on 2024-04-23 with reprex v2.1.0