Add model indicator encodings

juliasilge commented 4 years ago

These parsnip-adjacent packages do need indicator encodings added to them, I believe. We could consider adding something in parsnip to set these as a default (probably "traditional"???) if there is nothing set in the adjacent package itself.

As of now, I have these indicator encodings set as:

library(discrim)
#> Loading required package: parsnip

c("naive_Bayes",
  "discrim_linear",
  "discrim_flexible",
  "discrim_regularized") %>%
  purrr::map_dfr(parsnip::get_encoding) %>%
  knitr::kable()

model	engine	mode	predictor_indicators
naive_Bayes	klaR	classification	none
naive_Bayes	naivebayes	classification	none
discrim_linear	MASS	classification	traditional
discrim_linear	mda	classification	traditional
discrim_flexible	earth	classification	traditional
discrim_regularized	klaR	classification	traditional

^{Created on 2020-06-29 by the reprex package (v0.3.0.9001)}

I don't believe naive Bayes needs indicator variables at all, based on how the underlying functions treat factors. However, this is a pretty big change from how discrim has treated these up until now, as evidenced by the tests. The tests use model.matrix() like this: https://github.com/tidymodels/discrim/blob/e306147232fe32351b1cc33ce248e031e6b8ccaa/tests/testthat/test-naive-Bayes.R#L24 Which makes traditional dummy variables. We would need to rewrite the tests for the naive Bayes models if we do in fact want to keep them as "none" for predictor indicators.

Thoughts @topepo?

juliasilge commented 4 years ago

This is ready to go with the new parsnip branch:

library(discrim)
#> Loading required package: parsnip

c("naive_Bayes",
  "discrim_linear",
  "discrim_flexible",
  "discrim_regularized") %>%
  purrr::map_dfr(parsnip::get_encoding) %>%
  knitr::kable()

model	engine	mode	predictor_indicators	compute_intercept	remove_intercept
naive_Bayes	klaR	classification	none	FALSE	FALSE
naive_Bayes	naivebayes	classification	none	FALSE	FALSE
discrim_linear	MASS	classification	traditional	TRUE	TRUE
discrim_linear	mda	classification	traditional	TRUE	TRUE
discrim_flexible	earth	classification	traditional	TRUE	TRUE
discrim_regularized	klaR	classification	traditional	TRUE	TRUE

^{Created on 2020-06-30 by the reprex package (v0.3.0.9001)}

github-actions[bot] commented 3 years ago

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

tidymodels / discrim

Add model indicator encodings #10