tidymodels / embed

Extra recipes for predictor embeddings
https://embed.tidymodels.org

Extend to general classification problems? #3

Closed (alexpghayes closed this issue 4 years ago)

alexpghayes commented 6 years ago

For prediction problems with K classes, it seems like a reasonable generalization would be to create K - 1 new predictor columns of class probabilities.

In the unpooled case, nnet::multinom would be an option, at the cost of another dependency. I haven't actually played around with keras yet, but it might be possible to get a softmax fit that way without adding a new dependency. Some small amount of regularization may be necessary, if I recall correctly.
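To make the unpooled idea concrete, here is a minimal sketch using nnet::multinom on made-up data. The data frame, the column names `x` and `y`, and the `decay` value are all assumptions for illustration; this is not how embed implements anything. It fits the outcome on a single categorical predictor and keeps K - 1 class-probability columns per predictor level.

```r
library(nnet)  # ships with R as a recommended package

set.seed(1)
# Toy data (hypothetical): a 4-level categorical predictor and a 3-class outcome.
dat <- data.frame(
  x = factor(sample(c("a", "b", "c", "d"), 200, replace = TRUE)),
  y = factor(sample(c("low", "mid", "high"), 200, replace = TRUE))
)

# Unpooled multinomial fit; `decay` adds a small weight-decay (L2) penalty.
# The value 0.01 is an arbitrary choice for this sketch.
fit <- multinom(y ~ x, data = dat, decay = 0.01, trace = FALSE)

# One row of class probabilities per level of the predictor.
lvls <- data.frame(x = factor(levels(dat$x), levels = levels(dat$x)))
probs <- predict(fit, newdata = lvls, type = "probs")

# Keep K - 1 columns; the dropped class is implied because each row sums to 1.
encoding <- data.frame(x = lvls$x, probs[, -1, drop = FALSE])
```

Each level of `x` would then be replaced by its row of `encoding`, so a 3-class outcome yields two new numeric predictor columns.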

In the partially pooled case, there's family = "categorical" in brms, or potentially K-1 binary fits from stan_glmer or glmer. In the latter case it'd probably be best to use K-1 binary fits for the unpooled case as well for consistency.
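A rough sketch of the K - 1 binary fits idea, again on made-up data. It assumes each binary fit is "this class vs. everything else", which is only one possible reading, and it uses base R's glm for the unpooled version. A partially pooled analog would presumably swap in a mixed model such as lme4::glmer with a random intercept for the predictor, which is not shown here.

```r
set.seed(2)
# Toy data (hypothetical): a 4-level categorical predictor and a 3-class outcome.
dat <- data.frame(
  x = factor(sample(c("a", "b", "c", "d"), 300, replace = TRUE)),
  y = factor(sample(c("low", "mid", "high"), 300, replace = TRUE))
)
lvls <- data.frame(x = factor(levels(dat$x), levels = levels(dat$x)))

# K - 1 classes get their own binary fit; the first outcome level is skipped.
classes <- levels(dat$y)[-1]

# Assumption: each binary model is one-vs-rest for its class.
encoding <- sapply(classes, function(cls) {
  dat$is_cls <- as.integer(dat$y == cls)  # modifies a local copy only
  fit <- glm(is_cls ~ x, data = dat, family = binomial)
  predict(fit, newdata = lvls, type = "response")
})
rownames(encoding) <- levels(dat$x)
```

Unlike the multinomial version, the columns here come from separate models, so the implied probabilities across all K classes are not guaranteed to sum to 1.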

Haven't used this personally so would love to hear from someone in the know if this would actually be useful.

topepo commented 6 years ago

Multiclass was the next step. multinom would work but we do have tensorflow just sitting around so we should use that.

The API is a little more difficult. It's not really a generalized linear model (not a big deal), but we would need to modify step_embed's options argument to work with both stan options and keras options. Again, not impossible, but it is getting more complicated.

We can add the amount of L2 regularization in the options too.

topepo commented 4 years ago

Not happening right now. Please add a PR if interested.
