tidymodels / brulee

High-Level Modeling Functions with 'torch'
https://brulee.tidymodels.org/

More activation functions #74

Closed. topepo closed this pull request 10 months ago.

topepo commented 10 months ago

Closes #69

@christophscheuch and @dfalbel, add more to the list! For now, I'm avoiding any significantly parameterized functions.
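For illustration, a minimal sketch of how an activation is selected from the user's side, assuming the current `brulee_mlp()` interface where the activation is passed by name ("elu" stands in for any of the candidates under discussion):

```r
library(brulee)

# Minimal sketch: fit a small MLP and pick an activation by name.
# "elu" is illustrative; any supported activation string would work here.
set.seed(1)
fit <- brulee_mlp(mpg ~ ., data = mtcars,
                  hidden_units = 8, activation = "elu", epochs = 50)
predict(fit, mtcars)
```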

christophscheuch commented 10 months ago

Thanks for picking it up!

I just checked the torch activation functions, and I think we could add all of them except nn_multihead_attention and nn_threshold, since all of the others have default values for their arguments.

Here is a complete list (excluding multihead_attention and threshold); the ones with ✔️ are already in the PR:

If supporting all of these is too much (or too opaque for users, since they don't see the defaults), then I'd suggest adding only softmin and softmax, since in my experience they are used frequently.

topepo commented 10 months ago

For now, I'm trying to avoid those with significant tuning parameters; in other words, we can include the ones whose defaults are pretty good. I could use some help making that determination. For example, would users really want to tune the upper and lower uniform bounds in nn_rrelu()? I'll eventually add a way to pass parameters in, but not right now.

The softmax functions are questionable (from a programmatic point of view). They don't take the same primary argument that the others do (but maybe that's not a big deal).
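To make both points concrete, a small torch sketch (the defaults shown are torch's own, not brulee's):

```r
library(torch)
x <- torch_randn(2, 3)

# nn_rrelu() carries tuning-style arguments; torch's defaults are
# lower = 1/8 and upper = 1/3.
rr <- nn_rrelu(lower = 1 / 8, upper = 1 / 3)
rr(x)

# Most activation modules can be constructed with no arguments at all...
relu <- nn_relu()
relu(x)

# ...but nn_softmax() requires a `dim` argument, which is what makes the
# softmax family awkward to treat like the others.
sm <- nn_softmax(dim = 2)
sm(x)
```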

christophscheuch commented 10 months ago

> For now, I'm trying to avoid those with significant tuning parameters; in other words, we can include the ones whose defaults are pretty good. I could use some help making that determination. For example, would users really want to tune the upper and lower uniform bounds in nn_rrelu()? I'll eventually add a way to pass parameters in, but not right now.

Frankly, I find it very hard to judge the defaults. After thinking about it again, I support restricting the activation functions to those that don't require any parameters (until parameters can eventually be passed through). This prevents brulee users from naively picking a parameterized activation function and then getting frustrated when they cannot change its parameters.

topepo commented 10 months ago

Agreed. I'll go with functions with non-learnable arguments.
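As a rough way to check which candidates qualify (a sketch, not the PR's actual implementation), one can confirm that each activation's functional form runs with nothing but the input tensor, i.e. every other argument has a usable default:

```r
library(torch)

# Hypothetical screening sketch: each nnf_* function listed here should run
# when given only the input tensor, meaning all other arguments have defaults.
candidates <- c("relu", "elu", "gelu", "celu", "selu", "softplus")
x <- torch_randn(3)

ok <- vapply(candidates, function(nm) {
  f <- get(paste0("nnf_", nm), envir = asNamespace("torch"))
  !inherits(try(f(x), silent = TRUE), "try-error")
}, logical(1))
ok
```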

github-actions[bot] commented 10 months ago

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.