outbrain / fwumious_wabbit

Fwumious Wabbit, fast on-line machine learning toolkit written in Rust

Activation functions #85

Open yonatankarni opened 1 year ago

yonatankarni commented 1 year ago

As implemented in the "deep" branch, the deep layers can use either ReLU activation or none (no activation function). Conveniently, the activation function type ("relu"/"none") is already controlled by a command-line argument. For instance, for a 3rd layer with width 25 and ReLU activation we add the command-line args: --nn 2:width:025 --nn 2:activation:relu
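
For readers unfamiliar with the flag format, here is a minimal sketch of how a `<layer>:<key>:<value>` spec like the ones above could be split into fields. This is not the parser used in Fwumious Wabbit; `NnArg` and `parse_nn_arg` are hypothetical names introduced only for illustration, and the zero-based layer index is inferred from the example (index 2 for the 3rd layer).

```rust
/// Hypothetical parsed form of one `--nn <layer>:<key>:<value>` argument.
#[derive(Debug, PartialEq)]
pub struct NnArg {
    /// Zero-based layer index (2 means the 3rd layer in the example above).
    pub layer: usize,
    /// Setting name, e.g. "width" or "activation".
    pub key: String,
    /// Raw setting value, e.g. "025" or "relu".
    pub value: String,
}

/// Splits a spec such as "2:activation:relu" into its three fields.
/// Returns an error string if the shape or the layer index is invalid.
pub fn parse_nn_arg(spec: &str) -> Result<NnArg, String> {
    let parts: Vec<&str> = spec.splitn(3, ':').collect();
    if parts.len() != 3 {
        return Err(format!("expected <layer>:<key>:<value>, got {:?}", spec));
    }
    let layer = parts[0]
        .parse::<usize>()
        .map_err(|e| format!("bad layer index {:?}: {}", parts[0], e))?;
    Ok(NnArg {
        layer,
        key: parts[1].to_string(),
        value: parts[2].to_string(),
    })
}
```

Under these assumptions, `parse_nn_arg("2:activation:relu")` yields layer 2 with key "activation" and value "relu", and a zero-padded width such as "025" still parses to 25 with `str::parse::<usize>`.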

In this PR I add more activation functions for the deep layers, which can be selected in the same manner: "leaky_relu", "tanh", "sigmoid".
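
As a reference for what these options compute, the sketch below gives the standard scalar definitions of each function and its derivative. It is not the code from this PR: the `Activation` enum, its method names, and the leaky-ReLU slope `LEAKY_ALPHA = 0.01` are assumptions made for illustration, and the PR may use a different slope and a vectorized layout.

```rust
/// Activation kinds named in this PR; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Activation {
    None,
    Relu,
    LeakyRelu,
    Tanh,
    Sigmoid,
}

/// Assumed slope for negative inputs in leaky ReLU (not taken from the PR).
pub const LEAKY_ALPHA: f32 = 0.01;

impl Activation {
    /// Maps the command-line string to a variant, or None if unknown.
    pub fn from_name(name: &str) -> Option<Activation> {
        match name {
            "none" => Some(Activation::None),
            "relu" => Some(Activation::Relu),
            "leaky_relu" => Some(Activation::LeakyRelu),
            "tanh" => Some(Activation::Tanh),
            "sigmoid" => Some(Activation::Sigmoid),
            _ => None,
        }
    }

    /// Forward pass: returns f(x).
    pub fn forward(self, x: f32) -> f32 {
        match self {
            Activation::None => x,
            Activation::Relu => x.max(0.0),
            Activation::LeakyRelu => {
                if x > 0.0 { x } else { LEAKY_ALPHA * x }
            }
            Activation::Tanh => x.tanh(),
            Activation::Sigmoid => 1.0 / (1.0 + (-x).exp()),
        }
    }

    /// Derivative f'(x), used to scale the gradient in the backward pass.
    pub fn derivative(self, x: f32) -> f32 {
        match self {
            Activation::None => 1.0,
            Activation::Relu => {
                if x > 0.0 { 1.0 } else { 0.0 }
            }
            Activation::LeakyRelu => {
                if x > 0.0 { 1.0 } else { LEAKY_ALPHA }
            }
            Activation::Tanh => {
                let y = x.tanh();
                1.0 - y * y
            }
            Activation::Sigmoid => {
                let y = self.forward(x);
                y * (1.0 - y)
            }
        }
    }
}
```

The derivative at exactly x = 0 for ReLU and leaky ReLU is a convention; this sketch picks the negative-side value.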

Now that we have 4 activation functions, it seems to me we can do better in terms of code reuse and eliminating repetition between them. I'm not sure which approach to take, so if you have concrete suggestions, this is a good time and place to bring them up.
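
One possible direction, sketched here only to make the question concrete: put the per-function math behind a small trait and write the block logic once, generic over that trait. Everything in this sketch (`ActivationFn`, `ActivationBlock`, the method names) is hypothetical and does not mirror the block interface in the repository. It relies on the fact that for ReLU, leaky ReLU, tanh, and sigmoid the derivative can be computed from the stored output alone.

```rust
use std::marker::PhantomData;

/// Hypothetical trait: the only code that differs between activations.
pub trait ActivationFn {
    /// Returns y = f(x).
    fn forward(x: f32) -> f32;
    /// Returns f'(x) computed from the output y = f(x).
    fn derivative_from_output(y: f32) -> f32;
}

pub struct Relu;
pub struct Tanh;

impl ActivationFn for Relu {
    fn forward(x: f32) -> f32 {
        x.max(0.0)
    }
    fn derivative_from_output(y: f32) -> f32 {
        if y > 0.0 { 1.0 } else { 0.0 }
    }
}

impl ActivationFn for Tanh {
    fn forward(x: f32) -> f32 {
        x.tanh()
    }
    fn derivative_from_output(y: f32) -> f32 {
        1.0 - y * y
    }
}

/// One generic block stands in for a hand-written block per activation.
pub struct ActivationBlock<A: ActivationFn> {
    /// Outputs of the last forward pass, kept for the backward pass.
    outputs: Vec<f32>,
    _kind: PhantomData<A>,
}

impl<A: ActivationFn> ActivationBlock<A> {
    pub fn new() -> Self {
        ActivationBlock { outputs: Vec::new(), _kind: PhantomData }
    }

    /// Applies the activation element-wise and caches the outputs.
    pub fn forward(&mut self, inputs: &[f32]) -> &[f32] {
        self.outputs.clear();
        self.outputs.extend(inputs.iter().map(|&x| A::forward(x)));
        &self.outputs
    }

    /// Scales each upstream gradient by f'(x) using the cached outputs.
    pub fn backward(&self, upstream: &[f32]) -> Vec<f32> {
        self.outputs
            .iter()
            .zip(upstream)
            .map(|(&y, &g)| g * A::derivative_from_output(y))
            .collect()
    }
}
```

With this shape, adding "leaky_relu" or "sigmoid" would mean one more small `impl ActivationFn`, while the forward and backward loops stay shared. Static dispatch through the type parameter avoids a per-element branch, at the cost of one monomorphized block per activation.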

SkBlaz commented 1 year ago

@yonatankarni could you add a short description of what the contents of this PR are (i.e. an overview of sorts, as there are many changes)?

@SkBlaz Done (not sure why I can't simply reply, had to edit the comment :| )

yonatankarni commented 1 year ago

@yonatankarni the deep branch was already merged to main, so please open a PR that merges straight to main.

@adischw thanks for the heads up (not sure why I can't simply reply, had to edit the comment :| ). No need to open a new PR: I simply updated this one by changing the base and force-pushing the updated revision.

done (replying again, this time hopefully the right way)

SkBlaz commented 1 year ago

@yonatankarni it seems there is an issue with /FW/src/block_relu.rs (not sure if you meant we have a look after re-opening, if not please ignore this)

yonatankarni commented 1 year ago

@SkBlaz yes, this is due to a merge with new incoming changes from main, I will fix it shortly.