yonatankarni opened 1 year ago
@yonatankarni could you add a short description of what the contents of this PR are (i.e. an overview of sorts, as there are many changes)?
@SkBlaz Done (not sure why I can't simply reply, had to edit the comment :| )
@yonatankarni the deep branch was already merged to main, so please open a PR that merges straight to main.
@adischw thanks for the heads up (not sure why I can't simply reply, had to edit the comment :| ). No need to open a new PR — I simply updated this one by changing the base and force-pushing the updated revision.
done (replying again, this time hopefully the right way)
@yonatankarni it seems there is an issue with /FW/src/block_relu.rs
(not sure if you meant we have a look after re-opening, if not please ignore this)
@SkBlaz yes, this is due to a merge with new incoming changes from main, I will fix it shortly.
As implemented in the "deep" branch, the deep layers can use either ReLU activation or none (no activation function). Conveniently, the activation function type ("relu"/"none") is already governed by a command-line argument; for instance, for a 3rd layer with width 25 and ReLU activation we add the command-line args: `--nn 2:width:025 --nn 2:activation:relu`
In this PR I add three more activation functions for the deep layers, which can be controlled in the same manner: "leaky_relu", "tanh", "sigmoid".
Now that we have four activation functions, it seems to me we can do better in terms of code reuse and eliminating repetition between them, but I'm not sure which approach to take — so if you have concrete suggestions, this is a good time and place to bring them up.
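One possible direction for the code-reuse question above: collapse the per-activation blocks into a single enum with dispatch in one place. This is only a minimal sketch, not the actual FW code — the names (`Activation`, `apply`, `derivative`) and the leaky-ReLU slope of 0.01 are assumptions for illustration:

```rust
/// Hypothetical sketch: one enum covering all supported activations,
/// so parsing, the forward pass, and the backward pass each live in
/// a single `match` instead of four separate implementations.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Activation {
    None,
    Relu,
    LeakyRelu,
    Tanh,
    Sigmoid,
}

impl Activation {
    /// Parse the value of the `--nn <layer>:activation:<name>` argument.
    pub fn from_str(s: &str) -> Option<Activation> {
        match s {
            "none" => Some(Activation::None),
            "relu" => Some(Activation::Relu),
            "leaky_relu" => Some(Activation::LeakyRelu),
            "tanh" => Some(Activation::Tanh),
            "sigmoid" => Some(Activation::Sigmoid),
            _ => None,
        }
    }

    /// Element-wise forward pass.
    pub fn apply(&self, x: f32) -> f32 {
        match self {
            Activation::None => x,
            Activation::Relu => x.max(0.0),
            Activation::LeakyRelu => if x > 0.0 { x } else { 0.01 * x },
            Activation::Tanh => x.tanh(),
            Activation::Sigmoid => 1.0 / (1.0 + (-x).exp()),
        }
    }

    /// Derivative with respect to the input, for backprop.
    pub fn derivative(&self, x: f32) -> f32 {
        match self {
            Activation::None => 1.0,
            Activation::Relu => if x > 0.0 { 1.0 } else { 0.0 },
            Activation::LeakyRelu => if x > 0.0 { 1.0 } else { 0.01 },
            Activation::Tanh => 1.0 - x.tanh().powi(2),
            Activation::Sigmoid => {
                let s = 1.0 / (1.0 + (-x).exp());
                s * (1.0 - s)
            }
        }
    }
}

fn main() {
    let act = Activation::from_str("leaky_relu").unwrap();
    assert_eq!(act.apply(2.0), 2.0);
    println!("{:?}", act);
}
```

The trade-off versus the current per-block layout is that a `match` in the inner loop may cost a branch per element; if that matters for FW's hot path, the same enum could instead select a monomorphized kernel once per layer.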