Closed — Cydral closed this 1 month ago
This is not good enough?
visit_computational_layers(net, [](dropout_& l){ l = dropout_(0.1); });
No, because it doesn't allow you to precisely identify the layer to be modified. In an LLM-type network (and this would also be true for a convolution-based network for specific image-processing tasks), it may be necessary to add layers whose outputs are filtered by dropout at different rates.
However, if we don't want to break the interface while making the layer more flexible, we can also add a template parameter to the dropout_ layer, keep the dropout instantiation with a default rate of 0.5, and add a dropout_c (c for custom) that specifies the rate when defining the layer, as sketched below.
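For concreteness, here is a minimal sketch of that idea against dlib's existing dropout_ interface. The name dropout_c_, the integer-percent encoding, and the dropout_c10 alias are placeholders for this comment, not a final API:

```cpp
#include <dlib/dnn.h>

namespace dlib
{
    // Sketch: a dropout layer whose rate is fixed at compile time as an
    // integer percentage, so it can appear directly in a network type.
    // (An int parameter is used because floating-point template
    // parameters are not portable before C++20.)
    template <int DROP_RATE_PERCENT>
    class dropout_c_ : public dropout_
    {
        static_assert(DROP_RATE_PERCENT >= 0 && DROP_RATE_PERCENT <= 100,
                      "dropout rate must be between 0% and 100%");
    public:
        dropout_c_() : dropout_(DROP_RATE_PERCENT / 100.0f) {}
    };

    // The usual alias pattern dlib layers follow, shown here for a 10% rate.
    template <typename SUBNET>
    using dropout_c10 = add_layer<dropout_c_<10>, SUBNET>;
}
```

The existing dropout keeps its 0.5 default, so nothing in the current interface breaks; networks that need a specific rate simply name it in the type.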
Yeah this is cool the way it is. How about a different name for it though, maybe dropout_at_rate or dropout_rate? IDK, what do you guys think?
"dropout_rate" seems pretty good. I'll rename the new class accordingly if that's OK with you. Note: the update is normally already visible in the branch to be merged.
Thanks, this is great :D
This PR introduces a new customizable dropout layer, dropout_custom_, which allows specifying the dropout rate at compile time. This enhancement is particularly beneficial for deep neural networks with numerous layers, where manually setting different dropout rates for each layer can be cumbersome.

Key features and benefits:
- dropout_ class: Maintains all functionality of the original dropout layer.
- dropout_10 alias: Offers a convenient 10% dropout option for common use cases.
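To show how this reads in practice, here is a hedged usage sketch: a toy classifier mixing the original 50% dropout with the 10% alias described above. The dropout_10 alias follows this PR's description; the fc/relu/input layers are standard dlib layers, and the network itself is illustrative rather than taken from the PR:

```cpp
#include <dlib/dnn.h>
#include <iostream>

using namespace dlib;

// Illustrative network: the classic dropout layer keeps its 0.5 default
// rate, while dropout_10 applies the compile-time 10% rate from this PR.
using toy_net = loss_multiclass_log<
    fc<10,
    dropout_10<              // new alias: 10% dropout
    relu<fc<128,
    dropout<                 // original layer: default 50% dropout
    relu<fc<256,
    input<matrix<float>>
    >>>>>>>>;

int main()
{
    toy_net net;
    std::cout << net;  // print the architecture, including both rates
    return 0;
}
```

Because the rate is part of the type, each layer's dropout rate is visible in the network definition itself, with no need to walk the network and patch rates after construction.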