davisking / dlib

A toolkit for making real world machine learning and data analysis applications in C++
http://dlib.net
Boost Software License 1.0

Add customizable dropout layer with compile-time rate specification #3000

Closed Cydral closed 1 month ago

Cydral commented 1 month ago

This PR introduces a new customizable dropout layer, dropout_custom_, which allows specifying the dropout rate at compile-time. This enhancement is particularly beneficial for deep neural networks with numerous layers, where manually setting different dropout rates for each layer can be cumbersome.

Key features and benefits:

  1. Compile-time dropout rate specification: Allows for clearer and more concise network definitions.
  2. Inherits from the existing dropout_ class: Maintains all functionality of the original dropout layer.
  3. Template-based implementation: Provides type-safety and potential performance benefits.
  4. Includes a pre-defined dropout_10 alias: Offers a convenient 10% dropout option for common use cases.
arrufat commented 1 month ago

Is this not good enough?

visit_computational_layers(net, [](dropout_& l){ l = dropout_(0.1); });
Cydral commented 1 month ago

No, because that doesn't let you target a specific layer precisely. In an LLM-style network (and equally in a convolution-based network for specific image-processing tasks), you may need to add layers whose outputs are filtered by dropout at different rates.

Cydral commented 1 month ago

However, if we want to keep the existing interface intact while making the layer more flexible, we could instead add a template parameter to the dropout_ layer itself, keep dropout instantiable with a default rate of 0.5, and add a dropout_c (c for custom) for specifying the rate when defining the layer.

davisking commented 1 month ago

Yeah this is cool the way it is. How about a different name for it though, maybe dropout_at_rate or dropout_rate? IDK, what do you guys think?

Cydral commented 1 month ago

"dropout_rate" seems pretty good. I'll rename the new class accordingly if that's OK with you. Note: the update is normally already visible in the branch to be merged.

davisking commented 1 month ago

Thanks, this is great :D