Closed: milancurcic closed this issue 1 year ago
Hi! What do you think about an abstract-class-based activation function implementation?
We can define an abstract class containing only a deferred function `eval`:
```fortran
type, abstract :: activation_function_t
contains
  procedure(eval_i), deferred :: eval
end type activation_function_t

abstract interface
  pure function eval_i(this, x) result(res)
    import :: activation_function_t
    class(activation_function_t), intent(in) :: this
    real, intent(in) :: x(:)
    real :: res(size(x))
  end function eval_i
end interface
```
Then, by extending the `activation_function_t` class, concrete activation functions can be defined, with function parameters simply being components of the new type:
```fortran
type, extends(activation_function_t) :: elu_function_t
  real :: alpha
contains
  procedure :: eval => eval_elu
end type elu_function_t

contains

  pure function eval_elu(this, x) result(res)
    ! Exponential Linear Unit (ELU) activation function.
    class(elu_function_t), intent(in) :: this
    real, intent(in) :: x(:)
    real :: res(size(x))
    where (x >= 0)
      res = x
    elsewhere
      res = this%alpha * (exp(x) - 1)
    end where
  end function eval_elu
```
We can then initialize an instance of this type and pass it as an argument, or use it as a component of other types:
```fortran
class(activation_function_t), allocatable :: activation_function
allocate(activation_function, source=elu_function_t(alpha=0.3))
```
Moreover, we can add an `eval_prime` procedure to `activation_function_t`, so that a single object provides both the function values and their derivatives.
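For example (a sketch; the binding and function names here are illustrative, not from the discussion), the derivative can reuse the same abstract interface, since it also maps `x(:)` to `res(size(x))`:

```fortran
type, abstract :: activation_function_t
contains
  procedure(eval_i), deferred :: eval
  procedure(eval_i), deferred :: eval_prime  ! derivative; same interface as eval
end type activation_function_t
```

A concrete type would then bind `eval_prime => eval_elu_prime`, where for the ELU the derivative is 1 for `x >= 0` and `alpha * exp(x)` otherwise:

```fortran
  pure function eval_elu_prime(this, x) result(res)
    ! Derivative of the ELU activation function.
    class(elu_function_t), intent(in) :: this
    real, intent(in) :: x(:)
    real :: res(size(x))
    where (x >= 0)
      res = 1
    elsewhere
      res = this%alpha * exp(x)
    end where
  end function eval_elu_prime
```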
Thanks @ggoyman, I believe inference-engine takes a similar approach.
In a nutshell, it seems to me that an abstract class approach allows the activation-specific parameters to be carried with the concrete activation type itself, rather than the layer type. I like that.
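For instance (sketch only; the type and component names are illustrative), a layer type would then just hold the polymorphic activation, with no activation-specific parameters of its own:

```fortran
type :: dense_layer_t
  class(activation_function_t), allocatable :: activation
end type dense_layer_t
```

A forward pass can call `self % activation % eval(x)` without knowing which concrete activation, or which parameters, are in use.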
Would you be open to contributing this as a PR? I'd help.
Yes, that's the Inference-Engine approach. We call the abstract type `activation_strategy_t` because it's an example of the Strategy design pattern.
@milancurcic, OK, I'll try to implement this solution.
Solved by #126.
Some activation functions like leaky ReLU (#123) require one or more additional parameters.
To allow passing activation functions as procedure pointers, all functions must have the same interface. A proposed general solution (thanks @jvdp1) is to:

- Define a derived type `activation_params` or similar that holds any possible extra parameters that may be needed by activation functions; set default values of the activation parameters in the type definition.
- Give every activation function a `type(activation_params), intent(in), optional :: params` dummy argument (instead of the current `alpha`). Inside the activation function definitions, functions that use one or more activation parameters access them directly; those that don't simply ignore it.
- Make `activation_params` an attribute of the `dense` and `conv2d` layers (and later any other layers that activate).
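A sketch of that proposal (illustrative names; note the abstract-type approach discussed above is what ultimately landed in #126):

```fortran
type :: activation_params
  ! Defaults for all activation parameters live in the type definition.
  real :: alpha = 0.3
end type activation_params

contains

  pure function leaky_relu(x, params) result(res)
    ! Leaky ReLU; uses params % alpha if provided, otherwise the default.
    real, intent(in) :: x(:)
    type(activation_params), intent(in), optional :: params
    real :: res(size(x))
    type(activation_params) :: p  ! local copy carrying the type's defaults
    if (present(params)) p = params
    where (x >= 0)
      res = x
    elsewhere
      res = p % alpha * x
    end where
  end function leaky_relu

  pure function relu(x, params) result(res)
    ! Plain ReLU takes no parameters and simply ignores params.
    real, intent(in) :: x(:)
    type(activation_params), intent(in), optional :: params
    real :: res(size(x))
    res = max(0., x)
  end function relu
```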