This PR contains the new formulation by Lucas Alric for the mixture-based activation functions, i.e. mixture-ReLU, mixture-Sigmoid, and mixture-Tanh. The new formulations are simpler and, most importantly, they remove the need for omega_tol to guard against numerical issues from division by zero.
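For context (this is background, not the PR's exact derivation, which is in mRELU_Goulet_2022.pdf): the mixture-ReLU output moments for a Gaussian input follow the standard rectified-Gaussian results, which can be written without dividing by any probability mass. A minimal Python sketch, cross-checked against MC sampling; the function name is mine, not cuTAGI's:

```python
import math
import random

def relu_gaussian_moments(mu, sigma):
    """Closed-form mean/variance of y = max(0, z) for z ~ N(mu, sigma^2).

    Standard rectified-Gaussian moments. Note that no term requires
    dividing by a (possibly tiny) probability mass, which is the kind
    of expression that previously needed an omega_tol guard.
    """
    alpha = mu / sigma
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)  # N(alpha; 0, 1)
    cdf = 0.5 * (1.0 + math.erf(alpha / math.sqrt(2.0)))             # Phi(alpha)
    mean = mu * cdf + sigma * pdf
    second_moment = (mu * mu + sigma * sigma) * cdf + mu * sigma * pdf
    return mean, second_moment - mean * mean

# Monte Carlo comparison, in the spirit of the MC check in the PDF
random.seed(0)
mu, sigma = 0.3, 1.2
mean, var = relu_gaussian_moments(mu, sigma)
n = 200_000
samples = [max(0.0, random.gauss(mu, sigma)) for _ in range(n)]
mc_mean = sum(samples) / n
mc_var = sum((s - mc_mean) ** 2 for s in samples) / n
print(abs(mean - mc_mean), abs(var - mc_var))  # both should be small
```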
Changes Made
I have completed the implementation and tested the new formulation for all the mixture-based activation functions in all four files: activation.cpp, activation_fun_cpu.cpp, activation_fun.cu, and activation_cuda.cu.
I have removed the omega_tol parameter.
I have found and corrected a bug in the existing MixtureSigmoid(), where ma = ma/2 should have been done as a separate step. As a result, the number of epochs in the LSTM example test_lstm.py needs to be reduced to avoid NaNs.
I have updated the unit tests, all of which were minimally affected by the new formulation.
I have tested all activation functions on both CPU and GPU using test.py.
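The MixtureSigmoid() ordering bug described above belongs to a common class. The sketch below is purely illustrative and not the actual cuTAGI code: if ma is halved in place before a later expression that still expects the unscaled mean, that expression silently uses the wrong value, which is why the halving must be a separate, final step.

```python
def moments_fused_halving(m2, ma):
    """Buggy ordering (illustrative): halving folded in too early."""
    ma = ma / 2.0          # in-place rescale happens first...
    Sa = m2 - ma * ma      # ...so the variance uses the halved mean
    return ma, Sa

def moments_separate_halving(m2, ma):
    """Fixed ordering (illustrative): variance first, halving last."""
    Sa = m2 - ma * ma      # variance formed from the original mean
    ma = ma / 2.0          # ma = ma/2 applied as a separate step
    return ma, Sa

print(moments_fused_halving(1.0, 0.6))     # variance inflated (~0.91)
print(moments_separate_halving(1.0, 0.6))  # correct variance (~0.64)
```

An inflated output variance of this kind compounds over layers and time steps, which is consistent with the NaNs appearing only after many LSTM epochs.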
Note for Reviewers
You can test the new activation functions through either test.py or test_lstm.py.
The mathematical formulation implemented, as well as the comparison with MC sampling, is presented in the following file:
mRELU_Goulet_2022.pdf