Closed jamesgoulet closed 5 months ago
@lhnguyen102 sorry for the previous version... I had an issue with the CUDA compilation.
@lhnguyen102 I do not know why, but initially the Build/test step failed and the PyPI step was skipped. On my side, when I run the tests locally, they pass on both CPU and GPU.
@jamesgoulet The PyPI pipeline only runs when a release is triggered. For PRs, we do not want to trigger a release, so it is all good. As for the tests, they passed; I did not see any issue. I will review this PR this weekend.
Description
This PR contains the new formulation by Lucas Alric for the mixture-based activation functions, i.e., mixture-ReLU, -Sigmoid, and -Tanh. The new formulations are simpler and, most importantly, they remove the need for `omega_tol` to avoid numerical issues with division by zero.

Changes Made

- `activation.cpp`, `activation_fun_cpu.cpp`, `activation_fun.cu`, `activation_cuda.cu`: removed the `omega_tol` parameter.
- `MixtureSigmoid()`: the `ma = ma/2` should have been done as a separate step; as a result, the number of epochs in the LSTM example `test_lstm.py` needs to be reduced to avoid NaNs.
- `test.py`: covers all activation functions with CPU and GPU.

Note for Reviewers
You can test the new activation functions through either `test.py` or `test_lstm.py`. The mathematical formulation implemented, as well as the comparison with MC sampling, is presented in the following file: mRELU_Goulet_2022.pdf
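For intuition on why a closed-form reformulation can drop the `omega_tol` safeguard: when the output moments of an activation applied to a Gaussian input are expressed directly through the standard normal PDF/CDF, no quantity ends up in a denominator that can approach zero. Below is a rough, self-contained sketch of rectified-Gaussian (ReLU) moments in this spirit; it is an illustration only, not necessarily the exact formulation from the PDF above, and it assumes `sigma > 0`:

```python
import math


def std_normal_pdf(x: float) -> float:
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution, via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu_gaussian_moments(mu: float, sigma: float) -> tuple[float, float]:
    """Mean and variance of max(0, Z) for Z ~ N(mu, sigma^2).

    Closed-form rectified-Gaussian moments: no tolerance parameter is
    needed because nothing here divides by a probability that can vanish.
    Assumes sigma > 0 (a hypothetical illustration, not the PR's code).
    """
    alpha = mu / sigma
    cdf = std_normal_cdf(alpha)
    pdf = std_normal_pdf(alpha)
    mean = mu * cdf + sigma * pdf
    second = (mu * mu + sigma * sigma) * cdf + mu * sigma * pdf
    var = max(second - mean * mean, 0.0)  # clamp tiny negative round-off
    return mean, var


# Example: a standard normal input through the ReLU.
m, v = relu_gaussian_moments(0.0, 1.0)  # ≈ (0.3989, 0.3408)
```

Note how the `ma = ma/2`-style pitfall mentioned above is avoided here by computing `second` from the original `mu`/`sigma` before `mean` is reused, rather than overwriting an intermediate in place.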