lhnguyen102 / cuTAGI

CUDA implementation of Tractable Approximate Gaussian Inference
MIT License

New formulation for mixture-based activation functions #59

Closed jamesgoulet closed 5 months ago

jamesgoulet commented 6 months ago

Description

This PR contains the new formulation by Lucas Alric for the mixture-based activation functions, i.e., mixture-ReLU, mixture-sigmoid, and mixture-tanh. The new formulations are simpler and, most importantly, they remove the need for omega_tol, which was used to avoid numerical issues with division by zero.
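
For intuition, here is a minimal, hypothetical NumPy sketch of the analytic moments of a ReLU applied to a Gaussian input, which is the core computation behind a mixture-ReLU activation. The function name and the exact form are illustrative assumptions only; the formulation actually implemented in this PR is the one given in the PDF attached below.

```python
import numpy as np
from scipy.stats import norm


def mixture_relu_moments(mu, var):
    """Analytic mean and variance of ReLU(X) for X ~ N(mu, var).

    Hypothetical sketch of the rectified-Gaussian moments; the
    formulation merged in this PR (see mRELU_Goulet_2022.pdf) is
    the authoritative reference.
    """
    std = np.sqrt(var)
    alpha = mu / std  # a real implementation must handle std == 0
    cdf, pdf = norm.cdf(alpha), norm.pdf(alpha)
    # E[ReLU(X)] = mu * Phi(alpha) + std * phi(alpha)
    mean_out = mu * cdf + std * pdf
    # E[ReLU(X)^2] = (mu^2 + var) * Phi(alpha) + mu * std * phi(alpha)
    second_moment = (mu**2 + var) * cdf + mu * std * pdf
    var_out = np.maximum(second_moment - mean_out**2, 0.0)
    return mean_out, var_out
```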

Changes Made

Note for Reviewers

You can test the new activation functions through either test.py or test_lstm.py.

The mathematical formulation implemented, as well as its comparison with MC sampling, is presented in the following file: mRELU_Goulet_2022.pdf
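
As a rough illustration of what such an MC comparison can look like (not the actual validation in the PDF), here is a self-contained sketch that checks the rectified-Gaussian moments against Monte Carlo sampling; the input values and names are assumptions:

```python
import numpy as np
from scipy.stats import norm

mu, var = 0.5, 2.0  # arbitrary example input moments
std = np.sqrt(var)

# Analytic moments of ReLU(X) for X ~ N(mu, var)
alpha = mu / std
mean_an = mu * norm.cdf(alpha) + std * norm.pdf(alpha)
var_an = (mu**2 + var) * norm.cdf(alpha) + mu * std * norm.pdf(alpha) - mean_an**2

# Monte Carlo estimate for comparison
rng = np.random.default_rng(0)
samples = np.maximum(rng.normal(mu, std, size=1_000_000), 0.0)
print(f"mean: MC={samples.mean():.4f}  analytic={mean_an:.4f}")
print(f"var:  MC={samples.var():.4f}  analytic={var_an:.4f}")
```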

(Screenshot attached: 2024-04-07, 16:49:53)
jamesgoulet commented 6 months ago

@lhnguyen102 sorry for the previous version... I had an issue with the CUDA compilation.

jamesgoulet commented 6 months ago

@lhnguyen102 I do not know why, but initially the Build/test failed... and the PyPI step was skipped. On my side, when I run the tests locally, they pass on both CPU and GPU.

lhnguyen102 commented 6 months ago

@jamesgoulet The PyPI pipeline only runs when a release is triggered. For PRs, we don't want to trigger a release, so it is all good. As for the tests, they passed; I did not see any issue. I will review this PR this weekend.