lhnguyen102 / cuTAGI

CUDA implementation of Tractable Approximate Gaussian Inference
MIT License

Issue with LayerNorm & CUDA #60

Closed jamesgoulet closed 2 months ago

jamesgoulet commented 3 months ago

I experimented with your release_draft classification examples and I think there is an issue with the LayerNorm layer on CUDA.

In the test I ran, capping the updates seems to work fine across architectures on both CPU and GPU. The issue seems to originate from the combination of LayerNorm and GPU. Whether you run FNN_LAYERNORM or CNN_LAYERNORM, both architectures work well on CPU regardless of the batch size. However, both fail on CUDA regardless of the batch size. This is what leads me to think that the issue is with LayerNorm & CUDA rather than with capping the updates.
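One way to narrow down a CPU-versus-CUDA discrepancy like this is to feed the same input through the layer on both devices and compare the outputs against a small reference. The sketch below is only an illustration in plain NumPy: it implements the standard deterministic LayerNorm forward pass, not cuTAGI's layer (which also propagates variances), and the names `layernorm_reference` and `max_abs_diff` are hypothetical helpers, not part of the library.

```python
import numpy as np


def layernorm_reference(x, gamma, beta, eps=1e-5):
    """Standard LayerNorm forward pass (assumption: deterministic version,
    not the TAGI layer). Each sample is normalized over its feature axis."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two arrays,
    e.g. a layer output dumped from the CPU run and from the CUDA run."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))


# Fixed seed so the same input can be reused for both device runs.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # batch of 4 samples, 8 features each
gamma = np.ones(8)            # scale parameter
beta = np.zeros(8)            # shift parameter

y = layernorm_reference(x, gamma, beta)
```

If the CPU output matches a reference of this kind within floating-point tolerance but the CUDA output does not, and the result does not depend on the batch size, that would point toward the CUDA kernel rather than the update capping.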