mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

Improve memory usage of RNN-T encoder StackTime module #509

Open calvinmccarter-at-lightmatter opened 2 years ago

calvinmccarter-at-lightmatter commented 2 years ago

The existing implementation of the RNN-T encoder's StackTime module is slow and memory-inefficient. I have also opened a PR with the same fix against the MLCommons inference repo: https://github.com/mlcommons/inference/pull/1015
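
For readers unfamiliar with time stacking, here is a minimal sketch of why one formulation can use more memory than another. It is not the code from this PR or from the repository. It uses NumPy instead of PyTorch, assumes a `(T, B, H)` layout (time, batch, features), and all function names (`stack_time_shift_concat`, `stack_time_reshape`, `stacked_lengths`) are hypothetical. The first variant builds `factor` full-size shifted copies and then discards most rows; the second pads the time axis by at most `factor - 1` frames and reshapes.

```python
import numpy as np


def stack_time_shift_concat(x, factor):
    """Illustrative shift-and-concatenate variant (assumed, not the repo's code).

    Allocates `factor` arrays the size of x, concatenates them along the
    feature axis, then keeps only every `factor`-th time step.
    """
    shifted = [x]
    for i in range(1, factor):
        tmp = np.zeros_like(x)
        tmp[:-i] = x[i:]  # frame t of tmp holds frame t + i of x
        shifted.append(tmp)
    return np.concatenate(shifted, axis=2)[::factor]


def stack_time_reshape(x, factor):
    """Illustrative pad-and-reshape variant (assumed, not the PR's code).

    Pads the time axis up to a multiple of `factor`, then folds each group
    of `factor` consecutive frames into the feature axis.
    """
    T, B, H = x.shape
    pad = (-T) % factor  # number of zero frames needed, in [0, factor - 1]
    r = np.transpose(x, (1, 0, 2))  # (B, T, H)
    r = np.concatenate([r, np.zeros((B, pad, H), dtype=x.dtype)], axis=1)
    r = r.reshape(B, (T + pad) // factor, H * factor)
    return np.transpose(r, (1, 0, 2))  # (ceil(T / factor), B, H * factor)


def stacked_lengths(lengths, factor):
    """Per-sequence lengths after stacking: ceil(length / factor)."""
    lengths = np.asarray(lengths)
    return -(-lengths // factor)


# Small usage example with an odd number of time steps.
x = np.arange(7 * 2 * 3, dtype=np.float32).reshape(7, 2, 3)
a = stack_time_shift_concat(x, factor=2)
b = stack_time_reshape(x, factor=2)
```

Under these assumptions both functions return the same array, so the difference is only in the intermediate allocations: roughly `factor` times the input size for the first variant versus the input size plus a few padded frames for the second.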

github-actions[bot] commented 2 years ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

johntran-nv commented 1 year ago

@mwawrzos could you please review?