[BERT] fixes and gradient accumulation for TF1

mlcommons / training

Reference implementations of MLPerf™ training benchmarks

https://mlcommons.org/en/groups/training

Apache License 2.0

1.59k stars 553 forks

Status: Closed (closed by sgpyc 3 years ago)

sgpyc commented 3 years ago

Includes gradient accumulation (GA) and patches to the TF1 model. Pending merge of the README update (https://github.com/sgpyc/training/pull/1) from aarti-cerebras@.
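
For context, gradient accumulation generally means summing gradients over several micro-batches and applying one optimizer update with their average. The sketch below is only an illustration of that idea in plain Python, not the TF1 code in this change; `accumulate_and_step`, `mse_grad`, and the toy data are hypothetical names and values chosen for the example.

```python
def accumulate_and_step(weight, micro_batches, grad_fn, lr):
    """Apply one SGD update using gradients averaged over micro-batches."""
    accum = 0.0
    for batch in micro_batches:
        # Sum gradients; the weight is not updated inside the loop.
        accum += grad_fn(weight, batch)
    avg_grad = accum / len(micro_batches)
    # Single update with the averaged gradient.
    return weight - lr * avg_grad


def mse_grad(w, batch):
    """Gradient of mean((w * x - y) ** 2) with respect to w (toy model)."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


# Hypothetical toy data: y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
micro = [data[:2], data[2:]]  # two equal-size micro-batches

w_accum = accumulate_and_step(0.0, micro, mse_grad, lr=0.01)
w_full = accumulate_and_step(0.0, [data], mse_grad, lr=0.01)
```

With equal-size micro-batches, the accumulated update matches the update computed on the full batch, which is why the technique is commonly used to emulate a large batch when memory is limited.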

github-actions[bot] commented 3 years ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

johntran-nv commented 3 years ago

@sgpyc, is this the most up-to-date code for BERT?