Closed by jeffdaily 3 years ago
Ran NV-BERT on 1 GPU and recorded the performance numbers.
training_sequences_per_second values:
Without the PR: run 1 - 21.918, run 2 - 21.681
With the PR: run 1 - 21.671, run 2 - 21.553
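The mean drops from about 21.80 to about 21.61 (under 1%), which is smaller than the run-to-run variation without the PR (about 1.1%). So this PR does not appear to cause a performance regression.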
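For anyone who wants to re-check the arithmetic, here is a minimal sketch that compares the two configurations. The variable names are made up for illustration, and using the baseline min-to-max spread as a noise estimate is my assumption, not something the benchmark reports.

```python
from statistics import mean

# training_sequences_per_second values copied from the runs above
without_pr = [21.918, 21.681]
with_pr = [21.671, 21.553]

baseline = mean(without_pr)
candidate = mean(with_pr)

# Relative change in mean throughput (negative means slower with the PR)
rel_change_pct = (candidate - baseline) / baseline * 100

# Run-to-run spread of the baseline, used here as a rough noise estimate
baseline_spread_pct = (max(without_pr) - min(without_pr)) / baseline * 100

print(f"mean without PR: {baseline:.3f}")
print(f"mean with PR:    {candidate:.3f}")
print(f"relative change: {rel_change_pct:.2f}%")
print(f"baseline spread: {baseline_spread_pct:.2f}%")
```

With only two runs per configuration this is a sanity check rather than a statistical test; more runs would be needed to rule out a small regression.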
This reverts commit bdd481d15da054bceecd1ea61fe9c45e148f71b6.