Fix WMT variant targets

mlcommons / algorithmic-efficiency

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.

Apache License 2.0

The internal target-setting configurations had a bug: all of the variants used post-layernorm. We reran the target-setting procedure with corrected configurations. Each updated target is the minimum of the following BLEU scores:

the previously checked-in targets
the targets determined by the full target-setting procedure on internal experiments with the corrected configs
the best eval metrics achieved with the GitHub code at the updated hyperparameter point from the target-setting procedure

These updates result in slightly easier targets than the previously checked-in targets.
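
The minimum rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, argument names, and BLEU values are hypothetical and are not taken from the repository or from the actual target-setting runs.

```python
def updated_bleu_target(previous_target, corrected_procedure_target, best_repo_eval):
    """Return the lowest of the three candidate BLEU values.

    All three argument names are hypothetical labels for the three
    candidates listed in the description above.
    """
    return min(previous_target, corrected_procedure_target, best_repo_eval)


# Made-up numbers for illustration only; these are NOT real WMT targets.
example_target = updated_bleu_target(
    previous_target=30.0,
    corrected_procedure_target=29.5,
    best_repo_eval=29.8,
)
print(example_target)
```

Because the previously checked-in target is one of the three candidates, the minimum can never exceed it, which is consistent with the updated targets being no harder than before.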

Fix WMT variant targets #756