ryanleary / mlperf-rnnt-ref

Evaluation WER is higher than training WER, even when training on the eval dataset #5

Closed. mwawrzos closed this issue 4 years ago

mwawrzos commented 4 years ago

An experiment: https://docs.google.com/spreadsheets/d/1q-pInubS69ZMxlMOd-D42W1DA-D6jKXmFr9uV-tsAro/edit#gid=0&range=15:15

Possible reasons:

  1. In the experiment, the train and eval pipelines differed slightly:
     a. the train pipeline filtered out long sentences, while the eval pipeline did not (this may have a significant influence);
     b. the train pipeline used SpecAugment, while the eval pipeline did not (this should not influence eval).
  2. A bug in eval.
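
To illustrate why reason 1a could matter, here is a minimal sketch of corpus-level WER computed with and without a duration filter. It is not the repo's eval code: the sample data, the `MAX_DURATION` threshold, and the helper names `edit_distance` and `corpus_wer` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """Corpus-level WER: total word edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    return errors / words if words else 0.0


# Hypothetical toy data: (duration in seconds, reference, hypothesis).
samples = [
    (3.0, "the cat sat", "the cat sat"),
    (4.0, "hello world", "hello world"),
    (20.0, "a much longer sentence that the model never saw",
     "a much longer sentence"),
]
MAX_DURATION = 16.0  # hypothetical threshold, not taken from the config

all_pairs = [(r, h) for _, r, h in samples]
filtered_pairs = [(r, h) for d, r, h in samples if d <= MAX_DURATION]

wer_all = corpus_wer(all_pairs)            # includes the long utterance
wer_filtered = corpus_wer(filtered_pairs)  # same filter as training
print(f"unfiltered WER: {wer_all:.3f}, filtered WER: {wer_filtered:.3f}")
```

In this toy case a single long utterance that the model never trained on accounts for all of the errors, so the unfiltered number is much worse than the filtered one.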

Config used in the experiment: https://github.com/ryanleary/mlperf-rnnt-ref/blob/4082f086ec4834886cceb927dbb1454eca44c68d/configs/rnnt.toml

ryanleary commented 4 years ago

Has anyone tracked this down yet?

mwawrzos commented 4 years ago

I reran the evaluation on the filtered dataset. WER dropped to 0.0.
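
For anyone hitting the same mismatch, a quick sanity check is to compare which utterances each pipeline actually sees. This is only a sketch: the manifest layout (`id` and `duration` fields), the `kept_ids` helper, and the 16.0 s threshold are assumptions for illustration, not the repo's actual format.

```python
from typing import Dict, Iterable, Optional, Set


def kept_ids(manifest: Iterable[Dict], max_duration: Optional[float]) -> Set[str]:
    """Return IDs of utterances that survive an optional duration filter."""
    return {
        entry["id"]
        for entry in manifest
        if max_duration is None or entry["duration"] <= max_duration
    }


# Hypothetical manifest entries.
manifest = [
    {"id": "utt-1", "duration": 3.2},
    {"id": "utt-2", "duration": 18.5},
    {"id": "utt-3", "duration": 7.9},
]

train_ids = kept_ids(manifest, max_duration=16.0)  # train pipeline filters
eval_ids = kept_ids(manifest, max_duration=None)   # eval pipeline does not

# Utterances evaluated on but never trained on.
only_in_eval = sorted(eval_ids - train_ids)
print(only_in_eval)
```

If `only_in_eval` is non-empty, the eval WER is not measured on the same data as the training WER, and the two numbers are not directly comparable.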