Closed by Vincent131499 2 years ago
Thanks for asking :-)
I trained the model for 40 epochs and got a BLEU score of 29.5 (on WMT'14 En-De). I didn't finish the training because of the high cost, so I don't know whether the performance would improve with longer training (I suspect probably not, unless you train it for a really, really long time).
More details will be released shortly (featuring a new plug-and-play Admin implementation), so stay tuned!
The article mentions that 8 A100 GPUs were used to train the model. How long was it trained, and how many epochs did it reach? What is the final model's performance (BLEU score)?