Closed lxtGH closed 4 years ago
The performance of the baseline method deteriorates if trained longer. Secondly, the baseline method uses a batch size of 10, whereas we use a batch size of 8.
@sud0301 Hi! Why does the baseline method use a batch size of 10, while you use a batch size of 8?
@sud0301 Hi! Thanks for releasing your code. It seems your training setting is a little different from the baseline's: they train for 20k iterations, while you train for 40k.