Closed: shoutOutYangJie closed this issue 2 years ago
I want to reproduce your results using your code. I downloaded all the datasets except CC12M, and I find that the three losses vary dramatically. Is this normal?
The variation you observe between batches is normal.
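If it helps to tell batch-to-batch noise apart from a real trend, one option is to smooth the logged values before looking at them. Below is a minimal sketch, assuming each loss is logged as a plain float once per step. The function name `ema_smooth` and the `beta` value are hypothetical and not part of this repository.

```python
def ema_smooth(values, beta=0.98):
    """Return a bias-corrected exponential moving average of `values`.

    `values` is assumed to be a list of per-step loss floats
    (hypothetical logging format, not taken from this repository).
    """
    smoothed = []
    avg = 0.0
    for step, value in enumerate(values, start=1):
        # Blend the running average with the newest value.
        avg = beta * avg + (1.0 - beta) * value
        # Correct the bias toward zero in the first few steps.
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


if __name__ == "__main__":
    # Made-up noisy values, only to show the call.
    raw = [3.0, 1.0, 2.8, 0.9, 2.5, 0.8]
    for r, s in zip(raw, ema_smooth(raw)):
        print(f"raw={r:.2f} smoothed={s:.2f}")
```

Applying this to each of the three losses separately should show whether the smoothed curves still go down overall, even when the raw per-batch values jump around.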