Open Xabab opened 12 months ago
If you have nothing better to do than watch the loss graph, it would be neat to see the current gradient accumulation step, so you know when the displayed loss will next be updated.
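To make the request concrete, here is a rough sketch of the kind of readout I have in mind. It assumes the trainer keeps a 0-indexed count of micro-batches processed so far and a fixed number of accumulation steps; the names `accumulation_status`, `micro_step` and `accum_steps` are made up for illustration and are not taken from this project's code.

```python
def accumulation_status(micro_step: int, accum_steps: int) -> str:
    """Format a progress label for gradient accumulation.

    micro_step  -- hypothetical 0-indexed count of micro-batches processed so far
    accum_steps -- hypothetical number of micro-batches per optimizer step
    """
    if accum_steps < 1:
        raise ValueError("accum_steps must be at least 1")
    # Position inside the current accumulation window, shown 1-indexed.
    current = micro_step % accum_steps + 1
    # Number of optimizer steps (and loss updates) completed so far.
    optimizer_step = micro_step // accum_steps
    # Micro-batches still to go after this one before the loss is refreshed.
    remaining = accum_steps - current
    return (
        f"step {optimizer_step} | accum {current}/{accum_steps} "
        f"| next loss update in {remaining}"
    )


if __name__ == "__main__":
    # Walk through two accumulation windows of 4 micro-batches each.
    for i in range(8):
        print(accumulation_status(i, 4))
```

Something like `accum 3/4` next to the step counter would already be enough to tell how long until the graph moves.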