Open xunmeibuyue opened 8 months ago
Hello, thanks for the nice work!
I have a question about the train loop:
https://github.com/NVlabs/I2SB/blob/1ffdfaaf05495ef883ece2c1fe991b3049f814cc/i2sb/runner.py#L163-L189
As I understand it, the train loop uses gradient accumulation, where the loss is usually normalized by the number of accumulation steps, i.e.,
loss = loss / n_inner_loop
However, this normalization does not appear in the inner loop:
https://github.com/NVlabs/I2SB/blob/1ffdfaaf05495ef883ece2c1fe991b3049f814cc/i2sb/runner.py#L185-L186
So are there any reasons for not normalizing the loss?
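To illustrate what I mean, here is a minimal sketch (plain Python, no torch, with hypothetical per-micro-batch gradient values) of why the normalization matters: gradients add up across the inner loop, so without the division the effective update is n_inner_loop times larger, which is equivalent to scaling the learning rate.

```python
# Hypothetical per-micro-batch gradients for one accumulation window.
n_inner_loop = 4
per_batch_grads = [1.0, 2.0, 3.0, 4.0]

# Without normalization (as in the referenced loop), the accumulated
# gradient is the SUM over the inner loop:
grad_sum = sum(per_batch_grads)

# With the usual normalization loss = loss / n_inner_loop, the
# accumulated gradient is the MEAN, matching a single large batch:
grad_mean = sum(g / n_inner_loop for g in per_batch_grads)

print(grad_sum)   # 10.0
print(grad_mean)  # 2.5 == grad_sum / n_inner_loop
```

So if the sum is used intentionally, it effectively multiplies the learning rate by n_inner_loop rather than changing the optimum.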
+1