Open xunmeibuyue opened 8 months ago
Hello, thanks for the nice work!
I have a question about the train loop:
https://github.com/NVlabs/I2SB/blob/1ffdfaaf05495ef883ece2c1fe991b3049f814cc/i2sb/runner.py#L163-L189
As I understand it, the train loop uses gradient accumulation, where the loss is usually normalized by the number of accumulation steps, i.e.,
loss = loss / n_inner_loop
However, this normalization does not appear in the inner loop:
https://github.com/NVlabs/I2SB/blob/1ffdfaaf05495ef883ece2c1fe991b3049f814cc/i2sb/runner.py#L185-L186
So are there any reasons for not normalizing the loss?
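To illustrate what I mean, here is a minimal sketch (plain Python, no torch, with hypothetical per-micro-batch gradient values) of why the normalization matters: gradients add up across the inner loop, so without the division the effective update is n_inner_loop times larger, which is equivalent to scaling the learning rate.

```python
# Hypothetical per-micro-batch gradients for one accumulation window.
n_inner_loop = 4
per_batch_grads = [1.0, 2.0, 3.0, 4.0]

# Without normalization (as in the referenced loop), the accumulated
# gradient is the SUM over the inner loop:
grad_sum = sum(per_batch_grads)

# With the usual normalization loss = loss / n_inner_loop, the
# accumulated gradient is the MEAN, matching a single large batch:
grad_mean = sum(g / n_inner_loop for g in per_batch_grads)

print(grad_sum)   # 10.0
print(grad_mean)  # 2.5 == grad_sum / n_inner_loop
```

So if the sum is used intentionally, it effectively multiplies the learning rate by n_inner_loop rather than changing the optimum.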
+1