NVlabs / I2SB


A question about loss in the train loop #9

Open · xunmeibuyue opened this issue 8 months ago

xunmeibuyue commented 8 months ago

Hello, thanks for the nice work!

I have a question about the train loop:

https://github.com/NVlabs/I2SB/blob/1ffdfaaf05495ef883ece2c1fe991b3049f814cc/i2sb/runner.py#L163-L189

As I understand it, the training loop uses gradient accumulation, where each micro-batch loss is usually normalized by the number of accumulation steps. But it seems the loss is not normalized (i.e., loss = loss / n_inner_loop) in the inner loop:

https://github.com/NVlabs/I2SB/blob/1ffdfaaf05495ef883ece2c1fe991b3049f814cc/i2sb/runner.py#L185-L186

So are there any reasons for not normalizing the loss?
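
For comparison, here is a minimal, self-contained sketch of how I would expect the normalization to look. The model, optimizer, and random tensors below are placeholders, not the actual objects in runner.py:

```python
import torch

# Placeholder model, optimizer, and data standing in for the
# corresponding objects in i2sb/runner.py.
net = torch.nn.Linear(8, 8)
optimizer = torch.optim.SGD(net.parameters(), lr=1e-2)
n_inner_loop = 4  # micro-batches accumulated per optimizer step

optimizer.zero_grad()
for _ in range(n_inner_loop):
    x0 = torch.randn(16, 8)  # placeholder target batch
    x1 = torch.randn(16, 8)  # placeholder input batch
    pred = net(x1)
    loss = torch.nn.functional.mse_loss(pred, x0)
    # Dividing by n_inner_loop makes the accumulated gradient equal to
    # the gradient of the mean loss over all micro-batches; without it,
    # the micro-batch gradients are summed instead of averaged.
    (loss / n_inner_loop).backward()
optimizer.step()
```

If I understand correctly, skipping the division sums the micro-batch gradients rather than averaging them, which for plain SGD amounts to scaling the effective learning rate by n_inner_loop.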

yuanzhi-zhu commented 7 months ago

+1