Closed filick closed 1 year ago
:thinking: I have never tried this on PyTorch 2 before. Taking a look now.
Hi @filick, it seems that PyTorch 2 does not like it when the lr reaches exactly zero, which can happen with our current warmup scheduler design.
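To illustrate the point about the warmup schedule, here is a minimal sketch (not the actual fix in the repository): a linear warmup multiplier clamped to a small floor, so the learning rate never reaches exactly zero at step 0. The names `warmup_factor`, `min_factor`, and `base_lr` are hypothetical and chosen only for this example.

```python
def warmup_factor(step: int, warmup_steps: int, min_factor: float = 1e-3) -> float:
    """Linear warmup multiplier that never returns exactly 0.

    Hypothetical helper for illustration; not the project's scheduler.
    """
    # After warmup (or if warmup is disabled), use the full learning rate.
    if warmup_steps <= 0 or step >= warmup_steps:
        return 1.0
    # A plain linear ramp gives 0.0 at step 0; clamp it to a small floor.
    return max(min_factor, step / warmup_steps)


base_lr = 1e-3  # assumed base learning rate, for demonstration only
lrs = [base_lr * warmup_factor(s, warmup_steps=5) for s in range(7)]
for s, lr in enumerate(lrs):
    print(f"step {s}: lr = {lr:.2e}")
```

With `warmup_steps` fixed (for example via `functools.partial`), a function like this could serve as the `lr_lambda` of `torch.optim.lr_scheduler.LambdaLR`, which multiplies the optimizer's base learning rate by the returned factor.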
It should be fixed now (sorry that I accidentally closed this issue). You can `git pull` to update to the latest version, then `git submodule update --init --recursive` to also update the submodule `nr3d_lib`. Let me know if it's fixed :)
Yes, my training looks good now. Thanks @ventusff, you are fast!
Hi,
I tried to train the StreetSurf model on the Waymo-100613 scene but got a NaN loss at the first step. I downloaded the processed data pack and tried several configs under code_single/configs/waymo/streetsurf/. Nothing was modified except the file paths, but all experiments failed in the same way. Some screenshots of the logs are below. I printed the detailed loss dict, and it seems the rgb loss and mask loss are NaN.
exp using nomask_withlidar.230814.yaml:
exp using withmask_withlidar.230814.yaml:
My environment is PyTorch 2.0.1, CUDA 11.8.
Can you take a look? Thanks.