Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
MIT License

Train PPT but got Loss = nan #257

Open isunLt opened 4 months ago

isunLt commented 4 months ago

Thanks for sharing the code! We followed the README to prepare the environment and ran the command: sh scripts/train.sh -g 4 -d nuscenes -c semseg-ppt-v1m1-0-nu-sk-wa-spunet -n semseg-ppt-v1m1-0-nu-sk-wa-spunet However, the loss turns to NaN in the middle of training. Could the authors kindly help me with this problem?

Gofinge commented 4 months ago

Try again, or set amp to False. This does occur from time to time. Clipping the loss to a range might solve it, but I have not tried it.
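To make the "clip the loss within a range" idea concrete, here is a minimal sketch in plain Python. It is not taken from Pointcept: the helper name `guard_loss` and the bounds `low` and `high` are hypothetical, and the function works on a scalar loss value rather than a tensor. It skips an iteration when the loss is NaN or infinite and otherwise clamps the value to the chosen range. In a PyTorch training loop the same checks would presumably be applied to the loss tensor (for example with `torch.isfinite` and `torch.clamp`) before calling backward.

```python
import math


def guard_loss(loss_value, low=0.0, high=100.0):
    """Return the loss clipped to [low, high], or None if it is not finite.

    loss_value: scalar loss for one iteration (a Python float).
    low, high:  hypothetical bounds; sensible values depend on the task.
    """
    if not math.isfinite(loss_value):
        # NaN or +/-inf: signal the caller to skip this iteration.
        return None
    return min(max(loss_value, low), high)


# Example: two normal losses, one NaN, and one outlier.
losses = [2.3, 1.7, float("nan"), 250.0]
print([guard_loss(v) for v in losses])  # [2.3, 1.7, None, 100.0]
```

Note that clamping a loss tensor removes the gradient for any sample outside the range, so this may hide an instability rather than fix it; whether it helps with the NaN reported here is untested.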

isunLt commented 4 months ago

Thank you for your timely response. I have tried twice, but both runs ended with loss = nan during training. I will try setting amp to False.

isunLt commented 2 months ago

I have tried semseg-ppt-v1m2-0-nu-sk-wa-spunet to train the network with the decoupled segmentation head. Training runs through successfully, but testing fails, as shown in the figure. Can you help me with this problem so that I can run the testing procedure successfully?
