pqhieu / jsis3d

[CVPR'19] JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds
https://pqhieu.com/research/cvpr19/
MIT License
175 stars 36 forks

Nan Loss & NormBackward1 in discriminative.py #26

Closed czha5168 closed 2 years ago

czha5168 commented 3 years ago

Hi Quang-Hieu, Thanks for providing the code.

I managed to adapt your code and train on the S3DIS dataset successfully. However, when I switched to my own dataset, NaN losses appear within the first few iterations. Sometimes the loss is valid for 1~3 iterations of the first epoch before turning NaN; sometimes it is NaN from the very first iteration. I tried different learning rates and num_points values, with no luck. I then used "torch.autograd.set_detect_anomaly(True)" to detect abnormal gradient values in the network and received the error message below. I have spent some time on this but still have no clue. Could you please give me some pointers if possible? Thanks in advance!

Error Message

Differences between my dataset and S3DIS:

- maximum num_instances per sample: less than 10
- num_classes: less than 30
- point sampling: random sample 4096 or 8192 points (out of 10e4-10e6 points) for each sample (rather than splitting into blocks at first)
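One possible explanation, offered as an assumption rather than a confirmed diagnosis: with random sampling, an instance may end up with a single point, so its embedding equals the instance mean and the distance passed to the norm is exactly zero. The derivative of an L2 norm at zero is of the form 0/0, which can surface as NaN in a norm backward node. The sketch below is not the repository's code; `safe_norm` and the tensor names are hypothetical, and it only shows that adding a small epsilon inside the square root keeps the gradient finite in that case.

```python
import torch


def safe_norm(x, dim=-1, eps=1e-12):
    """L2 norm with an epsilon inside the square root.

    Hypothetical helper (not part of jsis3d): the epsilon keeps the
    gradient x / sqrt(sum(x^2) + eps) finite when x is all zeros.
    """
    return torch.sqrt(torch.sum(x * x, dim=dim) + eps)


# Assumed failure case: an instance with one sampled point, so the
# point's embedding is identical to the instance mean.
embedding = torch.zeros(1, 5, requires_grad=True)
mean = embedding.detach().clone()

dist = safe_norm(embedding - mean)  # distance of the point to its mean
dist.sum().backward()

# The gradient is finite (all zeros here) instead of undefined.
grad_is_finite = bool(torch.isfinite(embedding.grad).all())
```

If this turns out to be the cause, an alternative would be to skip instances with fewer than two sampled points when computing the loss; which option suits the discriminative loss better is a judgment call.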