dragonfly606 / MonoCD

[CVPR 2024] MonoCD: Monocular 3D Object Detection with Complementary Depths
MIT License

Training error #3

Closed ycdhqzhiai closed 4 months ago

ycdhqzhiai commented 5 months ago

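1. The tensor from torch.arange should be on the CPU, and the to(cuda) call on the next line fails outright: https://github.com/dragonfly606/MonoCD/blob/d3ec3455838fe2035b0babf90f779e0b44922fa7/model/head/detector_loss.py#L161
2. With a batch size of 4, N=0 occurs during training: https://github.com/dragonfly606/MonoCD/blob/d3ec3455838fe2035b0babf90f779e0b44922fa7/model/anno_encoder.py#L120

I'm not sure what the cause is. Could it be a version mismatch on my side? I'm using torch 1.13.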

ycdhqzhiai commented 5 months ago

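There is a problem at this spot: https://github.com/dragonfly606/MonoCD/blob/d3ec3455838fe2035b0babf90f779e0b44922fa7/data/datasets/kitti.py#L687. reg_mask is true only when the box is valid. If none of the samples in a batch has a box, reg_mask is all zeros, and the loss computation then hits the N=0 case described above. The problem in https://github.com/dragonfly606/MonoCD/issues/1 is probably also caused by reg_mask being all zeros. The smaller the batch size, the more likely this is to happen.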
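To make the batch-size point concrete, here is a rough sketch. It assumes each sample independently has no valid box with some probability `p_empty`. Both the independence assumption and the value 0.1 are hypothetical and not measured on KITTI. The function `prob_all_empty` is an illustrative helper, not part of MonoCD.

```python
def prob_all_empty(p_empty: float, batch_size: int) -> float:
    """Probability that every sample in a batch has no valid box.

    Assumes samples are independent and each one is empty with
    probability p_empty (a hypothetical value, not a measured one).
    """
    return p_empty ** batch_size


if __name__ == "__main__":
    # Smaller batches are more likely to be entirely empty,
    # which is the case where reg_mask ends up all zeros.
    for batch_size in (1, 2, 4, 8):
        print(batch_size, prob_all_empty(0.1, batch_size))
```

Under these assumptions the probability shrinks geometrically with the batch size, which matches the observation that small batches trigger N=0 more often.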

dragonfly606 commented 5 months ago

@ycdhqzhiai, sorry for the late reply. Thank you very much for your attention to our work and for your suggestions. You are right. I have added the following code to keep training from aborting: https://github.com/dragonfly606/MonoCD/blob/d44ea9c664e932be6bc4c1e1de25daf5c583c0ed/model/head/detector_loss.py#L408-L410
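For readers who cannot open the link, the general idea of such a guard can be sketched as follows. This is not the code at the linked lines. It is a plain-Python illustration in which `mean_regression_loss`, `per_object_losses`, and `reg_mask` are hypothetical names, and lists stand in for the tensors used in real training code.

```python
def mean_regression_loss(per_object_losses, reg_mask):
    """Average the loss over valid objects, guarding against N == 0.

    per_object_losses: list of floats, one per candidate object (hypothetical).
    reg_mask: list of bools/ints, truthy where the object has a valid box.
    """
    # Keep only the losses whose mask entry is set.
    selected = [loss for loss, m in zip(per_object_losses, reg_mask) if m]
    n = len(selected)
    if n == 0:
        # No valid box in the whole batch: return a zero loss
        # instead of dividing by zero or indexing an empty selection.
        return 0.0
    return sum(selected) / n


if __name__ == "__main__":
    print(mean_regression_loss([0.5, 1.5, 2.0], [1, 0, 1]))  # batch with two valid boxes
    print(mean_regression_loss([0.5, 1.5, 2.0], [0, 0, 0]))  # batch with no valid boxes
```

Another option under the same assumptions would be to skip the training step entirely for an empty batch. Which behavior fits better depends on how the rest of the loss is assembled.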