dragonfly606 / MonoCD

[CVPR 2024] MonoCD: Monocular 3D Object Detection with Complementary Depths
MIT License

ResNet50 training accuracy problem #4

Closed: ycdhqzhiai closed this issue 3 months ago

ycdhqzhiai commented 4 months ago

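@dragonfly606 One more question. I swapped the backbone for resnet50, and after training the best accuracy is only about 7%, far below the 25% with DLA. Yet my training log looks much the same as yours. In a case like this, generally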

[2024-05-07 21:30:35,464] monocd.trainer INFO: eta: 0:03:52 iter: 23090 loss: 1.4502 2D_IoU: 0.9635 3D_IoU: 0.9105 depth_loss: 0.0391 
compensated_depth_loss: 0.7725 keypoint_depth_loss: 0.0287 hm_loss: 0.0197 bbox_loss: 0.0366 dims_loss: 0.0768 orien_loss: 0.0130 
horizon_hm_loss: 0.1171 offset_loss: 0.0729 trunc_offset_loss: 0.0000 corner_loss: 0.0257 keypoint_loss: 0.2313 
weighted_avg_depth_loss: 0.0169 depth_MAE: 0.0015 comp_cen_MAE: 0.2854 comp_02_MAE: 0.2872 comp_13_MAE: 0.2869 
center_MAE: 0.0054 02_MAE: 0.0058 13_MAE: 0.0057 lower_MAE: 0.0010 hard_MAE: 0.0015 soft_MAE: 0.0032 time: 14.2924 data: 12.8418 
lr: 0.00000300 
Car AP@0.70, 0.70, 0.70:
bbox AP:86.3072, 75.5689, 63.9821
bev  AP:13.2226, 10.8190, 9.5384
3d   AP:7.4138, 6.0539, 5.1338
aos  AP:86.17, 75.25, 63.43
Car AP@0.70, 0.50, 0.50:
bbox AP:86.3072, 75.5689, 63.9821
bev  AP:43.6660, 33.5703, 28.4923
3d   AP:37.8610, 28.5280, 23.8539
aos  AP:86.17, 75.25, 63.43
dragonfly606 commented 4 months ago

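@ycdhqzhiai, in monocular 3D object detection DLA34 is actually used more often than ResNet, and it also works better in practice, balancing accuracy and inference speed. For a comparison of different backbones, you can refer to the paper Objects as Points. As for the large gap, try repeating the experiment several times to rule out chance. If the problem persists, you could try resnet101 and see whether it improves.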
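
When repeating the experiment, it may help to compare the runs numerically rather than by eye. The sketch below is not part of MonoCD: `parse_eval` and `summarize` are hypothetical helpers that read evaluation text in the format pasted above and report the mean and standard deviation of one metric across runs. It assumes the three values on each line follow the usual KITTI easy / moderate / hard order.

```python
import re
from statistics import mean, pstdev

# Header such as "Car AP@0.70, 0.70, 0.70:" and metric lines such as
# "3d   AP:7.4138, 6.0539, 5.1338", as they appear in the pasted output.
HEADER = re.compile(r"^(\w+) AP@([\d.]+), ([\d.]+), ([\d.]+):$")
AP_LINE = re.compile(r"^(bbox|bev|3d|aos)\s+AP:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)$")


def parse_eval(text):
    """Return {(class, thresholds): {metric: (v1, v2, v3)}} for one run."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            thresholds = tuple(float(x) for x in header.groups()[1:])
            current = (header.group(1), thresholds)
            results[current] = {}
            continue
        metric = AP_LINE.match(line)
        if metric and current is not None:
            values = tuple(float(x) for x in metric.groups()[1:])
            results[current][metric.group(1)] = values
    return results


def summarize(runs, key, metric="3d"):
    """Per-column (mean, population std) of one metric across parsed runs."""
    columns = zip(*(run[key][metric] for run in runs))
    return [(mean(col), pstdev(col)) for col in columns]


# The only run available here is the one posted in this issue.
posted = """Car AP@0.70, 0.70, 0.70:
bbox AP:86.3072, 75.5689, 63.9821
bev  AP:13.2226, 10.8190, 9.5384
3d   AP:7.4138, 6.0539, 5.1338
aos  AP:86.17, 75.25, 63.43"""

run = parse_eval(posted)
key = ("Car", (0.7, 0.7, 0.7))
print(run[key]["3d"])
print(summarize([run], key))  # append further parsed runs to the list
```

With several runs in the list, a standard deviation that is small relative to the gap between 7% and 25% would suggest the gap is not due to chance.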