There was an unexpected change in the training results. Is this caused by the teacher-student network?

ZhaoL0 commented 2 years ago

Hi @Vegeta2020, thanks for your open-source work! I ran into a problem when evaluating the training results of SE-SSD:

The results of the model at the 60th epoch are as follows:

# checkpoint_path = "{Save_Dir}/epoch_60.pth"
Evaluation official_AP_11: car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:98.61, 90.10, 89.62
bev  AP:90.58, 88.71, 88.15
3d   AP:90.17, 86.22, 79.24
aos  AP:98.57, 89.86, 89.20
car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:98.61, 90.10, 89.62
bev  AP:98.64, 90.18, 89.74
3d   AP:98.60, 90.15, 89.70
aos  AP:98.57, 89.86, 89.20

Evaluation official_AP_40: car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:99.53, 95.63, 93.21
bev  AP:96.69, 91.99, 89.69
3d   AP:93.77, 86.21, 83.61
aos  AP:99.49, 95.34, 92.72
car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:99.53, 95.63, 93.21
bev  AP:99.54, 95.88, 95.50
3d   AP:99.52, 95.83, 93.34
aos  AP:99.49, 95.34, 92.72

These results are close to the experimental results in the paper. However, when I evaluated the checkpoint from the 54th epoch, the results were noticeably worse (for example, the moderate 3D AP_11 drops from 86.22 to 79.84):

# checkpoint_path = "{Save_Dir}/epoch_54.pth"
Evaluation official_AP_11: car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:98.58, 90.02, 89.51
bev  AP:90.58, 88.60, 88.07
3d   AP:90.03, 79.84, 79.01
aos  AP:98.47, 89.76, 89.02
car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:98.58, 90.02, 89.51
bev  AP:98.60, 90.12, 89.67
3d   AP:98.56, 90.07, 89.62
aos  AP:98.47, 89.76, 89.02

Evaluation official_AP_40: car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:99.51, 95.52, 93.10
bev  AP:96.65, 91.90, 89.58
3d   AP:93.58, 84.14, 81.40
aos  AP:99.41, 95.20, 92.54
car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:99.51, 95.52, 93.10
bev  AP:99.52, 95.81, 95.42
3d   AP:99.50, 95.73, 93.25
aos  AP:99.41, 95.20, 92.54

After that, I also evaluated the checkpoints from the 56th epoch (11 recall points, AP is 86.07) and the 50th epoch (11 recall points, AP is 80.07). The detection results jump abruptly between these checkpoints. The accuracy did not change much over the first 50 epochs, but during the last few epochs it jumped abruptly to an AP of about 86. Have you encountered this problem, and what could be the reason for it?

Vegeta2020 commented 2 years ago

Hi @ZhaoL0, sorry for the late reply. I guess this is due to the sparse sampling of recall points in the AP calculation. Specifically, when the highest recall the detector reaches is just below one of the sampled recall points (e.g., 0.8 or 0.9 among the 11 points 0, 0.1, ..., 1.0 used by AP11), the precision at that point contributes nothing to the cumulative sum. Once the recall reaches that point, the precision there is counted in the AP calculation, so it makes a big difference, as you see. AP11 and AP40 take 11 and 40 recall points respectively, which means AP11 uses much sparser sampling than AP40, so you see a larger change in AP11. To conclude, the highest detection recall oscillating around a sampling point causes this issue.
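
The effect described above can be reproduced with a toy example. The sketch below is not the SE-SSD or KITTI evaluation code: the precision-recall curve is made up, and the names `interpolated_ap` and `toy_curve` are hypothetical helpers introduced only for illustration. It assumes AP11 averages interpolated precision at recall 0, 0.1, ..., 1.0 and AP40 at 1/40, 2/40, ..., 1.0, and compares two "checkpoints" whose maximum recall sits just below and just above 0.8.

```python
import numpy as np


def interpolated_ap(recalls, precisions, sample_points):
    """Average interpolated precision (in percent) over the sample points.

    The interpolated precision at recall r is the highest precision found
    at any recall >= r, or 0 if the detector never reaches recall r.
    """
    recalls = np.asarray(recalls, dtype=float)
    precisions = np.asarray(precisions, dtype=float)
    total = 0.0
    for r in sample_points:
        reached = recalls >= r - 1e-9  # small tolerance for float rounding
        total += precisions[reached].max() if reached.any() else 0.0
    return 100.0 * total / len(sample_points)


# Assumed sampling grids for the two metrics.
AP11_POINTS = np.linspace(0.0, 1.0, 11)       # 0, 0.1, ..., 1.0
AP40_POINTS = np.linspace(1 / 40, 1.0, 40)    # 1/40, 2/40, ..., 1.0


def toy_curve(max_recall, n=2001):
    """Made-up PR curve: precision decays linearly and stops at max_recall."""
    recalls = np.linspace(0.0, max_recall, n)
    precisions = 1.0 - 0.35 * recalls
    return recalls, precisions


# Two hypothetical checkpoints that differ only in their maximum recall.
below = toy_curve(max_recall=0.795)  # just misses the 0.8 sample point
above = toy_curve(max_recall=0.805)  # just reaches the 0.8 sample point

jump_ap11 = interpolated_ap(*above, AP11_POINTS) - interpolated_ap(*below, AP11_POINTS)
jump_ap40 = interpolated_ap(*above, AP40_POINTS) - interpolated_ap(*below, AP40_POINTS)

print(f"AP11 jump: {jump_ap11:.2f}")
print(f"AP40 jump: {jump_ap40:.2f}")
```

In this toy setup the two curves are almost identical, yet crossing the 0.8 sample point adds one extra precision term to each sum. That term is divided by 11 for AP11 and by 40 for AP40, so the AP11 jump is several times larger than the AP40 jump.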

ZhaoL0 commented 2 years ago

Many thanks for your answer! Indeed, as you suggest, overly sparse recall sampling points can break the continuity of the performance evaluation.

Eaphan commented 2 years ago

@ZhaoL0 I failed to reproduce the result. Could you please answer a few questions? Did you use the default config in the repo? How many GPUs did you use? Did you use the pretrained model from the CIA-SSD repo as your initial weights for training SE-SSD?

LuYujing-97 commented 2 years ago

Hello, I have not been able to get this result. Could you send me a copy of the model after 60 epochs of training? My email is 5129675750@qq.com

ZhaoL0 commented 2 years ago

@LuYujing-97 @Eaphan Sorry for my late reply. The epoch-60 model file is attached: Epoch-60. I hope it helps. Also, I trained SE-SSD on two 1080 Ti GPUs.
