Yqmtf11 opened this issue 2 years ago
Hi, which model are you using? The *DETR3D, VoVNet on trainval, evaluation on test set* model is trained on the combination of the training and validation sets, and is used for the leaderboard (test set) submission.
May I ask how you performed the evaluation on the test set? 83.4% mAP looks like overfitting. @a1600012888
Do you mean the *DETR3D, VoVNet on trainval, evaluation on test set* model with 41.2 mAP on the test set?
That result comes from a submission to the leaderboard; it is the performance you see on the nuScenes leaderboard.
Yes, I mean the *DETR3D, VoVNet on trainval, evaluation on test set* model, but I am asking about the mAP and NDS in the log: which subset did you perform the evaluation on?
Hi, that should be on the validation set.
Remember that we don't have access to the annotations of the test set (it is a held-out set for the leaderboard).
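Since the test annotations are withheld, test-set numbers only come back from the evaluation server. As a rough sketch from memory of the nuScenes detection submission format (please double-check field names against the official devkit docs; the sample token and box values below are made up):

```python
import json

# Hypothetical sketch of the JSON uploaded to the nuScenes evaluation server.
# Field names follow my recollection of the devkit's detection submission
# schema; verify against the official documentation before use.
submission = {
    "meta": {
        "use_camera": True,   # DETR3D is camera-only
        "use_lidar": False,
        "use_radar": False,
        "use_map": False,
        "use_external": False,
    },
    "results": {
        # One entry per sample_token; each detection is a dict like this one.
        "hypothetical_sample_token": [
            {
                "sample_token": "hypothetical_sample_token",
                "translation": [600.0, 1640.0, 1.0],  # global x, y, z (m)
                "size": [1.9, 4.5, 1.7],              # box size (m)
                "rotation": [0.7, 0.0, 0.0, 0.7],     # quaternion
                "velocity": [0.0, 0.0],               # vx, vy (m/s)
                "detection_name": "car",
                "detection_score": 0.87,
                "attribute_name": "vehicle.moving",
            }
        ],
    },
}

with open("results_nusc.json", "w") as f:
    json.dump(submission, f)
```

The mAP/NDS printed in a local training log, by contrast, can only have come from a split whose annotations are available locally, i.e. train or val.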
Thanks, I got it. I was confused by the extremely high mAP; now I realize you trained on both the training and validation sets. @a1600012888
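For anyone else who hits this: an inflated score on the val set is the expected outcome of evaluating on data the model was trained on. A toy sketch (hypothetical, nothing to do with the actual DETR3D code) makes the effect obvious with a 1-nearest-neighbor "model" that memorizes its training data:

```python
import random

random.seed(0)

# Toy data: scalar inputs with labels that are pure noise, so no model can
# genuinely generalize -- any high score must come from memorization.
data = [(random.random(), random.randint(0, 1)) for _ in range(200)]
train, held_out = data[:150], data[150:]

def predict(x):
    # 1-NN over the training set: anything seen during training is recalled exactly.
    nearest = min(train, key=lambda p: abs(p[0] - x))
    return nearest[1]

def accuracy(split):
    return sum(predict(x) == y for x, y in split) / len(split)

print(f"train accuracy:    {accuracy(train):.2f}")     # 1.00: evaluated on seen data
print(f"held-out accuracy: {accuracy(held_out):.2f}")  # near chance on unseen data
```

The same logic applies here: a checkpoint trained on train+val scores 83.4 mAP when re-evaluated on val, but only 41.2 mAP on the genuinely unseen test set.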
Just out of curiosity, have you evaluated the above + CBGS on the test set? I wonder how large the gap between the val-set and test-set results is.
@WangYueFt @XuyangBai @Yqmtf11 @a1600012888 Hello, have you trained the VoVNet version on the nuScenes training set only, and if so, what are the metrics on the val set? For the ResNet version, I only found the val results in the original paper and no metrics on the test set.
`'pts_bbox_NuScenes/NDS': 0.8238093679962551, 'pts_bbox_NuScenes/mAP': 0.83450670209427`

Per-class results:

| Object Class | AP | ATE | ASE | AOE | AVE | AAE |
|---|---|---|---|---|---|---|
| car | 0.873 | 0.256 | 0.114 | 0.033 | 0.260 | 0.195 |
| truck | 0.832 | 0.327 | 0.115 | 0.033 | 0.191 | 0.216 |
| bus | 0.842 | 0.323 | 0.104 | 0.027 | 0.293 | 0.245 |
| trailer | 0.778 | 0.395 | 0.116 | 0.041 | 0.136 | 0.124 |
| construction_vehicle | 0.783 | 0.405 | 0.173 | 0.079 | 0.137 | 0.322 |
| pedestrian | 0.805 | 0.380 | 0.203 | 0.181 | 0.245 | 0.136 |
| motorcycle | 0.821 | 0.337 | 0.150 | 0.085 | 0.347 | 0.213 |
| bicycle | 0.871 | 0.271 | 0.169 | 0.079 | 0.155 | 0.009 |
| traffic_cone | 0.877 | 0.241 | 0.162 | nan | nan | nan |
| barrier | 0.862 | 0.289 | 0.110 | 0.050 | nan | nan |

Compared with the mAP of 34.6 reported in the paper, 83.4 is extremely abnormal when the checkpoint provided by the authors is evaluated on the validation set. Has anyone else run into this situation?