How to reproduce test dataset result (mAP 72.0, NDS 74.1)

Hello, your research is very interesting.

With the checkpoint you provided (voxel0075_vov_1600x640_epoch20.pth), I can exactly reproduce the scores published in the paper for the validation set (mAP 70.3, NDS 72.9).

However, when I ran the same checkpoint on the test dataset and submitted the predictions through eval.ai, I did not get the mAP (72.0) and NDS (74.1) presented in the paper. I only got mAP 70.4 and NDS 73.0.

According to the paper, the test dataset results were obtained without any test-time augmentation.

Is there anything I'm missing?

I would appreciate it if you could share how to use that checkpoint to reproduce the test dataset results presented in the paper as well.

I have attached the evaluation result for the test dataset returned by eval.ai:

{"metrics_summary": {"label_aps": {"car": {"0.5": 0.7828338643390523, "1.0": 0.8831038224684278, "2.0": 0.9114359996267178, "4.0": 0.9254586561703609}, "truck": {"0.5": 0.41919666800000593, "1.0": 0.622328594464253, "2.0": 0.7160340347397985, "4.0": 0.7378183413995415}, "bus": {"0.5": 0.548469137094366, "1.0": 0.7569229775169954, "2.0": 0.8098595958088101, "4.0": 0.8297213856562474}, "trailer": {"0.5": 0.2730730775488542, "1.0": 0.5935867083776286, "2.0": 0.7732024999405478, "4.0": 0.8328551843153792}, "construction_vehicle": {"0.5": 0.0545915716199332, "1.0": 0.27532892336184356, "2.0": 0.5082597602579472, "4.0": 0.575186651902535}, "pedestrian": {"0.5": 0.8211789322916796, "1.0": 0.8687718930225161, "2.0": 0.8936666985570685, "4.0": 0.9101382415704569}, "motorcycle": {"0.5": 0.6638490136121394, "1.0": 0.7739358495531468, "2.0": 0.8069320185480913, "4.0": 0.8167383949692603}, "bicycle": {"0.5": 0.5159002819221052, "1.0": 0.5857842293615451, "2.0": 0.6163695184606621, "4.0": 0.6328676850785252}, "traffic_cone": {"0.5": 0.7894746430860434, "1.0": 0.8306125183732287, "2.0": 0.8529509465365473, "4.0": 0.8744205955196961}, "barrier": {"0.5": 0.6422856220023887, "1.0": 0.7754488884437414, "2.0": 0.8162615136112452, "4.0": 0.8331002377986595}}, "mean_dist_aps": {"car": 0.8757080856511398, "truck": 0.6238444096508997, "bus": 0.7362432740191047, "trailer": 0.6181793675456024, "construction_vehicle": 0.35334172678556475, "pedestrian": 0.8734389413604302, "motorcycle": 0.7653638191706594, "bicycle": 0.5877304287057094, "traffic_cone": 0.8368646758788788, "barrier": 0.7667740654640086}, "mean_ap": 0.7037488794231997, "label_tp_errors": {"car": {"trans_err": 0.1743480709793415, "scale_err": 0.13566460581490586, "orient_err": 0.04725901192730328, "vel_err": 0.22679931127815556, "attr_err": 0.12431389628784703}, "truck": {"trans_err": 0.3427289728502886, "scale_err": 0.17527353027035772, "orient_err": 0.04350761314384568, "vel_err": 0.29796317726759314, "attr_err": 0.12388885083462146}, "bus": {"trans_err": 0.28201606037639, "scale_err": 0.16712478110600337, "orient_err": 0.0363636049951514, "vel_err": 0.4374811013847662, "attr_err": 0.2975466820118584}, "trailer": {"trans_err": 0.4594987549418118, "scale_err": 0.1598362470200802, "orient_err": 0.7090668615659704, "vel_err": 0.21990061845424608, "attr_err": 0.1213828314115689}, "construction_vehicle": {"trans_err": 0.6531381380178928, "scale_err": 0.37860272173948156, "orient_err": 0.9052422059689905, "vel_err": 0.09225026693939813, "attr_err": 0.05580084216309439}, "pedestrian": {"trans_err": 0.14896347137302224, "scale_err": 0.29168604291809497, "orient_err": 0.2837789415908823, "vel_err": 0.1934540022925848, "attr_err": 0.10823283310883593}, "motorcycle": {"trans_err": 0.20741331105274743, "scale_err": 0.2121743254327902, "orient_err": 0.1789094241410364, "vel_err": 0.5424214603537325, "attr_err": 0.07602250764905477}, "bicycle": {"trans_err": 0.2341929718113284, "scale_err": 0.26988925423439813, "orient_err": 0.41378644507819073, "vel_err": 0.24995264273669682, "attr_err": 0.03577556771890466}, "traffic_cone": {"trans_err": 0.1411094561318333, "scale_err": 0.33489553379841935, "orient_err": NaN, "vel_err": NaN, "attr_err": NaN}, "barrier": {"trans_err": 0.23116747767388468, "scale_err": 0.273886442685595, "orient_err": 0.03039179741959808, "vel_err": NaN, "attr_err": NaN}}, "tp_errors": {"trans_err": 0.2874576685208541, "scale_err": 0.23990334850201264, "orient_err": 0.2942562117589965, "vel_err": 0.2825278225883967, "attr_err": 0.11787050139822319}, "tp_scores": {"trans_err": 0.7125423314791459, "scale_err": 0.7600966514979873, "orient_err": 0.7057437882410035, "vel_err": 0.7174721774116033, "attr_err": 0.8821294986017768}, "nd_score": 0.7296728844347515, "eval_time": 308.0731339454651}, "result": {"mAP": 0.7037488794231997, "mATE": 0.2874576685208541, "mASE": 0.23990334850201264, "mAOE": 0.2942562117589965, "mAVE": 0.2825278225883967, "mAAE": 0.11787050139822319, "NDS": 0.7296728844347515}}
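
As a sanity check that the numbers above are internally consistent, NDS can be recomputed from the reported mAP and the five mean TP errors. This is a minimal sketch assuming the standard nuScenes detection score definition (mAP weighted by 5, each TP error converted to a score as max(0, 1 - error)); the function name `nuscenes_nds` is only illustrative and is not part of the CMT code base.

```python
# Values copied from the "result" entry of the eval.ai output above.
mean_ap = 0.7037488794231997
tp_errors = {
    "mATE": 0.2874576685208541,
    "mASE": 0.23990334850201264,
    "mAOE": 0.2942562117589965,
    "mAVE": 0.2825278225883967,
    "mAAE": 0.11787050139822319,
}


def nuscenes_nds(mean_ap, tp_errors, map_weight=5.0):
    """Recompute NDS, assuming the standard nuScenes detection score formula."""
    # Each true-positive error is turned into a score in [0, 1].
    tp_scores = [max(0.0, 1.0 - err) for err in tp_errors.values()]
    # Weighted average: mAP counts map_weight times, each TP score once.
    return (map_weight * mean_ap + sum(tp_scores)) / (map_weight + len(tp_scores))


nds = nuscenes_nds(mean_ap, tp_errors)
print(f"mAP = {100 * mean_ap:.1f}, NDS = {100 * nds:.1f}")
# prints: mAP = 70.4, NDS = 73.0
```

The recomputed value matches the reported `nd_score`, so the gap to the paper (72.0 / 74.1) comes from the predictions themselves and not from how the score was aggregated.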

[Screenshot: evaluation on validation dataset]

[Screenshot: evaluation on test dataset from eval.ai]

