Closed Physu closed 3 years ago
I trained 3DSSD following the config in configs/3dssd/3dssd_kitti-3d-car.py on the train+val data, modified the batch size from 4 to 8 and the lr from 0.002 to 0.004, and kept the rest as-is. The test result (under AP40):
Benchmark | Easy | Moderate | Hard |
---|---|---|---|
Car (Detection) | 94.91 % | 91.35 % | 87.47 % |
Car (Orientation) | 0.01 % | 0.47 % | 0.63 % |
Car (3D Detection) | 86.06 % | 76.48 % | 69.71 % |
Car (Bird's Eye View) | 91.65 % | 86.69 % | 81.05 % |
There exists a large margin between my result and the official 3DSSD (76.48 vs 79.55 on moderate). I am confused about this: did I set something wrong? Or what can I do to close this performance gap? Thanks
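For reference, the batch-size/lr change described above follows the common linear scaling rule (double the total batch size, double the base lr); a minimal sketch, where the `scale_lr` helper is hypothetical and not part of mmdet3d:

```python
def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate linearly with the total batch size."""
    return base_lr * new_batch / base_batch

# Batch size 4 -> 8, so lr 0.002 -> 0.004, matching the change above.
print(scale_lr(0.002, 4, 8))
```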
The reason for the performance difference has been explained on the README page. Among the differences, the two most important ones are: different evaluation code and a different train/val split. The first can yield about a 2 mAP difference, as stated in the README, while the second will at least remove the influence of false-positive predictions in samples without ground truths.
In addition, we also cross-checked the benchmark by evaluating our results with their evaluation code and their results with our evaluation code. The results are almost the same. (Actually, we only reproduced 79.26 mAP with the official code, according to the record of @encore-zhou.)
As for the difference on the test set, there exists some uncertainty and there are tricks involved. Have you ever tried to train a model with the official code and submit the result to the benchmark?
Thanks for your feedback! The official code is implemented in TensorFlow; I will try to train a model, submit the result to the test server, and evaluate the performance. I will update new results here as soon as I get them.
By the way, is 79.26 evaluated on the val data or the test data? If it was evaluated on the test data, the margin between 79.26 and 79.55 (official, on test data) is acceptable. My result on the test data shows a 3 mAP margin, which is unacceptable.
> Actually, we only reproduce the 79.26 mAP with the official code according to the record of @encore-zhou
It's evaluated on their val split with their evaluation code (compared with the reported 83.3). So I guess there is a large range of performance fluctuation on the validation set. You can have a try first, and let's have a closer look into whether there is a gap between our implementation and the official one.
Got it, I will try to reproduce the result by following the official code.
I used the official implementation and configs to train models in a Docker container. The Python packages are listed below:
- tensorflow 1.4.0
- tensorflow-tensorboard 0.4.0
- python 3.5
- cuda 9.0
- numpy 1.14.5
Total train iterations: 80700. Final checkpoint file: model-79893 (not model-80700, despite 80700 total iterations).
The results for model-79893, model-79086, model-78279, and model-77472:
Benchmark | iterations | Easy | Moderate | Hard |
---|---|---|---|---|
Car (Detection) | 77472 | 89.70 % | 82.84 % | 79.97 % |
Car (Detection) | 78279 | 89.29 % | 82.69 % | 80.06 % |
Car (Detection) | 79086 | 91.14 % | 82.79 % | 80.02 % |
Car (Detection) | 79893 | 89.39 % | 82.54 % | 79.83 % |
It seems the official model's evaluation results are better than MMDetection3D's, but further study is needed to find out the reason.
It's a little strange, because when we reproduced 3DSSD, @encore-zhou only got the following performance with the official code:
Maybe there is some fluctuation in performance?
Maybe the author improved the code implementation? Something is causing the performance gap. I will check the 3DSSD head and hope to find something that explains this situation.
And here are new results obtained a few minutes ago.
By the way, these results were trained with more epochs; you can see that the performance further improves (reaching 82.9%).
Yes, it is really strange, because we reproduced the above results in Aug. 2020 (as shown in the screenshot) and there have been no updates after April 2020. We will look into this issue soon. In the meantime, if you have any progress, please feel free to share it here.
Thanks for reopening this issue! New findings will be updated.
Environment:
- pytorch 1.5
- mmdet 1.3.9
- mmdet3d 0.14.0
- mmcv-full 1.3.9
- Ubuntu 18.04
I used the official config configs/3DSSD/3dssd_4x4_kitti-3d-car.py and modified the single-GPU batch size from 4 to 8 (because I use 2 GPUs, while the official config assumes 4 GPUs); the learning rate and the number of epochs were kept as-is.
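The change can be sketched as the following mmdet3d-style config fragment (a sketch only; `workers_per_gpu` is illustrative, and only the per-GPU batch size was actually changed):

```python
# 2 GPUs x 8 samples each keeps the total batch size at 16,
# matching the official 4 GPUs x 4 samples setup.
data = dict(
    samples_per_gpu=8,  # was 4 (per-GPU batch size)
    workers_per_gpu=4,  # illustrative value
)
print(2 * data['samples_per_gpu'])  # total batch size with 2 GPUs
```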
I trained a model with two 2080Ti GPUs on the full train data (7481 samples). Finally I got the following validation results on the val split (3,769 samples):
Then I generated the test submission file and submitted it to the test server:
The performance is not as good as I expected, and I don't know why. Could you please give some opinions on this performance?
I find it is hard to reproduce the results on the KITTI test set, even though you may already have gotten a good result on val.
If we set the confidence threshold to a value greater than 0.0 (the default, which outputs all plausible predictions), e.g. 0.2, to filter the final predictions in predictions_in_test.txt, we will get improved results. Note that you can define your threshold in the config:
```python
test_cfg=dict(
    nms_cfg=dict(type='nms', iou_thr=0.1),
    sample_mod='spec',
    score_thr=0.0,  # Attention!!!
    per_class_proposal=True,
    max_output_num=100))
```
Though there is some improvement, it is still far from 79.57 on moderate (3DSSD on the leaderboard). I guess good post-processing is needed, but which other tricks can improve performance is still an open question.
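The score-threshold filtering described above can be sketched as a small post-processing step over KITTI-format result lines, where the confidence score is the last field of each line (the helper name and the example lines are illustrative, not part of mmdet3d):

```python
def filter_kitti_predictions(lines, thr=0.2):
    """Keep only KITTI result lines whose score (last field) >= thr."""
    kept = []
    for line in lines:
        fields = line.split()
        if fields and float(fields[-1]) >= thr:  # score is the last field
            kept.append(line)
    return kept

# Illustrative KITTI-format predictions: the first has score 0.95, the second 0.05.
preds = [
    "Car 0.0 0 -1.58 587.0 173.3 614.1 200.1 1.65 1.67 3.64 -0.65 1.71 46.7 -1.59 0.95",
    "Car 0.0 0 -1.58 100.0 150.0 120.0 180.0 1.60 1.60 3.50 -5.00 1.70 60.0 -1.50 0.05",
]
print(len(filter_kitti_predictions(preds)))  # only the high-score box survives
```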
@Physu Have you ever tried generating submission using the official code and submit it to the test server to see the test set result? Also, it seems to me that, changing mmdet3d's training batch and GPUs from 4x4
to 8x2
improves val set results a lot?
Please kindly provide more observations and I will try to look into this issue.
@Wuziyi616 Thanks for your attention! Does official code mean dvlab-research/3DSSD or some other repo? Besides, in order to learn more about the evaluation procedure, I used traveller59/kitti-object-eval-python to test results on the val set (i.e., save every LiDAR .bin file's results into a txt file, finally getting 3769 txt files). I find that, with no other post-processing involved, the results are slightly better than the mmdet3d evaluation result (maybe it is unfair to compare this way, since the hyperparameters may differ). If I use a confidence threshold of 0.2 to filter out false positives, the result further improves:
I will also reproduce on 4x4, and then we can look into the difference further.
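The per-frame txt dumping step described above can be sketched as follows (the directory name, the `results` structure, and the helper name are all illustrative; KITTI expects one zero-padded six-digit file name per sample):

```python
import os

def dump_kitti_results(results, out_dir="results_val"):
    """results: dict mapping frame id (int) -> list of KITTI result lines."""
    os.makedirs(out_dir, exist_ok=True)
    for frame_id, lines in results.items():
        # KITTI uses zero-padded six-digit file names, e.g. 000123.txt
        path = os.path.join(out_dir, f"{frame_id:06d}.txt")
        with open(path, "w") as f:
            f.write("\n".join(lines))

# Usage sketch: one frame with a single (illustrative) prediction line.
dump_kitti_results({7: ["Car 0.0 0 -1.58 587.0 173.3 614.1 200.1 "
                        "1.65 1.67 3.64 -0.65 1.71 46.7 -1.59 0.95"]})
```

Running this over all 3,769 val frames yields the folder layout that traveller59/kitti-object-eval-python consumes.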
> Does official code mean dvlab-research/3DSSD or some other repo?
Exactly, the official code I mentioned is dvlab's code. I think that's the official code release for 3DSSD, isn't it? As you mentioned in this reply, you said you would submit test results using that code; have you done that?
Thanks for your attention! My submission opportunities are running out, but the results will be updated here soon.
@Physu Have you tried to reproduce the multi-class version of 3DSSD (that is, predicting car, pedestrian and cyclist at the same time)?
@Physu Hi, have you ever tried generating a submission using the official code and submitting it to the test server to see the test set result?
Thanks for the developers' extraordinary work! I have a question about the 3DSSD evaluation result difference between the author's implementation and MMDet3D's. The author's released result:
In MMDet3D, the result:
I noticed "Experiment details on KITTI datasets", which lists the differences from the official implementation.
1. The official implementation is based on TensorFlow 1.4, but I guess PyTorch is not the reason for the poor performance, or is there a performance gap between TensorFlow and PyTorch?
2. There is about a two-percent margin (81.0 vs 83.3) between the two implementations; can we come up with some methods to fix it?
I also used a single 2080Ti to train a train+val model with configs/3DSSD/3dssd_kitti-3d-car.py. I modified
`ann_file=data_root + 'kitti_infos_train.pkl',`
to
`ann_file=data_root + 'kitti_infos_trainval.pkl',`
and the rest of the code was kept as-is. When training finishes, I will evaluate on the test set and post the result here for discussion. Thanks again!
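The annotation-file change above amounts to the following config fragment (a sketch; `data/kitti/` is the usual `data_root` in mmdet3d KITTI configs and is assumed here, and the surrounding config keys are abbreviated):

```python
data_root = 'data/kitti/'  # assumed default mmdet3d KITTI data root
data = dict(
    train=dict(
        # was: ann_file=data_root + 'kitti_infos_train.pkl'
        ann_file=data_root + 'kitti_infos_trainval.pkl',  # train + val, 7481 samples
    ),
)
print(data['train']['ann_file'])
```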