Cliu2 / MTrans

The PyTorch implementation of 'Multimodal Transformer for Automatic 3D Annotation and Object Detection'.
Mozilla Public License 2.0

There are some issues regarding the Hard performance, which is quite important to me. #13

ljwwwiop closed this issue 3 months ago

ljwwwiop commented 3 months ago

image

Hi! I am preparing a paper for submission to AAAI. Some of its experiments compare performance improvements under the self-training paradigm. Using the checkpoint you provided, without modifying the model, I generated pseudo-labeled data (for both the training data and the validation data) for 3D detection training.

image

A total of 2712 (training data) / 3380 (validation data) were generated. I then used the PCDet commands to regenerate all the data files required for training. However, the PointRCNN 3D detection results on the Hard difficulty still differ considerably from those in the paper, which makes a comparison very difficult for me.

image

image

I repeated the entire experiment several times, but the results still did not approach those reported in the paper. I also noticed that MTrans seems to need pseudo-labels only for the training data, and that pseudo-labels are not necessary for the validation set. Is that right? My latest result is shown in the picture above: I used the KITTI 3D GT for the validation set, and the training set consisted of 2712 pseudo-labeled scenes plus 500 GT scenes.

Is this the same process you used in your lab, or have I missed any steps? The final result of your method is very important to me, so I hope to hear from you. Thank you!

ljwwwiop commented 3 months ago

image

I really want to know if you have used the pcdet command to regenerate the entire training pkl and gt_database, because I found that kitti_dbinfos_train.pkl has a very significant impact on the final training result.

ljwwwiop commented 3 months ago

image

In eval mode, the 3D IoU and recall on the validation dataset are basically the same as yours, probably because we used the checkpoint you provided directly.

Cliu2 commented 3 months ago

Hi,

> And I repeated the entire experiment several times and found that the experimental results still couldn't approach the ones mentioned in the paper. Also, I discovered that it seemed that MTrans only needed to generate the pseudo-labels required for the training data? It seemed that pseudo-labels were not necessary for the validation set. My last experimental result is the content in the above picture. I used KITTI 3D GT for the validation set, and the training set was (2712 pseudo-labeled scenes + 500 GT scenes).

Yes, you need to generate pseudo-labels for the training set only. For the validation set, the original labels should be used; otherwise the evaluation does not make sense.
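The split rule above can be sketched as a small sanity check. This is only an illustration: the function name `plan_label_sources` and the idea of passing frame IDs as strings are assumptions, not part of MTrans or pcdet.

```python
from typing import Dict, Iterable


def plan_label_sources(
    train_ids: Iterable[str],
    val_ids: Iterable[str],
    gt_train_ids: Iterable[str],
) -> Dict[str, str]:
    """Map each frame ID to the label source it should use.

    Hypothetical helper: training frames use pseudo-labels unless they
    belong to the small ground-truth subset; validation frames always
    use the original ground-truth labels.
    """
    train, val, gt_train = set(train_ids), set(val_ids), set(gt_train_ids)

    # A frame must never appear in both splits.
    overlap = train & val
    if overlap:
        raise ValueError(f"frames in both train and val: {sorted(overlap)}")

    # The ground-truth subset has to be part of the training split.
    if not gt_train <= train:
        raise ValueError("gt_train_ids must be a subset of train_ids")

    plan = {fid: ("gt" if fid in gt_train else "pseudo") for fid in train}
    plan.update({fid: "gt" for fid in val})
    return plan
```

Running a check like this before regenerating the data files would catch a validation frame that accidentally points at pseudo-labels.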

> I really want to know if you have used the pcdet command to regenerate the entire training pkl and gt_database, because I found that kitti_dbinfos_train.pkl has a very significant impact on the final training result.

And yes, the pcdet command is used to rebuild the entire KITTI training set. We do this for every experiment listed in the paper.

In your last post, the generated pseudo labels seem to be of similar quality to our results, so I think there is no problem with MTrans when generating pseudo labels. I don't have much of a clue why your results look weird.

Have you tested every checkpoint when training PointRCNN with the pcdet codebase and MTrans pseudo labels? In our experience, the performance can fluctuate a lot from checkpoint to checkpoint. Simply taking the last checkpoint usually results in suboptimal AP scores. The performance of pcdet PointRCNN also seems to vary from run to run.
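After rebuilding, one way to confirm that kitti_dbinfos_train.pkl reflects the new labels is to count its entries per class and difficulty. The sketch below assumes the file is a pickled dict mapping each class name to a list of per-object dicts with a `difficulty` field; that layout is an assumption about the pcdet output and is not confirmed in this thread. Both function names are hypothetical.

```python
import pickle
from collections import Counter
from pathlib import Path
from typing import Dict, List


def load_dbinfos(path: str) -> Dict[str, List[dict]]:
    """Load a dbinfos pickle (assumed layout: class name -> list of dicts)."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def summarize_dbinfos(db_infos: Dict[str, List[dict]]) -> Dict[str, Counter]:
    """Count database objects per class and per difficulty value.

    Entries without a 'difficulty' key are counted under -1.
    """
    return {
        cls_name: Counter(entry.get("difficulty", -1) for entry in entries)
        for cls_name, entries in db_infos.items()
    }
```

Comparing the summary of a file built from ground truth with one built from pseudo-labels shows quickly whether the hard objects are under-represented in the database.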
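A minimal sketch of the checkpoint comparison, assuming the AP scores of every evaluated checkpoint have already been collected into a dict keyed by epoch. The helper names and the `"hard"` key are hypothetical, and no real results are included.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple


def best_checkpoint(
    ap_by_epoch: Dict[int, Dict[str, float]], key: str = "hard"
) -> Tuple[int, Dict[str, float]]:
    """Return (epoch, scores) for the checkpoint with the highest AP on `key`.

    On ties, the first epoch in the dict's iteration order wins.
    """
    return max(ap_by_epoch.items(), key=lambda item: item[1][key])


def run_spread(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and sample standard deviation of AP across runs.

    A single run has no spread, so its deviation is reported as 0.0.
    """
    return mean(values), (stdev(values) if len(values) > 1 else 0.0)
```

Reporting the best checkpoint together with the spread over several runs makes it easier to tell a real gap from run-to-run variation.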

ljwwwiop commented 3 months ago

Thank you very much for your reply and for confirming parts of my procedure; it is very useful. I used the checkpoint from the last epoch of PointRCNN training, so that may indeed be the reason for the deviation. I will check this. I have also cited your work in the paper we are preparing to submit. MTrans is a very effective and understandable work. Thank you for your timely reply!

ljwwwiop commented 3 months ago

Hi, I re-tested several checkpoints. The final result is close to the one in your paper, within an acceptable margin of error.

image