Nightmare-n / UniPAD

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving (CVPR 2024)
https://arxiv.org/abs/2310.08370
Apache License 2.0

Questions about cfgs & replicated results #5

Closed by JohnFuguiWang 6 months ago

JohnFuguiWang commented 8 months ago

Hello, thanks for your excellent work! I find your method concise and elegant, and I have been trying to replicate it. I have a few questions about the code.

1. The provided code seems to cover only the camera modality; the LiDAR modality is missing. Could you please provide the UniPAD config files for the multi-modality and LiDAR-only UVTR settings? It would be very helpful.

2. I ran the code successfully, but the resulting metric was lower than the one reported in the paper. I tried three setups, all based on the complete nuScenes dataset and your configs:
   a. I pre-trained and fine-tuned the model from scratch.
   b. I downloaded your official pre-trained pth file from the provided Google Drive link and fine-tuned the model from it.
   c. I downloaded your official pre-trained and fine-tuned pth files and tested the model directly with your fine-tuned pth.

Sadly, the mAP of all three setups is approximately 32. Since the released code is camera-only, I think the corresponding entry in the paper should be UVTR-C + UniPAD, which reports an mAP of 41.5. Did I miss something critical?
Looking forward to your reply, and thanks in advance!

JohnFuguiWang commented 8 months ago

Besides, the results of the aforementioned experiments a. and b. come from the validation step during fine-tuning. The result of c. comes from the test process, where the command I used is "python extra_tools/test.py projects/configs/unipad_abl/uvtr_convnext_finetune_full.py .../UniPAD-main/data/ckpts/uvtr_convnext_s_vs0.1_c128_finetune_epoch_12.pth --eval mAP".

JohnFuguiWang commented 8 months ago

By the way, today I happened to notice that the dataset load_interval in the fine-tuning config file is set to 2 (line 231 in uvtr_convnext_s_vs0.1_c128_finetune.py), which I think might mean that the model loads only part of the dataset. Did you use this setting in your experiments?
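
To illustrate the concern: assuming load_interval acts as a simple stride over the list of loaded samples (this is an assumption, not something confirmed from the UniPAD code), a value of 2 would keep only every second sample. A minimal sketch, where `sample_infos` and `subsample` are hypothetical names used only for this example:

```python
# Hypothetical stand-in for the per-sample info list read from the annotation file.
sample_infos = [f"sample_{i:05d}" for i in range(10)]


def subsample(infos, load_interval=1):
    """Keep every `load_interval`-th entry, starting from the first.

    Assumption: this mirrors how a stride-style load_interval would behave.
    """
    return infos[::load_interval]


full = subsample(sample_infos, load_interval=1)  # keeps all entries
half = subsample(sample_infos, load_interval=2)  # keeps entries 0, 2, 4, ...

print(len(full), len(half))
```

Under this assumption, fine-tuning with load_interval = 2 would see roughly half of the training samples per epoch.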

Nightmare-n commented 7 months ago

Thanks for your interest! We have currently released only the code for the image modality (the released config file corresponds to the ablation experiment in the paper). The complete code for producing the final results will be released soon.