Haiyang-W / UniTR

[ICCV2023] Official Implementation of "UniTR: A Unified and Efficient Multi-Modal Transformer for Bird’s-Eye-View Representation"
https://arxiv.org/abs/2308.07732
Apache License 2.0

Lower test-set metrics when reproducing results #11

Closed lihuashengmax closed 9 months ago

lihuashengmax commented 9 months ago

Hello, I reproduced the unitr+lss version following the readme. My hardware was 4×A40 GPUs; since each A40 has about 45 GB of memory, 4 A40s are roughly equivalent to 8 3090s, so I kept the total batch size unchanged during training. However, my submitted test-set result is about 2 points lower than the paper's. What might be the cause?

nnnth commented 9 months ago

For evaluation on the test set, training should be done on both the training and validation sets, and the ground-truth database should also include the validation set. This approach is also adopted in BEVFusion and is likely to yield an improvement of approximately 1~1.5 points. I suspect that's the reason.
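For reference, in an OpenPCDet-style dataset config (the codebase family UniTR builds on), this usually amounts to pointing the info files and the gt-sampling database at trainval versions. The sketch below is hypothetical: the exact keys follow OpenPCDet conventions, but the specific `.pkl` file names are assumptions and must match whatever your info-generation step actually produces.

```yaml
# Hypothetical sketch of the relevant dataset-config changes (file names are assumptions):
DATA_CONFIG:
    INFO_PATH: {
        # train on train+val infos instead of train-only infos
        'train': [nuscenes_infos_10sweeps_trainval.pkl],
        'test': [nuscenes_infos_10sweeps_test.pkl],
    }
    DATA_AUGMENTOR:
        AUG_CONFIG_LIST:
            - NAME: gt_sampling
              # gt database regenerated so it also contains validation-set objects
              DB_INFO_PATH: [nuscenes_dbinfos_10sweeps_trainval.pkl]
```

Note that the gt database and info files must be regenerated over train+val before training; simply renaming the paths is not enough.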

Haiyang-W commented 9 months ago

Yes, this is a common trick used in Multi-modal 3D Detection, such as BEVFusion.

lihuashengmax commented 9 months ago

Got it, thanks. I'll give it a try.

LinuxCup commented 6 months ago

@lihuashengmax Hello! As discussed above, did adding the validation set improve performance? Thanks!

lihuashengmax commented 3 months ago

> @lihuashengmax Hello! As discussed above, did adding the validation set improve performance? Thanks!

There was some improvement, but it still did not reach the paper's test-set numbers.