Megvii-BaseDetection / BEVDepth

Official code for BEVDepth.
MIT License

MatrixVT exps #137

Open GoldenAbel opened 1 year ago

GoldenAbel commented 1 year ago

Hi, I replaced "bev_depth_lss_r50_256x704_128x128_20e_cbgs_2key_da_ema.py" with "MatrixVT". After 32 epochs of training, I got AP 0.3243 and NDS 0.4091 on the val dataset.

What can I do to improve this? Thank you.

ZRandomize commented 1 year ago

Depth aggregation is not supported in the MatrixVT backbone, so setting self.backbone_conf['use_da'] = True will not work.

ZRandomize commented 1 year ago

Besides, multi-frame (2key) is currently not supported in MatrixVT: the sweep index is set to 0 during view transformation (VT). See here. You can add multi-frame support to this file.

chickenta2ta commented 1 year ago

@ZRandomize Hi, thank you for the great work on MatrixVT. I would like to add multi-frame (2key) support to MatrixVT in order to reproduce the results of the MatrixVT paper, but I'm not sure how to do it. As I understand it, multi-frame (2key) creates two BEV features (one for the current frame and one for the previous frame), concatenates them, and then passes the concatenated feature to the detector. Is my understanding right?

ZRandomize commented 1 year ago

Yes, indeed. BTW, there are also RNN-based methods like VideoBEV.
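
For readers following along, the two-frame scheme confirmed above could be sketched roughly as follows. This is an illustration only, not code from this repository: view_transform is a hypothetical stand-in for a single-frame view transformer, the tensor layout (batch, sweeps, cameras, channels, height, width) is an assumption, and NumPy is used just to keep the sketch self-contained (in PyTorch the last step would be torch.cat along dim=1).

```python
import numpy as np


def view_transform(img_feats: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a single-frame view transformer.

    Input:  image features of shape (B, N_cams, C, H, W) for ONE sweep.
    Output: a BEV-like feature of shape (B, C, H, W).
    Averaging over cameras is a placeholder so the shapes are realistic;
    it is not how MatrixVT projects features.
    """
    return img_feats.mean(axis=1)


def multi_frame_bev(sweep_feats: np.ndarray) -> np.ndarray:
    """Build one BEV feature per sweep and concatenate along channels.

    sweep_feats: (B, num_sweeps, N_cams, C, H, W), with sweep 0 assumed
    to be the current key frame and sweep 1 the previous key frame.
    Returns an array of shape (B, num_sweeps * C, H, W).
    """
    num_sweeps = sweep_feats.shape[1]
    # Run the view transform for every sweep index instead of only index 0.
    bev_per_sweep = [view_transform(sweep_feats[:, i]) for i in range(num_sweeps)]
    # Channel-wise concatenation; the detector head must accept the
    # enlarged channel count.
    return np.concatenate(bev_per_sweep, axis=1)
```

With two sweeps the channel dimension doubles, so whatever consumes the concatenated feature needs its input channel count adjusted to match.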

chickenta2ta commented 1 year ago

@ZRandomize Thank you for your quick response! I'll check VideoBEV, too.