[ECCV'24] Online Vectorized HD Map Construction Using Geometry
[Zhixin Zhang](https://github.com/cnzzx)1, [Yiyuan Zhang](https://invictus717.github.io/)2, [Xiaohan Ding](https://dingxiaohan.xyz/)3, [Fusheng Jin](https://cs.bit.edu.cn/szdw/jsml/fjs/jfs/index.htm)1\*, [Xiangyu Yue](http://xyue.io/)2
1Beijing Institute of Technology,
2CUHK,
3Tencent AI Lab
[Website](https://invictus717.github.io/GeMap/) | [arXiv](https://arxiv.org/abs/2312.03341) | [YouTube](https://www.youtube.com/watch?v=dU4XN4GQ1y4) | [Bilibili](https://www.bilibili.com/video/BV1qN4y1e7hL/?vd_source=96a766e4a548cf05b04bf247d9824a01) | [Zhihu](https://zhuanlan.zhihu.com/p/671139382)
News
We're working on more powerful and efficient models; please stay tuned.
- (2024/7/2) GeMap was accepted to ECCV 2024, and we released a new GeMap model with 76.0 mAP.
- (2023/12/7) We released the first version of GeMap (with pre-trained checkpoints and evaluation).
- (2023/12/7) GeMap is released on arXiv.
Motivation
- Recent efforts have built strong baselines for the online vectorized HD map construction task. However, the shapes and relations of instances in urban road systems, such as parallelism, perpendicularity, and rectangular shapes, remain under-explored.
- As the ego vehicle moves, the shape of a specific instance or the relations between two instances will remain unchanged. To accurately represent such geometric features, invariance to rigid transformation is a fundamental property.
Highlights
This work contributes from two perspectives:
- GeMap achieves new state-of-the-art performance on the NuScenes and Argoverse 2 datasets. Remarkably, it reaches a 71.8% mAP on the large-scale Argoverse 2 dataset, outperforming MapTR V2 by +4.4% and surpassing the 70% mAP threshold for the first time.
- GeMap end-to-end learns Euclidean shapes and relations of map instances beyond basic perception. Specifically, we design a geometric loss based on angle and distance clues, which is robust to rigid transformations. We also decouple self-attention to independently handle Euclidean shapes and relations.
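To make the rigid-transformation property concrete, here is a minimal sketch in plain Python. It is not the GeMap implementation: the function names (`edge_lengths`, `turning_angles`, `geometric_l1`) and the exact form of the loss are assumptions chosen for illustration. It only demonstrates that quantities built from distances and angles stay the same when a map instance is rotated and translated, while a plain coordinate-wise comparison does not.

```python
import math

def edge_lengths(points):
    """Lengths of consecutive segments of a polyline given as (x, y) tuples."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def turning_angles(points):
    """Signed angle (radians) between each pair of consecutive segments."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        ux, uy = b[0] - a[0], b[1] - a[1]
        vx, vy = c[0] - b[0], c[1] - b[1]
        cross = ux * vy - uy * vx
        dot = ux * vx + uy * vy
        angles.append(math.atan2(cross, dot))
    return angles

def rigid_transform(points, theta, tx, ty):
    """Rotate by theta (radians) about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def geometric_l1(pred, target):
    """Hypothetical loss: mean absolute gap in edge lengths plus turning angles.

    Angle wrap-around near +/-pi is ignored to keep the sketch short.
    """
    d = [abs(p - t) for p, t in zip(edge_lengths(pred), edge_lengths(target))]
    a = [abs(p - t) for p, t in zip(turning_angles(pred), turning_angles(target))]
    return sum(d) / len(d) + sum(a) / len(a)

# A 4 m x 2 m rectangular instance (e.g. a crosswalk outline) in the ego frame.
crosswalk = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]

# The same instance seen after the ego vehicle has turned and moved.
moved = rigid_transform(crosswalk, theta=0.7, tx=5.0, ty=-3.0)

# Distance- and angle-based comparison: unaffected by the rigid transformation.
shape_gap = geometric_l1(moved, crosswalk)

# Coordinate-based comparison: changes a lot although the shape is identical.
coord_gap = sum(math.dist(p, q) for p, q in zip(moved, crosswalk)) / len(crosswalk)

print(f"shape gap: {shape_gap:.2e}, coordinate gap: {coord_gap:.2f}")
```

The shape gap is zero up to floating-point error, whereas the coordinate gap is several meters. This is the reason angle and distance clues are a natural basis for a loss that should not depend on the ego pose.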
Quantitative Results
NuScenes
| Model | Objective | Backbone | Epoch | mAP | FPS | Config | Checkpoint |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| GeMap | simple | R50 | 110 | 62.7 | 15.6 | config | model |
| GeMap | simple | Camera(R50) & LiDAR(SEC) | 110 | 66.5 | 6.8 | config | model |
| GeMap | full | R50 | 110 | 69.4 | 13.3 | config | model |
| GeMap | full | Swin-T | 110 | 72.0 | 10.0 | config | model |
| GeMap | full | V2-99 | 110 | 72.2 | 9.5 | config | model |
| GeMap | full | V2-99(DD3D) | 110 | 76.0 | 9.5 | config | model |
Argoverse 2
| Model | Objective | Backbone | Epoch | mAP | FPS | Config | Checkpoint |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| GeMap | simple | R50 | 6 | 63.9 | 13.5 | config | model |
| GeMap | simple | R50 | 24 | 68.2 | 13.5 | config | model |
| GeMap | full | R50 | 24 | 71.8 | 12.1 | config | model |
* All models are trained on 8 NVIDIA RTX 3090 GPUs. The speed (Frames Per Second, FPS) is evaluated on a single RTX 3090 GPU.
Visualization Results
Comparison Video
GeMap exhibits more robust predictions in occluded and rotated scenarios, especially under rainy weather conditions.
More Cases of GeMap
Getting Started
TODO
- [ ] Faster implementation for inference of GeMap.
- [ ] More powerful LiDAR and Camera + LiDAR models.
- [ ] Lighter and faster models with 30+ FPS.
Acknowledgements
GeMap is based on mmdetection3d. It is also greatly inspired by the following outstanding contributions to the open-source community: LSS, GKT, Swin-Transformer, VoVNet, BEVFormer, MapTR, BeMapNet, HDMapNet.
Citation
If the paper and code help your research, please kindly cite:
@article{zhang2023online,
title={Online Vectorized HD Map Construction using Geometry},
author={Zhang, Zhixin and Zhang, Yiyuan and Ding, Xiaohan and Jin, Fusheng and Yue, Xiangyu},
journal={arXiv preprint arXiv:2312.03341},
year={2023}
}