hustvl / MapTR

[ICLR'23 Spotlight & IJCV'24] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
MIT License

MapTR

An End-to-End Framework for Online Vectorized HD Map Construction

[Bencheng Liao](https://github.com/LegendBC)<sup>1,2,3</sup> \*, [Shaoyu Chen](https://scholar.google.com/citations?user=PIeNN2gAAAAJ&hl=en&oi=sra)<sup>1,3</sup> \*, [Yunchi Zhang](https://github.com/zyc10ud)<sup>1,3</sup>, [Bo Jiang](https://github.com/rb93dett)<sup>1,3</sup>, [Tianheng Cheng](https://scholar.google.com/citations?user=PH8rJHYAAAAJ&hl=zh-CN)<sup>1,3</sup>, [Qian Zhang](https://scholar.google.com/citations?user=pCY-bikAAAAJ&hl=zh-CN)<sup>3</sup>, [Wenyu Liu](http://eic.hust.edu.cn/professor/liuwenyu/)<sup>1</sup>, [Chang Huang](https://scholar.google.com/citations?user=IyyEKyIAAAAJ&hl=zh-CN)<sup>3</sup>, [Xinggang Wang](https://xwcv.github.io)<sup>1</sup> :email:

<sup>1</sup> School of EIC, HUST, <sup>2</sup> Institute of Artificial Intelligence, HUST, <sup>3</sup> Horizon Robotics

(\*) equal contribution, (:email:) corresponding author.

ArXiv Preprint ([arXiv 2208.14437](https://arxiv.org/abs/2208.14437)), [openreview ICLR'23](https://openreview.net/forum?id=k7p_YAO7yE), accepted as **ICLR Spotlight**

Extended ArXiv Preprint MapTRv2 ([arXiv 2308.05736](https://arxiv.org/abs/2308.05736)), accepted to [**IJCV 2024**](https://link.springer.com/article/10.1007/s11263-024-02235-z)


News

Introduction

MapTR/MapTRv2 is a simple, fast and strong online vectorized HD map construction framework.


High-definition (HD) maps provide abundant and precise static environmental information about the driving scene and serve as a fundamental, indispensable component for planning in autonomous driving systems. In this paper, we present Map TRansformer (MapTR), an end-to-end framework for online vectorized HD map construction. We propose a unified permutation-equivalent modeling approach, i.e., modeling each map element as a point set with a group of equivalent permutations, which accurately describes the shape of the map element and stabilizes the learning process. We design a hierarchical query embedding scheme to flexibly encode structured map information and perform hierarchical bipartite matching for map element learning. To speed up convergence, we further introduce auxiliary one-to-many matching and dense supervision. The proposed method copes well with map elements of arbitrary shape. It runs at real-time inference speed and achieves state-of-the-art performance on both the nuScenes and Argoverse2 datasets. Abundant qualitative results show stable and robust map construction quality in complex and varied driving scenes.
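The permutation-equivalent modeling described above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names (`equivalent_permutations`, `l1_distance`, `point_level_match`) are hypothetical, and the L1 point distance is an assumption chosen for illustration. The idea shown is that an open polyline has two equivalent point orderings (forward and reversed), a closed polygon with N points has 2N (every start point, both directions), and a prediction is compared against whichever ground-truth ordering is closest.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def equivalent_permutations(points: Sequence[Point], closed: bool) -> List[List[Point]]:
    """Enumerate orderings of `points` that describe the same map element.

    Open polyline: 2 orderings (forward and reversed).
    Closed polygon: every cyclic shift of the start point, in both
    directions, i.e. 2 * N orderings for N points.
    """
    pts = list(points)
    if not closed:
        return [pts, pts[::-1]]
    perms = []
    for start in range(len(pts)):
        shifted = pts[start:] + pts[:start]
        perms.append(shifted)
        perms.append(shifted[::-1])
    return perms


def l1_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Sum of per-point L1 distances between two equally long sequences."""
    return sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in zip(a, b))


def point_level_match(pred: Sequence[Point], gt: Sequence[Point], closed: bool):
    """Return (cost, ordering) for the ground-truth ordering closest to `pred`."""
    candidates = ((l1_distance(pred, perm), perm) for perm in equivalent_permutations(gt, closed))
    return min(candidates, key=lambda pair: pair[0])


# A lane divider predicted in the opposite direction to the annotation.
gt_divider = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred_reversed = [(2.0, 0.1), (1.0, 0.1), (0.0, 0.1)]

naive_cost = l1_distance(pred_reversed, gt_divider)
cost, ordering = point_level_match(pred_reversed, gt_divider, closed=False)
print(naive_cost, cost, ordering)
```

With a single fixed ordering, the reversed prediction is penalized heavily even though it traces the same divider. Taking the minimum over the equivalent orderings removes that ambiguity, which is the property the abstract credits with stabilizing learning.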

Models

Results from the MapTRv2 paper


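| Method | Backbone | Lr Schd | mAP | FPS |
| :---: | :---: | :---: | :---: | :---: |
| MapTR | R18 | 110ep | 45.9 | 35.0 |
| MapTR | R50 | 24ep | 50.3 | 15.1 |
| MapTR | R50 | 110ep | 58.7 | 15.1 |
| MapTRv2 | R18 | 110ep | 52.3 | 33.7 |
| MapTRv2 | R50 | 24ep | 61.5 | 14.1 |
| MapTRv2 | R50 | 110ep | 68.7 | 14.1 |
| MapTRv2 | V2-99 | 110ep | 73.4 | 9.9 |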

Notes:

Results from this repo.

MapTR

nuScenes dataset

| Method | Backbone | BEVEncoder | Lr Schd | mAP | FPS | Memory | Config | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MapTR-nano | R18 | GKT | 110ep | 46.3 | 35.0 | 11907M (bs 24) | config | model / log |
| MapTR-tiny | R50 | GKT | 24ep | 50.0 | 15.1 | 10287M (bs 4) | config | model / log |
| MapTR-tiny | R50 | GKT | 110ep | 59.3 | 15.1 | 10287M (bs 4) | config | model / log |
| MapTR-tiny | Camera & LiDAR | GKT | 24ep | 62.7 | 6.0 | 11858M (bs 4) | config | model / log |
| MapTR-tiny | R50 | bevpool | 24ep | 50.1 | 14.7 | 9817M (bs 4) | config | model / log |
| MapTR-tiny | R50 | bevformer | 24ep | 48.7 | 15.0 | 10219M (bs 4) | config | model / log |
| MapTR-tiny+ | R50 | GKT | 24ep | 51.3 | 15.1 | 15158M (bs 4) | config | model / log |
| MapTR-tiny+ | R50 | bevformer | 24ep | 53.3 | 15.0 | 15087M (bs 4) | config | model / log |

Notes:

MapTRv2

Please run `git checkout maptrv2` and follow the installation instructions on that branch to use the following checkpoints.

nuScenes dataset

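| Method | Backbone | BEVEncoder | Lr Schd | mAP | FPS | Memory | Config | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MapTRv2 | R50 | bevpool | 24ep | 61.4 | 14.1 | 19426M (bs 24) | config | model / log |
| MapTRv2* | R50 | bevpool | 24ep | 54.3 | WIP | 20363M (bs 24) | config | model / log |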

Argoverse2 dataset

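| Method | Backbone | BEVEncoder | Lr Schd | mAP | FPS | Memory | Config | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MapTRv2 | R50 | bevpool | 6ep | 64.3 | 14.1 | 20580M (bs 24) | config | model / log |
| MapTRv2* | R50 | bevpool | 6ep | 61.3 | WIP | 21515M (bs 24) | config | model / log |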

Notes:

Qualitative results on nuScenes val split and Argoverse2 val split

MapTR/MapTRv2 maintains stable and robust map construction quality in various driving scenes.


MapTRv2 on whole nuScenes val split

Youtube

MapTRv2 on whole Argoverse2 val split

Youtube

End-to-end Planning based on MapTR

https://user-images.githubusercontent.com/26790424/229679664-0e9ba5e8-bf2c-45e0-abbc-36d840ee5cc9.mp4

Getting Started

Catalog

Acknowledgements

MapTR is based on mmdetection3d. It is also greatly inspired by the following outstanding contributions to the open-source community: BEVFusion, BEVFormer, HDMapNet, GKT, VectorMapNet.

Citation

If you find MapTR useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entries.

@article{liao2024maptrv2,
  title={Maptrv2: An end-to-end framework for online vectorized hd map construction},
  author={Liao, Bencheng and Chen, Shaoyu and Zhang, Yunchi and Jiang, Bo and Zhang, Qian and Liu, Wenyu and Huang, Chang and Wang, Xinggang},
  journal={International Journal of Computer Vision},
  pages={1--23},
  year={2024},
  publisher={Springer}
}
@inproceedings{MapTR,
  title={MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction},
  author={Liao, Bencheng and Chen, Shaoyu and Wang, Xinggang and Cheng, Tianheng and Zhang, Qian and Liu, Wenyu and Huang, Chang},
  booktitle={International Conference on Learning Representations},
  year={2023}
}
@inproceedings{liao2025lane,
  title={Lane graph as path: Continuity-preserving path-wise modeling for online lane graph construction},
  author={Liao, Bencheng and Chen, Shaoyu and Jiang, Bo and Cheng, Tianheng and Zhang, Qian and Liu, Wenyu and Huang, Chang and Wang, Xinggang},
  booktitle={European Conference on Computer Vision},
  pages={334--351},
  year={2024},
  organization={Springer}
}