johnwlambert / tbv

Official Repo of NeurIPS '21: "Trust, but Verify: Cross-Modality Fusion for HD Map Change Detection"

Can data preprocessing be accelerated? #10

Closed — qiaozhijian closed this issue 2 years ago

qiaozhijian commented 2 years ago

Thanks for your great work. While model inference is fast, the map and semantic image rendering is very slow; preprocessing the entire dataset cannot be completed even in several days. Will the data preprocessing code be further optimized?

johnwlambert commented 2 years ago

Thanks for your interest in the work. The map and semantic image rendering can certainly be accelerated in many ways; the rendering code is currently largely unoptimized (except for the CUDA ray-tracing routines). I welcome pull requests or contributions towards the following efforts:

  1. BEV image rendering: I kept the resolution at 0.02 meters/pixel in the BEV rendering configs. With a range of 20 meters in all directions, this yields 40/0.02 = 2000 x 2000 pixel images. These high-res images were useful for data annotation, but are not needed for training, since we resize to 234 x 234 px anyways (see the config and transform code). By coarsening the resolution from 0.02 to something like 0.17 meters/px, you would get ~235 x 235 images directly from rendering, and likely a ~10x speedup in the slow interpolation step of the rendering (see the sketch after this list).
  2. Ego-view image rendering: This code could be ported to OpenGL to make it run very fast.
  3. Map augmentations: This code logic could be ported from Python to C++ and wrapped with pybind11, to make it run very fast.
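
To illustrate point 1, here is a back-of-the-envelope sketch of the resolution trade-off. The helper and variable names below are illustrative only and do not correspond to the actual tbv config keys:

```python
"""Back-of-the-envelope check of the BEV rendering resolution trade-off.

Assumes a square BEV crop spanning `range_m` meters in each direction from
the egovehicle. `bev_image_dim` is a hypothetical helper, not part of tbv.
"""

def bev_image_dim(range_m: float, res_m_per_px: float) -> int:
    """Side length (in pixels) of a BEV image covering [-range_m, +range_m]."""
    return int(round(2 * range_m / res_m_per_px))

range_m = 20.0  # 20 meters in all directions -> 40 m x 40 m crop

# Current high-resolution setting, used for data annotation.
hi_res = 0.02  # meters per pixel
print(bev_image_dim(range_m, hi_res))  # 2000 -> 2000 x 2000 px

# Coarser setting that roughly matches the ~234 px training input.
lo_res = 0.17  # meters per pixel
print(bev_image_dim(range_m, lo_res))  # ~235 -> ~235 x 235 px

# How many fewer pixels must be filled per rendered image (note this is a
# pixel-count ratio, not a wall-clock speedup estimate).
ratio = bev_image_dim(range_m, hi_res) ** 2 / bev_image_dim(range_m, lo_res) ** 2
print(f"~{ratio:.0f}x fewer pixels to interpolate per image")
```

Rendering directly at roughly the training resolution would avoid generating, and then downsampling, a 4-megapixel image per frame.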
qiaozhijian commented 2 years ago

Thank you for your suggestions!