autonomousvision / neat

[ICCV'21] NEAT: Neural Attention Fields for End-to-End Autonomous Driving
MIT License

Irregular shape of BEV #13

Closed UditSinghParihar closed 1 year ago

UditSinghParihar commented 1 year ago

Hi @kashyap7x, thanks for your wonderful work on NEAT. I would like to ask about the blobby BEV output of NEAT compared to other BEV methods such as LSS or FIERY, where the BEV of vehicles is quite sharp. Could you give some pointers on how we can sharpen the irregular vehicle shapes in NEAT's BEV?

Thanks

kashyap7x commented 1 year ago

Hello!

Since the BEV segmentation was an auxiliary task, we did not focus on improving its sharpness in the paper. Some ideas which I think could help include using multi-resolution feature grids instead of simple feature vectors for conditioning, a better initialization of the attention weights than uniform (using information about the camera matrix), and removing future points from the semantics to simplify the task. Another simple thing to try would be tuning the weights of the semantic loss, and maybe setting class-specific weights.
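To make the last suggestion concrete, here is a minimal, dependency-free sketch of a class-weighted cross-entropy together with a scalar weight on the semantic term. It is not taken from the NEAT codebase: the class order, the weight values, and the names `weighted_cross_entropy`, `total_loss`, `CLASS_WEIGHTS`, and `SEMANTIC_LOSS_WEIGHT` are all assumptions for illustration. In a PyTorch training loop the same effect should be available through the `weight` argument of `torch.nn.functional.cross_entropy`.

```python
import math

# Hypothetical class order and weights (not from the NEAT repo):
# index 0 = background, 1 = road, 2 = vehicle. Up-weighting the vehicle
# class makes errors on vehicle points cost more.
CLASS_WEIGHTS = [1.0, 1.0, 4.0]

# Hypothetical scalar balancing the auxiliary semantic loss against the
# main driving (waypoint) loss.
SEMANTIC_LOSS_WEIGHT = 0.5


def weighted_cross_entropy(logits, targets, class_weights):
    """Class-weighted cross-entropy over a batch of query points.

    logits: list of per-point lists of raw class scores
    targets: list of integer class indices, one per point
    class_weights: list with one weight per class
    Returns sum_i(w[y_i] * nll_i) / sum_i(w[y_i]).
    """
    total, norm = 0.0, 0.0
    for scores, y in zip(logits, targets):
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        nll = log_z - scores[y]  # negative log-likelihood of the true class
        w = class_weights[y]
        total += w * nll
        norm += w
    return total / norm


def total_loss(waypoint_loss, semantic_loss, semantic_weight=SEMANTIC_LOSS_WEIGHT):
    """Combine the main loss with the weighted auxiliary semantic loss."""
    return waypoint_loss + semantic_weight * semantic_loss


# Two points predicted as background; the second is actually a vehicle.
logits = [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
targets = [0, 2]
uniform = weighted_cross_entropy(logits, targets, [1.0, 1.0, 1.0])
weighted = weighted_cross_entropy(logits, targets, CLASS_WEIGHTS)
print(uniform, weighted)  # the weighted loss is larger: the vehicle miss dominates
```

Both the per-class weights and the semantic loss weight would need tuning on validation data; the values above are placeholders only.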

In addition, implicit semantic segmentation has progressed significantly since early 2021, and some ideas from current state-of-the-art papers could be incorporated to sharpen the NEAT semantics, e.g.,

  1. Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
  2. Semantic Scene Completion using Local Deep Implicit Functions on LiDAR Data