XuyangBai / TransFusion

[PyTorch] Official implementation of CVPR2022 paper "TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers". https://arxiv.org/abs/2203.11496
Apache License 2.0

Question about degenerated image quality #29

zhulf0804 closed 2 years ago

zhulf0804 commented 2 years ago

Hello Xuyang, Thanks for your great work and open-source code.

Regarding the experiments on degraded image quality, TransFusion works well, but I have two questions. (1) What about padding with values other than 0: does that have a greater effect on TransFusion's performance? (2) How is TransFusion affected when the image quality is very bad (for example, no useful information because of a lighting failure at night)? What should the attention map (Figure 3) look like in that case?

Looking forward to your reply.

XuyangBai commented 2 years ago

Hi, Thanks for your interest.

1) We have tried both zero-padding and random-padding; in both cases TransFusion works well.
2) Note that the network does not receive any information about the image quality; it relies on attention to determine dynamically whether the image information is useful. The zero-padding (to simulate a missing camera) and random-padding experiments are good examples of such situations. The attention map is expected to be sparse, with low values for all image pixels.
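For readers who want to reproduce this kind of test, here is a minimal sketch of how zero-padding and random-padding of camera inputs could be simulated before the images reach a detector. It is not the code used in the TransFusion experiments. The function name `degrade_images`, the `(num_views, H, W, 3)` layout, the `[0, 255]` pixel range, and applying the replacement before any normalization are all assumptions made for illustration.

```python
import numpy as np


def degrade_images(images, mode="zero", drop_mask=None, rng=None):
    """Return a copy of `images` with selected camera views replaced.

    Hypothetical helper, not part of the TransFusion repository.

    images:    array of shape (num_views, H, W, 3), assumed pixel range [0, 255]
    mode:      "zero" replaces a view with zeros, "random" with uniform noise
    drop_mask: boolean array of shape (num_views,); True marks a view to replace.
               If None, every view is replaced.
    rng:       optional numpy Generator, used only when mode == "random"
    """
    images = np.asarray(images, dtype=np.float32)
    num_views = images.shape[0]

    if drop_mask is None:
        drop_mask = np.ones(num_views, dtype=bool)
    drop_mask = np.asarray(drop_mask, dtype=bool)
    if drop_mask.shape != (num_views,):
        raise ValueError("drop_mask must have shape (num_views,)")

    out = images.copy()  # leave the caller's array untouched
    if mode == "zero":
        out[drop_mask] = 0.0
    elif mode == "random":
        rng = np.random.default_rng() if rng is None else rng
        noise = rng.uniform(0.0, 255.0, size=images.shape).astype(np.float32)
        out[drop_mask] = noise[drop_mask]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return out


# Example: six mid-gray camera views; drop the first and fourth.
views = np.full((6, 4, 5, 3), 128.0, dtype=np.float32)
mask = np.array([True, False, False, True, False, False])
zeroed = degrade_images(views, mode="zero", drop_mask=mask)
noisy = degrade_images(views, mode="random", drop_mask=mask,
                       rng=np.random.default_rng(0))
```

Feeding `zeroed` or `noisy` to a trained model in place of the real images, and comparing detection metrics against the clean run, would approximate the robustness check described above. Whether the replacement should happen before or after image normalization depends on the data pipeline in use.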

If you are interested in robustness to image quality, here is a related paper: "Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection", which proposes another noisy camera data case (camera lens occlusions).

zhulf0804 commented 2 years ago

Thanks for your patient reply, which helped me understand the results.