abhi1kumar / DEVIANT

[ECCV 2022] Official PyTorch Code of DEVIANT: Depth Equivariant Network for Monocular 3D Object Detection
https://arxiv.org/abs/2207.10758
MIT License

Local vs Global Orientation #26

Closed: xudh1991 closed this issue 9 months ago

xudh1991 commented 10 months ago

I'm very sorry, but I have come across another point that I don't quite understand and would like to ask for advice. The KITTI dataset is already annotated with alpha, so why obtain alpha by converting from ry? I also found through testing that the converted alpha and the annotated alpha are not exactly the same. I really don't understand why everyone does it this way in practice. I hope to receive some guidance. Thank you very much.

[Image: WeChat screenshot 20240112173437]

abhi1kumar commented 9 months ago

Hi @xudh1991, thank you again for your interest in DEVIANT.

> Why obtain alpha through ry conversion.

To the best of my knowledge, the idea of regressing local orientation (alpha) instead of yaw / global orientation (ry) comes from the paper 3D Bounding Box Estimation Using Deep Learning and Geometry, Mousavian et al., CVPR 2017. Fig. 4 of that paper illustrates the motivation for regressing local rather than global orientation, using the example of a car moving on a highway:

[Image: screenshot of Fig. 4 from Mousavian et al., CVPR 2017]

The cropped car appears to rotate in the frontal view as it moves away, even though its yaw stays the same. This apparent rotation creates issues for frontal-view detectors, since the detector has to output the same yaw even though the input crops differ. Using local orientation alleviates this problem: the network regresses a different local orientation for each of these crops, and one later converts local to global orientation.
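As a rough illustration of the conversion mentioned above, here is a minimal sketch in Python. It assumes the commonly used KITTI relation `ry = alpha + arctan2(x, z)`, where `(x, z)` is the object center in camera coordinates. That formula, the function names, and the sample values are assumptions for illustration only; they are not taken from this thread and are not necessarily how DEVIANT implements the conversion.

```python
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def ry_to_alpha(ry, x, z):
    """Global yaw -> local orientation (assumed relation: alpha = ry - atan2(x, z))."""
    return wrap_to_pi(ry - math.atan2(x, z))


def alpha_to_ry(alpha, x, z):
    """Local orientation -> global yaw (inverse of the assumed relation above)."""
    return wrap_to_pi(alpha + math.atan2(x, z))


# Hypothetical example: one fixed yaw, three lateral positions at the same depth.
ry = -math.pi / 2
for x in (-10.0, 0.0, 10.0):
    alpha = ry_to_alpha(ry, x, z=20.0)
    recovered = alpha_to_ry(alpha, x, z=20.0)
    print(f"x={x:+.1f}  alpha={alpha:+.4f}  recovered ry={recovered:+.4f}")
```

Under this assumed relation, the same yaw yields a different alpha at each lateral position, which matches the point that the network sees different crops and regresses different local orientations, while the global yaw is recovered afterwards from the object's position.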

xudh1991 commented 9 months ago

Thank you very much for your answer. It has been very helpful to me.