ZhanYang-nwpu / Mono3DVG

[AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images

How to derive 3D Bounding Box in TransVG + backproj methods in your paper? #11

Closed yfthu closed 3 months ago

yfthu commented 3 months ago

Hello, thank you for your work. In your paper, you provide many 2D grounding + backproj methods as baselines, such as ReSC + backproj and TransVG + backproj. I would like to know how the 3D bounding boxes are derived in these 2D grounding + backproj methods. Do you predict depth in these methods? What depth do you use when you conduct the back projection? Thank you very much!

yfthu commented 3 months ago

@ZhanYang-nwpu

ZhanYang-nwpu commented 3 months ago


The back projection follows the ScanRefer paper. This operation uses depth information to project 2D bounding boxes into 3D space. The depth information comes directly from the ground truth in the KITTI point cloud data.
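For concreteness, below is a minimal sketch of what such a back projection can look like: a 2D box center plus a metric depth value (e.g. taken from the ground-truth LiDAR points that fall inside the box) is lifted into 3D camera coordinates with the pinhole model. The function name and the intrinsic values are illustrative, not taken from the Mono3DVG or ScanRefer code.

```python
import numpy as np

def backproject_2d_box(box_2d, depth, K):
    """Back-project a 2D bounding box center into 3D camera coordinates.

    box_2d : (x1, y1, x2, y2) in pixels
    depth  : metric depth for the box, e.g. from LiDAR points inside it
    K      : 3x3 camera intrinsic matrix (e.g. KITTI calib P2[:3, :3])
    """
    x1, y1, x2, y2 = box_2d
    u = (x1 + x2) / 2.0          # box center, pixel coordinates
    v = (y1 + y2) / 2.0

    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # pinhole back-projection: pixel + depth -> 3D point in the camera frame
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Example with KITTI-like intrinsics (hypothetical values)
K = np.array([[721.5, 0.0, 609.6],
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])
center_3d = backproject_2d_box((300, 150, 400, 250), depth=12.3, K=K)
print(center_3d)
```

This only recovers the 3D box center; the box dimensions and orientation would still have to be estimated separately, for example from the point-cloud points falling inside the 2D box.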

yfthu commented 3 months ago


Thank you very much for your answer!