ch3cook-fdu / Vote2Cap-DETR

[CVPR 2023] Vote2Cap-DETR and [T-PAMI 2024] Vote2Cap-DETR++; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning methods
MIT License

There is no 'detector_Vote2Cap_DETRv2' in current version #10

Closed: jkstyle2 closed this issue 1 month ago

jkstyle2 commented 4 months ago

According to demo.sh, there seems to be a new version of the detector, 'detector_Vote2Cap_DETRv2', but it does not seem to exist in the current repository. Could you tell me what the new version is, and whether you are planning to add it?

ch3cook-fdu commented 4 months ago

Thanks for your interest in our projects. The code should be available now.

jkstyle2 commented 4 months ago

Thanks for your great support! I also wonder whether there is any debugging code to visualize the 3D boxes and captioning results together with the G.T. boxes and G.T. captions, as below. [image] [image]

I have difficulty understanding what the output JSON files refer to. Do you have such debugging code, or should I look into the ScanRefer/Scan2Cap projects?

ch3cook-fdu commented 4 months ago

Actually, we use the code from https://github.com/ch3cook-fdu/3d-pc-box-viz to generate .ply files for each instance prediction (each saved with its caption as the filename), and visualize the bounding boxes and the 3D mesh with MeshLab.

The captions are manually assigned to each instance with PowerPoint.

jkstyle2 commented 4 months ago

Would you mind sharing how to generate the .ply files with the code at https://github.com/ch3cook-fdu/3d-pc-box-viz? I was able to generate a JSON file as below, but I have no idea how to use it. The 3d-pc-box-viz project implements many functions in o3d_helper and ply_helper. Which function should I apply?

{
  "estimated_object_id": 30,
  "caption": "sos there is a rectangular door . it is to the right of the room . eos",
  "box": [
    [0.8843369483947754, 3.859482765197754, -0.027031898498535156],
    [0.8843369483947754, 3.5344505310058594, -0.027031898498535156],
    [-0.5504608154296875, 3.5344505310058594, -0.027031898498535156],
    [-0.5504608154296875, 3.859482765197754, -0.027031898498535156],
    [0.8843369483947754, 3.859482765197754, 2.1553826332092285],
    [0.8843369483947754, 3.5344505310058594, 2.1553826332092285],
    [-0.5504608154296875, 3.5344505310058594, 2.1553826332092285],
    [-0.5504608154296875, 3.859482765197754, 2.1553826332092285]
  ],
  "sem_prob": [
    6.876472014027968e-08, 4.1737038714018126e-08, 5.879649833673284e-08,
    2.1764801072521323e-09, 1.6125123458721191e-09, 0.914192259311676,
    0.0074754999950528145, 1.50646698671153e-07, 2.055813439483245e-07,
    6.176965883231134e-11, 3.1934419553181215e-09, 2.621668500069063e-05,
    5.901231858729261e-08, 2.292609764253939e-07, 2.2458974957562106e-10,
    1.706238372811697e-09, 1.0863445254472026e-08, 4.163456992500869e-07
  ],
  "obj_prob": [0.07830482721328735, 0.9216951727867126]
},
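For readers trying to interpret an entry like this one, the following is a minimal sketch of how its fields could be read. It is not code from the repository. The function name `summarize_prediction` is hypothetical, the sample values are rounded from the entry above, and treating the last element of `obj_prob` as the "is an object" probability is an assumption. The mapping from the `sem_prob` index to a class name depends on the dataset configuration and is not shown here.

```python
def summarize_prediction(pred):
    """Summarize one prediction entry (a dict shaped like the JSON entry above)."""
    # "box" holds 8 corners, each [x, y, z]; derive an axis-aligned center and size.
    xs, ys, zs = zip(*pred["box"])
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    center = tuple((a + b) / 2 for a, b in zip(lo, hi))
    size = tuple(b - a for a, b in zip(lo, hi))

    # Index of the most probable semantic class.
    sem_prob = pred["sem_prob"]
    class_index = max(range(len(sem_prob)), key=lambda i: sem_prob[i])

    # Drop the "sos"/"eos" marker tokens that wrap the caption.
    words = pred["caption"].split()
    if words and words[0] == "sos":
        words = words[1:]
    if words and words[-1] == "eos":
        words = words[:-1]

    return {
        "center": center,
        "size": size,
        "class_index": class_index,
        "class_prob": sem_prob[class_index],
        # ASSUMPTION: the last entry of obj_prob is the "is an object" probability.
        "objectness": pred["obj_prob"][-1],
        "caption": " ".join(words),
    }


# Sample entry with values rounded from the JSON above.
sample = {
    "estimated_object_id": 30,
    "caption": "sos there is a rectangular door . it is to the right of the room . eos",
    "box": [
        [0.8843, 3.8595, -0.0270], [0.8843, 3.5345, -0.0270],
        [-0.5505, 3.5345, -0.0270], [-0.5505, 3.8595, -0.0270],
        [0.8843, 3.8595, 2.1554], [0.8843, 3.5345, 2.1554],
        [-0.5505, 3.5345, 2.1554], [-0.5505, 3.8595, 2.1554],
    ],
    "sem_prob": [
        6.9e-08, 4.2e-08, 5.9e-08, 2.2e-09, 1.6e-09, 0.9142,
        0.0075, 1.5e-07, 2.1e-07, 6.2e-11, 3.2e-09, 2.6e-05,
        5.9e-08, 2.3e-07, 2.2e-10, 1.7e-09, 1.1e-08, 4.2e-07,
    ],
    "obj_prob": [0.0783, 0.9217],
}

summary = summarize_prediction(sample)
```

The center and size are derived from the minimum and maximum corner coordinates, so the sketch only describes the box correctly when it is axis-aligned, as it is in the entry above.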

ch3cook-fdu commented 4 months ago

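The pseudocode for the visualization is:

from ply_helper import write_bbox  # ply_helper.py from the 3d-pc-box-viz repository

# pred_per_scene: list of prediction dicts for one scene (entries like the JSON above)
# color: box color, in whatever format write_bbox expects
for idx, item in enumerate(pred_per_scene):
    # 'box' is the key used in the JSON above; convert it first if write_bbox expects another box layout
    write_bbox(item['box'], color, str(idx) + '-' + item['caption'] + '.ply')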

Then open the generated .ply files together with the input 3D scene in MeshLab.
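If ply_helper is not at hand, the same idea can be sketched with the standard library alone. The code below is an illustrative stand-in, not the `write_bbox` implementation from 3d-pc-box-viz. The names `box_to_ply_text`, `safe_filename`, and `write_boxes` are hypothetical, and the face list assumes the corner order seen in the JSON entry earlier in this thread (corners 0-3 form the bottom ring, corners 4-7 the top ring, and corner i + 4 sits above corner i). It writes each box as a solid colored mesh in ASCII PLY, which can be switched to a wireframe render mode in MeshLab.

```python
import os
import re

# Quad faces of the box, ASSUMING the corner order described above.
BOX_FACES = [
    (0, 1, 2, 3), (4, 5, 6, 7),                               # bottom, top
    (0, 1, 5, 4), (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7),   # four sides
]


def box_to_ply_text(corners, color=(255, 0, 0)):
    """Return the ASCII PLY text for one box given its 8 corners [x, y, z]."""
    if len(corners) != 8:
        raise ValueError("expected 8 corners, got %d" % len(corners))
    r, g, b = color
    lines = [
        "ply",
        "format ascii 1.0",
        "element vertex 8",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "element face 6",
        "property list uchar int vertex_indices",
        "end_header",
    ]
    for x, y, z in corners:
        lines.append(f"{float(x)} {float(y)} {float(z)} {r} {g} {b}")
    for face in BOX_FACES:
        lines.append("4 " + " ".join(str(i) for i in face))
    return "\n".join(lines) + "\n"


def safe_filename(idx, caption):
    """Build '<idx>-<caption>.ply', replacing non-alphanumeric runs with '_'."""
    stem = re.sub(r"[^A-Za-z0-9]+", "_", caption).strip("_")
    return f"{idx}-{stem}.ply"


def write_boxes(pred_per_scene, out_dir, color=(255, 0, 0)):
    """Write one .ply file per prediction; each entry needs 'box' and 'caption'."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for idx, item in enumerate(pred_per_scene):
        path = os.path.join(out_dir, safe_filename(idx, item["caption"]))
        with open(path, "w") as f:
            f.write(box_to_ply_text(item["box"], color))
        paths.append(path)
    return paths
```

Calling `write_boxes(predictions, "boxes_out")` on the list loaded from the output JSON file would produce one file per predicted instance. Replacing the non-alphanumeric characters keeps the caption readable in the filename while avoiding characters that some file systems reject.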

ch3cook-fdu commented 4 months ago

If you are having further trouble generating the visualization files, please refer to: https://github.com/ch3cook-fdu/Vote2Cap-DETR/issues/11#issuecomment-1962851659 .