Pointcept / OpenIns3D

[ECCV'24] OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
MIT License

Troubleshooting Compatibility Issues in OpenIns3D's lookup.py with ShapeNet Dataset #8

Closed hdddhdd closed 2 months ago

hdddhdd commented 5 months ago

Hello, first of all, thank you for developing a good model.

I'm trying to run this model on ShapeNet data, and I ran into the following problem. (screenshot attached)

I think it's because the main function of lookup.py is not compatible with the ShapeNet dataset. (screenshot attached)

I wonder if you know anything about the solution.

ZheningHuang commented 5 months ago

Hi, we have not tested this on part segmentation benchmarks, but from your screenshot, it seems that you have already generated snap images. Have you generated masks for each part and used a 2D detector to label the snap images?

It would be useful to know about this before providing any guidance.

hdddhdd commented 5 months ago

The snap images were created, but I don't think the masks were created for each part in the result_vis_2d folder, unlike the example image below. Can you tell me where the problem is? Thank you.

(screenshot attached)

hdddhdd commented 5 months ago

Additionally, my snapshot image looks like the one below. (screenshot attached)

ZheningHuang commented 5 months ago

Thanks again for sending the pictures. Hmm, I am slightly confused here:

  1. Our MPM is based on Mask3D, and the pretrained weights are for scene-level object instance segmentation. Segmenting objects into different class-agnostic parts will require modification to the mask module. Do you have any available tools to perform this segmentation?

  2. You also need a 2D model that can segment object parts well, which I doubt ODISE/LISA could accomplish.

  3. Your 2D image also does not have color, which is a bit weird, as ShapeNet 3D meshes do have color.

The key concept of OpenIns3D is to render images, use a more powerful 2D model to learn the segmentation as desired, and then transfer it to 3D with the lookup approach. I suggest you reevaluate if your proposed task can be done in this manner. We have not tested OpenIns3D on part segmentation yet.
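To make the render-and-lookup idea above concrete, here is an illustrative 2D-to-3D label transfer: project scene points into the camera used to render a snap image and read off the 2D mask label at each projected pixel. This is only a sketch of the general technique, not the actual OpenIns3D lookup code — the function name, pinhole camera conventions, and argument shapes are my assumptions.

```python
import numpy as np

def transfer_labels_to_3d(points, mask_2d, K, world_to_cam):
    """Assign each 3D point the label of the 2D mask pixel it projects onto.

    points:       (N, 3) scene points in world coordinates
    mask_2d:      (H, W) integer label image from a 2D segmenter (0 = background)
    K:            (3, 3) camera intrinsics used to render the snap image
    world_to_cam: (4, 4) extrinsic matrix for that snap
    """
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])   # homogeneous coords, (N, 4)
    cam = (world_to_cam @ homo.T).T[:, :3]        # points in the camera frame
    in_front = cam[:, 2] > 0                      # only points ahead of the camera
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                   # perspective divide -> pixels
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    h, w = mask_2d.shape
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels = np.zeros(n, dtype=mask_2d.dtype)     # 0 = unlabeled / background
    labels[valid] = mask_2d[v[valid], u[valid]]
    return labels
```

In practice one would aggregate labels over many snap views (e.g. by majority vote per point), but the single-view projection above is the core of transferring 2D segmentation back to 3D.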

hdddhdd commented 5 months ago

I solved the problem with your help. Thank you.

I have one more question. I executed the code with this command:

```
python zero_shot.py --pcd_path '../makemesh/previous/generated_scene/scene_final_0.ply' --vocab "chair; window; ceiling; picture; floor; lighting; table; cabinet; curtain; plant; shelving; sink; mirror; stairs; counter; stool; bed; sofa; shower; toilet; TV; clothes; bathtub; blinds; board" --dataset mattarport3d
```

and the txt information appears as below:

```
pred_mask/000.txt 5 1.0
pred_mask/003.txt 1 0.7385795263445752
pred_mask/004.txt 7 0.6194960494516476
pred_mask/006.txt 8 0.9272187798968775
pred_mask/008.txt 9 1.0
pred_mask/011.txt 17 1.0000000000000002
pred_mask/012.txt 8 0.5186039635263331
pred_mask/015.txt 6 1.0
pred_mask/018.txt 21 1.0
pred_mask/021.txt 1 1.0
```

I think the format is mask text file name | class number | accuracy. Is this correct?

And if the middle information is the class number, I would like to know what data the class number is based on.

hdddhdd commented 5 months ago

If possible, could you explain the information in this text file?

```
pred_mask/000.txt 5 1.0
pred_mask/003.txt 1 0.7385795263445752
pred_mask/004.txt 7 0.6194960494516476
pred_mask/006.txt 8 0.9272187798968775
pred_mask/008.txt 9 1.0
pred_mask/011.txt 17 1.0000000000000002
pred_mask/012.txt 8 0.5186039635263331
pred_mask/015.txt 6 1.0
pred_mask/018.txt 21 1.0
pred_mask/021.txt 1 1.0
```

ZheningHuang commented 2 months ago

This text was originally the output format for submissions to the ScanNet benchmark. The first column is the binary mask file's location, the second column is the classification index, and the last column is the confidence score.
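Given that per-line layout (mask path, class index, confidence), the file can be parsed with a few lines of Python. This is a minimal sketch; the helper name is hypothetical.

```python
def parse_predictions(text):
    """Parse ScanNet-benchmark-style prediction lines:
    '<relative mask path> <class index> <confidence>'."""
    preds = []
    for line in text.strip().splitlines():
        mask_path, class_idx, score = line.split()
        preds.append((mask_path, int(class_idx), float(score)))
    return preds

example = """pred_mask/000.txt 5 1.0
pred_mask/003.txt 1 0.7385795263445752"""

for path, cls, conf in parse_predictions(example):
    print(path, cls, conf)
```

Each referenced file (e.g. pred_mask/000.txt) then holds the per-point binary mask for that predicted instance.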

We have also just released a newer version of OpenIns3D, which is better designed and easier to use. Best, Zhening