Pointcept / OpenIns3D

[ECCV'24] OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
MIT License

issues with own scenes and vocab #6

Closed. rolfstarke closed this issue 3 weeks ago

rolfstarke commented 4 months ago

Dear

Thank you for the interesting model! I managed to run the test examples, but once I run it on my own scenes or change the vocab, I get incorrect results. What could be the reasons for this?

This is an example where I changed only the vocab for the ScanNet example:

python zero_shot.py --pcd_path 'demo/demo_scene/scannet/scannet_scene1.ply' --vocab "floor; wall; beam; column; window; door; furniture; board"

[Image: openworld_instance_seg_result_4]

Thank you for your time.

ZheningHuang commented 4 months ago

Hi,

I believe the issue with the proposed setting is that the vocabulary you presented contains many vague or generic terms.

For instance, "furniture" and "board" are very generic and could represent many things, making them unfriendly for a 2D detector like ODISE. [For this reason, many generic vocabularies are eliminated in an open-world setting.] I would suggest trying a better selection of vocabulary for testing. Since this scene is from ScanNet, testing it with 17 classes (i.e., removing other furniture) should yield reasonable performance.

Best, Zhening
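
For anyone trying the suggestion above, here is a minimal sketch of how the 17-class vocabulary string could be assembled. The class list is an assumption: it takes the ScanNet instance-segmentation benchmark labels and drops "otherfurniture", which may not match the exact names the repository uses. Only the `--pcd_path` and `--vocab` flags and the `; ` separator come from the command earlier in this thread; the helper names are made up for illustration.

```python
import shlex

# Assumption: the "17 classes" are the ScanNet instance-segmentation benchmark
# labels with "otherfurniture" removed. Verify against the repo's own class list.
SCANNET_17 = [
    "cabinet", "bed", "chair", "sofa", "table", "door", "window",
    "bookshelf", "picture", "counter", "desk", "curtain", "refrigerator",
    "shower curtain", "toilet", "sink", "bathtub",
]


def build_vocab(classes):
    """Join class names with '; ', the separator used in the original command."""
    return "; ".join(classes)


def build_command(pcd_path, classes):
    """Return a shell-safe zero_shot.py invocation (hypothetical helper)."""
    return "python zero_shot.py --pcd_path {} --vocab {}".format(
        shlex.quote(pcd_path), shlex.quote(build_vocab(classes))
    )


command = build_command("demo/demo_scene/scannet/scannet_scene1.ply", SCANNET_17)
print(command)
```

Concrete object names such as these give the 2D detector something specific to match, unlike "furniture" or "board".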

ZheningHuang commented 3 weeks ago

We are closing this issue for now.

Feel free to check out the newer version of the code, which is optimized to reproduce results and works for zero-shot inference.

Best, Zhening