Closed Jeff-LiangF closed 1 year ago
Hey @Jeff-LiangF,
I am completely with you on that; hopefully we will arrive at similar performance in 3D too! We ran some zero-shot experiments with categories not present in ScanNet200 and found that we cannot really recognise completely new concepts, but the model was flexible enough to work with labels that have alternative names or similar meanings. So if there is another dataset with a similar data distribution but different category labels, I believe this method could transfer knowledge, although we haven't explicitly tested it.
It would be a super interesting experiment.
Kind regards, David
Hi @RozDavid,
Thanks for sharing your observations. When we were doing 2D open-vocab segmentation, we also found that the model struggled to recognize never-before-seen concepts.
Yes, I totally agree a general open-vocabulary 3D model may have a huge potential impact in the area!
Hi @RozDavid,
Thank you so much for your great work! One major advantage of a language-grounded model is that it can perform zero-shot evaluation on other datasets, as in 2D open-vocab segmentation. Just curious whether the authors have explored this direction.