RozDavid / LanguageGroundedSemseg

Implementation for ECCV 2022 paper Language-Grounded Indoor 3D Semantic Segmentation in the Wild

Zero-shot evaluation on other datasets, e.g., NYU, SUN, etc. #10

Closed: Jeff-LiangF closed this issue 1 year ago

Jeff-LiangF commented 1 year ago

Hi @RozDavid,

Thank you so much for your great work! One major advantage of a language-grounded model is that it can be evaluated zero-shot on other datasets, as is done in 2D open-vocab segmentation. I am just curious whether the authors have explored this direction.

RozDavid commented 1 year ago

Hey @Jeff-LiangF,

I am completely with you on that; hopefully we will reach similar performance in 3D too! We ran some zero-shot experiments with categories not present in ScanNet200 and found that the model cannot really recognise completely new concepts, but it was flexible enough to work with labels that have alternative names or similar meanings. So if there is another dataset with a similar data distribution but different category labels, I believe this method could transfer knowledge, although we haven't explicitly tested it.

It would be a super interesting experiment.

Kind regards, David
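
For readers wondering what "working with labels with alternative names" could look like in practice, here is a minimal sketch and not the repository's code. It assumes that per-point features share an embedding space with a text encoder, so a new label set can be swapped in at inference time by nearest-text-embedding lookup. The function names and the tiny hand-made 3-D vectors are hypothetical stand-ins; real embeddings would come from the trained 3D model and a text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_labels(point_features, text_embeddings):
    """Give each point the label whose text embedding is most similar to it."""
    labels = []
    for feat in point_features:
        scores = {name: cosine(feat, emb) for name, emb in text_embeddings.items()}
        labels.append(max(scores, key=scores.get))
    return labels


# Hypothetical text embeddings for a label set from another dataset.
text_embeddings = {
    "sofa": [1.0, 0.1, 0.0],
    "table": [0.0, 1.0, 0.1],
    "lamp": [0.1, 0.0, 1.0],
}

# Hypothetical per-point features, assumed to live in the same space.
point_features = [
    [0.9, 0.2, 0.1],  # nearest to "sofa"
    [0.0, 0.1, 0.8],  # nearest to "lamp"
    [0.1, 0.7, 0.2],  # nearest to "table"
]

predicted = zero_shot_labels(point_features, text_embeddings)
print(predicted)
```

Under this assumption, renaming a category (say, "couch" instead of "sofa") only changes the text embedding used for lookup. That fits the observation above that alternative names transfer while entirely new concepts do not, since the point features themselves were never trained to represent those concepts.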

Jeff-LiangF commented 1 year ago

Hi @RozDavid,

Thanks for sharing your observations. When we were doing 2D open-vocab segmentation, we also found that the model struggled to recognize never-before-seen concepts.

Yes, I totally agree a general open-vocabulary 3D model may have a huge potential impact in the area!