isl-org / lang-seg

Language-Driven Semantic Segmentation
MIT License

bad results with the provided checkpoint #12

Closed Zacchaeus00 closed 2 years ago

Zacchaeus00 commented 2 years ago

[Image: Screenshot from 2022-03-04 00-27-07]

Boyiliee commented 2 years ago

Hi @Zacchaeus14,

Thanks for your feedback!

This is discussed in Section 5.1 of our paper. It looks like you didn't give the model all of the labels. In that case, the model will be biased towards whichever of the provided labels is related in the text embedding space (for example, 'road' and 'car' are related). I expect the result will be better if you also input the label 'road'.
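To illustrate the effect, here is a toy sketch (not LSeg's actual code). It assumes each pixel is assigned to the provided label whose text embedding has the highest cosine similarity with the pixel embedding. The embedding values, the `TEXT_EMB` table, and the `classify` helper are all made up for illustration; a real run would use the model's learned embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-D text embeddings (invented numbers, not from any real model).
# 'road' and 'car' are deliberately placed close to each other.
TEXT_EMB = {
    "road": [0.9, 0.4, 0.1],
    "car": [0.7, 0.7, 0.1],
    "tree": [0.1, 0.2, 0.9],
    "other": [0.3, 0.3, 0.3],
}


def classify(pixel_emb, labels):
    """Pick the label (from the provided set only) most similar to the pixel."""
    return max(labels, key=lambda name: cosine(pixel_emb, TEXT_EMB[name]))


# A hypothetical pixel embedding that lies near the 'road' text embedding.
road_pixel = [0.95, 0.35, 0.05]

# Without 'road' in the label set, the pixel falls to the nearest related label.
without_road = classify(road_pixel, ["car", "tree", "other"])

# With 'road' included, the same pixel is assigned to 'road'.
with_road = classify(road_pixel, ["car", "tree", "road", "other"])

print(without_road, with_road)
```

In this toy setup, the road-like pixel is assigned to 'car' when 'road' is missing from the label set, and to 'road' once it is included.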

Also, we don't train with 'other' very much, so we were a little surprised by how well LSeg generalizes on 'other'. That said, as mentioned, LSeg provides efficient multimodal modeling, and we hope it offers insights for further ideas and work inspired by or built on LSeg.

Hope this helps!

Best, Boyi

Zacchaeus00 commented 2 years ago

@Boyiliee Thanks so much!