isl-org / lang-seg

Language-Driven Semantic Segmentation
MIT License

Pretrained LSeg on Pascal-5i, COCO-20i #8

Closed by ducminhkhoi 2 years ago

ducminhkhoi commented 2 years ago

Congrats on your paper being accepted to ICLR 2022!

Do you have pretrained models for the 4 folds of Pascal-5i and COCO-20i? Could you share them?

I really appreciate your response.

Boyiliee commented 2 years ago

Hi @ducminhkhoi,

Thanks for your interest in LSeg!

Yes, we will release the models. Please allow me some time to sort them out (as well as the code). It should be before the main conference.

Hope this helps. Good luck with your research!

Feobi1999 commented 2 years ago

Regarding the label processing for the VOC and COCO datasets, or the processed label files: did you reference any open-source code? Could you share a link so I can use it directly? Thanks.

Boyiliee commented 2 years ago

Hi @Feobi1999,

Thanks for your interest in LSeg!

This was addressed in issue #4. Please check it for details.

Hope this helps!

Feobi1999 commented 2 years ago

Thanks for your response! I am now trying to build a zero-shot baseline with LSeg. Taking the VOC dataset as an example, I created a label_without_folder0.txt file and changed the corresponding path in the code. I also split the data following HSNet's data split. I still have a question: should the label txt file contain the names of all 20 classes or only 15 classes during training? I am not sure whether the label names need to correspond to the annotations. Your paper says: "We provide the full label set that is defined by each training set to the text encoder for each image."
Also, how should the ambiguous class be handled? Thanks.
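For readers with the same question, here is a minimal sketch of one way to keep a 15-class label list and the annotation masks consistent. It is an illustration under stated assumptions, not the authors' pipeline: it assumes PASCAL VOC style masks (0 for background, 1 to 20 for classes, 255 for ambiguous or boundary pixels), and the helper names `build_remap` and `remap_mask` are hypothetical. Sending held-out classes to the ignore value is also an assumption; some setups treat them as background instead.

```python
VOID = 255  # PASCAL VOC marks ambiguous/boundary pixels with 255


def build_remap(train_ids, ignore_index=VOID):
    """Return a function mapping original mask ids to training label positions.

    Background (0) stays 0, the given training class ids become
    1..len(train_ids), and every other id (held-out classes and the
    void label) maps to ignore_index. This policy is an assumption.
    """
    remap = {0: 0}
    for new_id, old_id in enumerate(sorted(train_ids), start=1):
        remap[old_id] = new_id
    return lambda old_id: remap.get(old_id, ignore_index)


def remap_mask(mask, train_ids):
    """Apply the remapping to a mask given as a nested list of ints."""
    to_new = build_remap(train_ids)
    return [[to_new(pixel) for pixel in row] for row in mask]


# Example: hold out VOC classes 1..5 and train on classes 6..20.
train_ids = list(range(6, 21))
mask = [[0, 1, 6],
        [255, 20, 7]]
remapped = remap_mask(mask, train_ids)
```

With this remapping, position k in the training label list matches pixel value k in the remapped mask, and pixels set to 255 can be excluded from the loss (for instance through an ignore-index option, where the training framework provides one).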

Boyiliee commented 2 years ago

Hi @Feobi1999,

Thanks for your question! For the zero-shot setting, we train only on the training classes defined by HSNet and test on the testing classes. We highly recommend taking a detailed look at their code. In general, Pascal-5i splits the 20 classes into 4 folds of 5 classes each. For each fold, the model is trained on the classes of the other three folds and tested on the 5 held-out classes of that fold. We also recommend reading the experimental setting in the paper in detail.
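The fold protocol described above can be sketched as follows. This is a minimal illustration, assuming the common Pascal-5i convention in which fold i holds out a contiguous block of five classes in the standard PASCAL VOC class order; the function name `split_fold` is hypothetical, and COCO-20i may assign classes to folds differently, so the HSNet code remains the reference.

```python
# Standard PASCAL VOC class order (background excluded).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def split_fold(fold, n_classes=20, n_folds=4):
    """Return (train_ids, test_ids) as 0-based indices into the class list.

    Assumes fold `fold` holds out a contiguous block of
    n_classes // n_folds classes; all remaining classes are for training.
    """
    if not 0 <= fold < n_folds:
        raise ValueError(f"fold must be in [0, {n_folds}), got {fold}")
    per_fold = n_classes // n_folds
    test_ids = list(range(fold * per_fold, (fold + 1) * per_fold))
    train_ids = [c for c in range(n_classes) if c not in test_ids]
    return train_ids, test_ids


# Fold 0: train on 15 class names, evaluate on the 5 held-out ones.
train_ids, test_ids = split_fold(0)
train_names = [VOC_CLASSES[i] for i in train_ids]
test_names = [VOC_CLASSES[i] for i in test_ids]
```

Under this assumption, a training label file for fold 0 would list the 15 names in `train_names`, one per line, and the 5 names in `test_names` would be supplied only at test time.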

Hope this helps!