report bug - Githubissues

KU-CVLAB / CAT-Seg

Official Implementation of "CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation"

MIT License


Hi, thanks for your great work!

While reading through the code, I noticed that during inference the CLIP image encoder is forwarded twice. Is this a bug, or is there a reason for the second forward pass?

https://github.com/KU-CVLAB/CAT-Seg/blob/3062d4abda7884f35ff8650784c882b225783978/cat_seg/cat_seg_model.py#L202

https://github.com/KU-CVLAB/CAT-Seg/blob/3062d4abda7884f35ff8650784c882b225783978/cat_seg/cat_seg_model.py#L205
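
For anyone who wants to confirm the double call locally, here is a minimal sketch of how one might count calls to an encoder method during a single inference pass. `ToyEncoder`, `ToyModel`, and `count_calls` are hypothetical stand-ins made up for illustration; they are not CAT-Seg classes, and the real attribute path to the CLIP model in the repository may differ.

```python
import functools


def count_calls(obj, method_name):
    """Wrap obj.<method_name> so every call increments a counter.

    Returns a dict whose "calls" entry holds the running count.
    """
    original = getattr(obj, method_name)
    counter = {"calls": 0}

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        counter["calls"] += 1
        return original(*args, **kwargs)

    # Shadow the method on this instance only; the class is left untouched.
    setattr(obj, method_name, wrapper)
    return counter


class ToyEncoder:
    """Hypothetical stand-in for an image encoder."""

    def encode_image(self, image):
        return [2 * x for x in image]


class ToyModel:
    """Hypothetical model whose inference calls the encoder twice."""

    def __init__(self):
        self.encoder = ToyEncoder()

    def inference(self, image):
        first = self.encoder.encode_image(image)
        second = self.encoder.encode_image(image)
        return first, second


model = ToyModel()
counter = count_calls(model.encoder, "encode_image")
model.inference([1, 2, 3])
print(counter["calls"])
```

The same wrapper could be pointed at whatever object and method name the real model uses, assuming the method is looked up on the instance at call time.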

Also, is the main difference between the CVPR version and the earlier arXiv version that you removed the additional backbone (Swin) and managed to fine-tune the CLIP text encoder? Am I right?

report bug #20