KU-CVLAB / CAT-Seg

Official Implementation of "CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation"
https://ku-cvlab.github.io/CAT-Seg/
MIT License

More details about Feature Agg #21

Closed lchen1019 closed 4 months ago

lchen1019 commented 5 months ago

I noticed that Feature Agg and Cost Agg were compared in the paper, and Cost Agg performed better. The paper says that the difference between them lies in the features used during aggregation. Does that mean Feature Agg only uses the highest-level features, while Cost Agg also uses an additional cost?

"For both of baseline architectures, we simply apply the upsampling decoder and note that both methods share most of the architecture, but differ in whether they aggregate the concatenated features or aggregate the cosine similarity between image and text embeddings of CLIP." from your papers.

Thanks in advance!

hsshin98 commented 4 months ago

Hi, both feature and cost aggregation mainly use the highest-layer features. The difference is whether we aggregate the "feature volume", which is obtained by concatenating the image and text features of CLIP (512+512=1024-dim), or the "cost volume", which is obtained from the exact same features but with a different operation, namely cosine similarity (1-dim). In addition, both baselines use intermediate CLIP features for the upsampling decoder, so they do not differ in what level of features they use.
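To make the distinction concrete, here is a minimal PyTorch sketch of the two volumes described above. The shapes (24x24 spatial grid, 171 classes) and tensor names are illustrative assumptions, not the repository's actual code; the point is only that concatenation yields a 2D-channel feature volume while cosine similarity yields a 1-channel cost volume over the same features.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes for illustration: CLIP ViT-B/16 gives 512-dim embeddings.
B, H, W, D = 2, 24, 24, 512   # batch, spatial grid, embedding dim
T = 171                       # number of class (text) prompts

img_feat = torch.randn(B, H, W, D)   # dense CLIP image features
txt_feat = torch.randn(T, D)         # CLIP text embeddings, one per class

# "Feature volume": concatenate image and text features,
# giving 512+512 = 1024 channels per (pixel, class) pair.
feat_vol = torch.cat(
    [img_feat.unsqueeze(3).expand(B, H, W, T, D),
     txt_feat.view(1, 1, 1, T, D).expand(B, H, W, T, D)],
    dim=-1)                          # shape (B, H, W, T, 2*D)

# "Cost volume": cosine similarity between the exact same features,
# giving a single scalar per (pixel, class) pair.
cost_vol = torch.einsum(
    'bhwd,td->bhwt',
    F.normalize(img_feat, dim=-1),
    F.normalize(txt_feat, dim=-1))   # shape (B, H, W, T)

print(feat_vol.shape)  # torch.Size([2, 24, 24, 171, 1024])
print(cost_vol.shape)  # torch.Size([2, 24, 24, 171])
```

Either volume can then be fed to the same aggregation/upsampling decoder; only the per-(pixel, class) channel count differs.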