Can't understand the design of Δfi

w1oves / Rein

[CVPR 2024] Official implementation of <Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation>

https://zxwei.site/rein

GNU General Public License v3.0


Can't understand the design of Δfi #25

Closed: wqfdewifi closed this issue 6 months ago

wqfdewifi commented 6 months ago

Sorry for my poor comprehension, but I can't understand the design of Δfi, even though I've read the paper many times. I'm curious whether removing the first column of Si and the first row of Ti achieves the same result as removing any other column of Si and the corresponding row of Ti. I don't understand why it has to be the first one.

I'd appreciate it if you could explain it to me.

w1oves commented 6 months ago

Thank you for your interest in my work! I apologize for the confusion caused by my unclear explanation and for the time it cost you. Removing any column achieves a result similar to removing the first one; I chose the first column simply because it was the easiest option. The key is to remove one column so that the remaining entries of the similarity matrix S no longer sum to 1. This idea was inspired by "Vision Transformers Need Registers". If you have any further questions, please feel free to ask.
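For readers following along, here is a minimal NumPy sketch of the point made in this reply. It is not taken from the Rein repository. It assumes that S is produced by a softmax over the tokens (so each row of S sums to 1 before a column is dropped), and the names `f`, `W`, `b`, the `sqrt(c)` scaling, and the helper `delta_f` are hypothetical, chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def delta_f(f, T, W, b, drop=0):
    """Illustrative delta-f computation (assumed form, not the official code).

    f: (n, c) features, T: (m, c) tokens, W: (c, c) and b: (c,) a linear map.
    `drop` is the index of the column of S / row of T that is removed.
    """
    c = f.shape[1]
    S = softmax(f @ T.T / np.sqrt(c), axis=-1)  # (n, m); each row sums to 1
    keep = [j for j in range(T.shape[0]) if j != drop]
    S_kept = S[:, keep]                          # (n, m-1); rows now sum to < 1
    T_kept = T[keep]                             # (m-1, c); matching rows of T
    return S_kept @ (T_kept @ W + b), S, S_kept

rng = np.random.default_rng(0)
n, m, c = 4, 5, 8
f = rng.normal(size=(n, c))
T = rng.normal(size=(m, c))
W = rng.normal(size=(c, c))
b = rng.normal(size=c)

# Dropping the first column or the last column has the same structural effect.
out_first, S, S_first = delta_f(f, T, W, b, drop=0)
out_last, _, S_last = delta_f(f, T, W, b, drop=m - 1)
```

In this sketch, whichever index is dropped, the kept weights in each row sum to less than 1, so the output is no longer forced to be a full convex combination of the token projections. The dropped index must be the same for the column of S and the row of T so that the shapes and the token correspondence still line up.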

wqfdewifi commented 6 months ago

I'm clear now, thank you!