wusize / CLIPSelf

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
https://arxiv.org/abs/2310.01403

Request for Window Attention Weights and Code as Referenced in Table 7 #7

Closed: cilinyan closed this issue 9 months ago

cilinyan commented 9 months ago

Dear Authors,

I've been closely examining the details presented in Table 7 of your paper, where you compare the performance of Global Attention and Window Attention mechanisms. It appears that the main branch primarily provides the weights and code for the Global Attention version. I'm particularly interested in exploring the Window Attention variant further.
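
For context, my understanding of the distinction in Table 7 is that global attention lets every token attend to all tokens in the image, while window attention restricts each token to a non-overlapping local window. Below is a minimal sketch of that windowed pattern (single head, no qkv or output projections, arbitrary window size; this is just my own illustration, not your implementation):

```python
import torch

def window_partition(x, window_size):
    """Split a (B, H, W, C) token map into non-overlapping windows."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    # -> (B * num_windows, window_size * window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size * window_size, C)

def window_attention(x, window_size=4):
    """Single-head scaled dot-product attention restricted to local windows."""
    B, H, W, C = x.shape
    w = window_partition(x, window_size)                  # (B*nW, ws*ws, C)
    attn = torch.softmax(w @ w.transpose(-2, -1) / C ** 0.5, dim=-1)
    out = attn @ w                                        # tokens attend only within their window
    nh, nw = H // window_size, W // window_size
    out = out.view(B, nh, nw, window_size, window_size, C)
    return out.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)

# Usage: an 8x8 grid of 256-d tokens split into four 4x4 windows
feat = torch.randn(1, 8, 8, 256)
assert window_attention(feat, window_size=4).shape == (1, 8, 8, 256)
```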

Would it be possible for you to share the weights and code specific to the Window Attention implementation mentioned in Table 7?

Thank you very much for your assistance and for your invaluable contributions to the field.

Best regards.

cilinyan commented 9 months ago

I noticed that the code for the Window Attention version appears to be available on the window_attn branch. However, a checkpoint does not seem to have been provided for it. I hope the authors can share that as well. Thank you very much!

wusize commented 9 months ago

Hi, I idiotically deleted the checkpoints while clearing my disk space. I will reproduce them and then share them with you. You can also reproduce them yourself by directly running the scripts whose names end with 'window_attention'.
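
For example, one could batch-run those scripts with a small driver like the sketch below (the scripts/ directory, the .sh suffix, and the glob pattern are assumptions; check the window_attn branch for the actual file names):

```python
import glob
import subprocess

# Hypothetical driver: the "scripts/" directory, the ".sh" suffix, and the
# glob pattern are assumptions; check the window_attn branch for real names.
for script in sorted(glob.glob("scripts/*window_attention*.sh")):
    print(f"Running {script}")
    subprocess.run(["bash", script], check=True)
```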

cilinyan commented 9 months ago

Thank you for your response!