microsoft / Cream

This is a collection of our NAS and Vision Transformer work.
MIT License

About the effects of IRPE #221

Closed: Zhong1015 closed this issue 6 months ago

Zhong1015 commented 8 months ago

The iRPE project is a very good initiative. I have applied iRPE to my model and observed improvements. However, when I apply the positional encoding only on 'k', performance improves, whereas when I apply iRPE on 'q', 'k', and 'v' simultaneously, performance seems to decline. Note that my 'q', 'k', and 'v' all come from the same source image features. Is this reasonable, and why is applying iRPE on 'qkv' simultaneously less effective than applying it only on 'k'?
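For readers following the question, here is a minimal sketch of what toggling a relative position encoding on 'q', 'k', and 'v' can look like in single-head self-attention. It is not the iRPE implementation from this repository: it assumes a simple clipped 1D offset instead of iRPE's 2D piecewise index function, and the names `rel_index`, `attention_with_rpe`, `r_q`, `r_k`, and `r_v` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def rel_index(n, max_dist):
    # Bucket index for every (i, j) pair: the offset j - i, clipped to
    # [-max_dist, max_dist] and shifted so that indices start at 0.
    # (Assumption: 1D clipped offsets, not iRPE's 2D piecewise mapping.)
    pos = np.arange(n)
    return np.clip(pos[None, :] - pos[:, None], -max_dist, max_dist) + max_dist

def attention_with_rpe(x, wq, wk, wv, idx, r_q=None, r_k=None, r_v=None):
    # x: (n, d) tokens; wq, wk, wv: (d, d) projections.
    # r_q, r_k, r_v: optional (num_buckets, d) embedding tables.
    q, k, v = x @ wq, x @ wk, x @ wv
    d = q.shape[-1]
    logits = q @ k.T                                   # (n, n)
    if r_k is not None:
        # Encoding on keys: query i interacts with r_k[idx[i, j]].
        logits = logits + np.einsum("id,ijd->ij", q, r_k[idx])
    if r_q is not None:
        # Encoding on queries: key j interacts with r_q[idx[i, j]].
        logits = logits + np.einsum("jd,ijd->ij", k, r_q[idx])
    attn = softmax(logits / np.sqrt(d))
    out = attn @ v                                     # (n, d)
    if r_v is not None:
        # Encoding on values: attention weights also mix r_v[idx[i, j]].
        out = out + np.einsum("ij,ijd->id", attn, r_v[idx])
    return out
```

Passing only `r_k` corresponds to the "k only" setting described above, and passing all three tables corresponds to the "qkv" setting. Each extra table adds parameters that have to be learned from the same data.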

wkcn commented 8 months ago

Hi @Zhong1015, thanks for your attention to our work!

I wonder which vision task the model handles, and which evaluation metric is used.

Zhong1015 commented 8 months ago

Thank you for your response, @wkcn. I am currently working on a multi-label image classification task, and the evaluation metric is mAP (mean average precision): an average precision value is computed separately for each class, and mAP is the mean of these values.
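The metric described above can be sketched as follows. This is an illustration under assumptions, not the evaluation code used in the experiments: it uses the non-interpolated definition of average precision (mean of the precision values at the rank of each positive sample), whereas some benchmarks use interpolated variants. The function names are hypothetical.

```python
import numpy as np

def average_precision(y_true, y_score):
    # y_true: (n,) binary labels for one class; y_score: (n,) predicted scores.
    order = np.argsort(-y_score, kind="stable")        # rank by descending score
    hits = y_true[order].astype(bool)
    if not hits.any():
        return float("nan")                            # class has no positives
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision measured at the rank of each positive sample.
    return float(precision_at_k[hits].mean())

def mean_average_precision(y_true, y_score):
    # y_true, y_score: (n_samples, n_classes) arrays for multi-label data.
    aps = [average_precision(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1])]
    return float(np.nanmean(aps))                      # skip classes with no positives
```

Reporting the per-class values alongside the mean can show whether a change in mAP comes from a few classes or from a shift across all of them.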

wkcn commented 8 months ago

@Zhong1015 What is the mAP of the baseline, and of the experiments with iRPE on k and on qkv?

iRPE on qkv may bring only a small improvement. You can run each experiment multiple times to rule out random error.
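One way to act on this suggestion is to repeat each configuration with several random seeds and compare the mean and spread of the resulting scores. The sketch below is only an illustration: the helper names `summarize` and `welch_t` are hypothetical, and the statistic shown is Welch's t, one common choice for comparing two small samples with possibly unequal variances.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(scores):
    # Mean and sample standard deviation of the scores from repeated runs.
    return mean(scores), stdev(scores)

def welch_t(a, b):
    # Welch's t statistic for the difference between two sets of runs.
    # A value close to 0 suggests the gap is within run-to-run noise.
    mean_a, std_a = summarize(a)
    mean_b, std_b = summarize(b)
    return (mean_a - mean_b) / sqrt(std_a ** 2 / len(a) + std_b ** 2 / len(b))
```

For example, `welch_t(map_k_only_runs, map_qkv_runs)` would compare two lists of mAP values, one entry per seed (both list names are placeholders). Each list needs at least two runs with some variation between them, otherwise the standard deviation is undefined or zero.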

Zhong1015 commented 6 months ago

I have solved the problem, thank you!