microsoft / Cream

This is a collection of our NAS and Vision Transformer work.
MIT License

About the buckets in the IRPE #242

Open Zhong1015 opened 1 month ago

Zhong1015 commented 1 month ago

Hello @wkcn ,

I am still closely following your excellent work. Recently, I have been reflecting on the buckets parameter discussed in your paper. Since specific parameterization details were not provided in the paper, I consulted some AI-related resources. Based on initial findings, I believe the information appears to be reliable; however, I am unsure if there are any inaccuracies in the details, particularly regarding the formulas used for quantifying the last two parameters. I would greatly appreciate it if you could review and provide clarification.

Looking forward to your response.

[Image]

wkcn commented 1 month ago

Hi @Zhong1015 , thank you for your continued support! : )

The definition you provided is correct.

The concept of a bucket comes from hashing. A relative position (x1-x2, y1-y2) is mapped to a bucket, and each bucket can correspond to multiple relative positions.

As shown in our supplementary material:

[Image]

The red star marks the reference position. Each color denotes a different bucket, and relative positions with the same color share the same encoding.
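To make the many-to-one mapping concrete, here is a minimal sketch in plain Python. It is not the code from this repository. It assumes a piecewise index function of the kind described in the paper (identity for small offsets, log-spaced and clipped for large ones) and combines the two axes into a single bucket id. The names `piecewise_index` and `bucket_id`, and the values `alpha=2`, `beta=4`, `gamma=8`, are illustrative assumptions only.

```python
import math


def piecewise_index(d, alpha=2, beta=4, gamma=8):
    """Map a signed 1-D offset d to an integer index in [-beta, beta].

    Assumed form: offsets with |d| <= alpha keep their own index;
    larger offsets are compressed on a log scale and clipped at beta.
    """
    if abs(d) <= alpha:
        return int(d)
    scaled = alpha + math.log(abs(d) / alpha) / math.log(gamma / alpha) * (beta - alpha)
    idx = min(beta, round(scaled))
    return int(math.copysign(idx, d))


def bucket_id(dx, dy, alpha=2, beta=4, gamma=8):
    """Combine the per-axis indices of (dx, dy) into one bucket id.

    Hypothetical layout: a (2*beta + 1) x (2*beta + 1) grid, flattened row-major.
    """
    side = 2 * beta + 1
    ix = piecewise_index(dx, alpha, beta, gamma) + beta  # shift to 0..2*beta
    iy = piecewise_index(dy, alpha, beta, gamma) + beta
    return ix * side + iy


# Nearby offsets get their own bucket; distant offsets share one.
print(bucket_id(1, 0), bucket_id(2, 0))        # two different buckets
print(bucket_id(9, 0) == bucket_id(50, 0))     # far offsets collapse together
```

In this sketch the learnable encoding would be a table with one entry per bucket id, so every relative position that lands in the same bucket looks up the same entry, which is what "same color, same encoding" means in the figure.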