adaptivetokensampling / ATS

Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral Presentation)
https://adaptivetokensampling.github.io/
Apache License 2.0

Why is ATS differentiable? #5

Open kaikai23 opened 1 year ago

kaikai23 commented 1 year ago

Hi, may I ask why ATS is differentiable?

As I understand it, the CDF (equation (4) in the paper) is piecewise constant, so its inverse (equation (5) in the paper) is also piecewise constant and therefore not differentiable: its gradient is zero almost everywhere and undefined at the jumps. Did I miss something?
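
To make the concern concrete, here is a minimal sketch of inverse-transform sampling over discrete token scores. It is not taken from the ATS code; the function name `inverse_cdf_sample` and the example scores are assumptions made up for illustration. It shows that the selected index does not move under a small change in the scores, and then jumps by a whole step once a threshold is crossed.

```python
import bisect
from itertools import accumulate


def inverse_cdf_sample(scores, u):
    """Hypothetical helper: index of the first token whose normalized
    cumulative score reaches u (a discrete inverse CDF)."""
    total = sum(scores)
    cdf = list(accumulate(s / total for s in scores))
    # bisect_left returns the smallest i with cdf[i] >= u; the min() guards
    # against floating-point round-off in the last CDF entry.
    return min(bisect.bisect_left(cdf, u), len(scores) - 1)


u = 0.5

# A small nudge to one score leaves the selected index unchanged,
# so a finite-difference "gradient" of the index is exactly zero.
base = inverse_cdf_sample([0.1, 0.6, 0.3], u)
nudged = inverse_cdf_sample([0.1, 0.6 + 1e-3, 0.3], u)
finite_difference = (nudged - base) / 1e-3

# Moving the first score across the threshold u makes the index jump.
before_jump = inverse_cdf_sample([0.49, 0.21, 0.30], u)
after_jump = inverse_cdf_sample([0.51, 0.19, 0.30], u)

print(base, nudged, finite_difference, before_jump, after_jump)
```

The output index is an integer-valued step function of the scores, which is the non-differentiability described above.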

Thank you in advance!

Andyyoung0507 commented 1 year ago

It uses the Gumbel-softmax to make backpropagation possible.
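
For readers unfamiliar with the trick, the following is a generic sketch of Gumbel-softmax sampling in plain Python. It is an illustration only: it is not code from the ATS repository, this thread does not confirm whether or where ATS applies it, and the function name `gumbel_softmax`, the temperature `tau=0.5`, and the example probabilities are all assumptions.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Generic sketch: a soft, smooth stand-in for sampling one category."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(u)) with u ~ Uniform(0, 1);
        # u is clamped away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    norm = sum(exps)
    return [e / norm for e in exps]


rng = random.Random(0)  # fixed seed so the run is repeatable
probs = [0.1, 0.6, 0.3]
logits = [math.log(p) for p in probs]

soft = gumbel_softmax(logits, tau=0.5, rng=rng)
# The argmax of the soft sample is a draw from the categorical distribution.
hard_index = max(range(len(soft)), key=soft.__getitem__)

print(soft, hard_index)
```

The soft vector is a smooth function of the logits for a fixed noise draw, so gradients can flow through it, whereas the argmax alone cannot carry a gradient. In an autograd framework the same idea is available ready-made, for example `torch.nn.functional.gumbel_softmax` in PyTorch.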

nhw649 commented 1 year ago

> It uses the Gumbel-softmax to make backpropagation possible.

May I ask which line of the code this is on?

lijun2005 commented 1 week ago

> > It uses the Gumbel-softmax to make backpropagation possible.
>
> May I ask which line of the code this is on?

I am also confused about this.