YoadTew / zero-shot-image-to-text

Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
262 stars, 42 forks

About Eq.(5) in paper #9

Open 232525 opened 1 year ago

232525 commented 1 year ago

In the paper, Eq. (5) (image) uses $\alpha = 0.3$, and the norm of the gradients has a factor of 2. But in https://github.com/YoadTew/zero-shot-image-to-text/blob/main/model/ZeroCLIP.py, `stepsize = 0.3` while `grad_norm_factor = 0.9`. Am I misunderstanding something?