zer0int / CLIP-ViT-visualization

What do CLIP Vision Transformers learn? Feature Visualization can show you!

quality of the inversion #1

Closed: ariel415el closed this issue 1 month ago

ariel415el commented 1 month ago

Hey, it seems like you took inspiration from this repo. Any idea why the inversions you show are much less sharp than the ones shown in the repo I mentioned?

zer0int commented 1 month ago

Hi there!

Yes, as cited in my readme.md, it is based on the repo you mentioned.

Regarding sharpness of the image, see my code comment:

    # coefficient = 0.0005 -> sharp and noisy features
    # coefficient = 0.005  -> balanced
    # coefficient = 0.05   -> soft, blurry, muddy
    tv = 1.0
    coefficient = 0.005

If you prefer a very noisy but sharp (not blurry) result, try a much smaller total variation (TV) coefficient, for example 0.0000005.
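
For anyone wondering why that coefficient trades sharpness against noise, here is a minimal sketch of a total variation penalty. It is not the code from this repo: `total_variation`, `regularized_loss`, and `feature_loss` are hypothetical names, it works on a plain 2D list instead of an image tensor, and it only assumes the penalty is added to the main objective scaled by `coefficient` (how `tv` enters the real script is not shown above).

```python
def total_variation(img):
    """Anisotropic total variation of a 2D grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    # Sum of absolute differences between vertically adjacent pixels.
    vertical = sum(abs(img[y + 1][x] - img[y][x])
                   for y in range(h - 1) for x in range(w))
    # Sum of absolute differences between horizontally adjacent pixels.
    horizontal = sum(abs(img[y][x + 1] - img[y][x])
                     for y in range(h) for x in range(w - 1))
    return vertical + horizontal


def regularized_loss(feature_loss, img, coefficient=0.005):
    """Hypothetical objective: main loss plus a weighted TV penalty (an assumption)."""
    return feature_loss + coefficient * total_variation(img)


# A flat image has zero TV; a checkerboard (pure high-frequency detail) has a large TV.
flat = [[0.5] * 4 for _ in range(4)]
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]

tv_flat = total_variation(flat)
tv_checker = total_variation(checker)
loss_small = regularized_loss(1.0, checker, coefficient=0.0000005)
loss_large = regularized_loss(1.0, checker, coefficient=0.05)
```

With a large coefficient the optimizer is pushed hard toward smooth images, so fine detail gets penalized away (blurry, muddy). With a tiny coefficient the penalty barely registers, so high-frequency detail and noise both survive.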