mlfoundations / open_clip

An open source implementation of CLIP.

How can I decode the image feature to an RGB image? #674

Closed 1216537742 closed 11 months ago

1216537742 commented 11 months ago

I want to use the image features for a downstream task (anomaly detection). Can I decode a reconstructed feature back into an image, as with an autoencoder? I'm new to this and would really appreciate some simple guidance!

rwightman commented 11 months ago

This is more of a Q&A discussion than a bug/issue. CLIP models encode images or text into an aligned embedding space. Going from embedding -> image requires a different class of models: you'd want to look at diffusion or other generative image models and how CLIP embeddings can be leveraged to guide the generation.
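Since CLIP has no decoder, one alternative for the anomaly-detection use case is to score samples directly in embedding space rather than reconstructing pixels. The sketch below is only an illustration, not something proposed in this thread: it assumes you have already computed image embeddings (in open_clip, via `model.encode_image(...)`), and the helper names `cosine_similarity` and `anomaly_score`, the toy 3-d vectors, and the "1 minus max similarity" scoring rule are all hypothetical choices made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def anomaly_score(query, normal_bank):
    """Hypothetical score: 1 - max cosine similarity to any known-normal embedding.

    Near 0 means the query resembles a normal sample; larger means more unusual.
    """
    return 1.0 - max(cosine_similarity(query, ref) for ref in normal_bank)


# Toy 3-d vectors standing in for real CLIP image embeddings (assumption:
# real embeddings are much higher-dimensional, e.g. 512-d for ViT-B/32).
normal_bank = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]

print(anomaly_score([1.0, 0.05, 0.0], normal_bank))  # close to the normal samples
print(anomaly_score([0.0, 0.0, 1.0], normal_bank))   # orthogonal to all of them
```

A threshold on this score would have to be chosen on held-out normal data. If you really do need pixels back, the generative-model route described above still applies.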