Analyzing Features of CLIP Image Encoder

Hi, the analysis of the CLIP image encoder is quite straightforward. First, synthesize noisy images from clean ones (using Gaussian or Poisson noise). Then feed both the noisy and clean images directly to the CLIP ResNet encoder, skipping the crop, resize, and normalization steps of the original CLIP preprocessing. Finally, extract the dense features of the noisy and clean images from the CLIP ResNet encoder and compute their similarity (e.g., using cosine similarity or CKA).
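For readers who want to try this, here is a minimal NumPy sketch of the noise synthesis and similarity steps. It is only an illustration under stated assumptions, not the repository's actual analysis code: the function names, the `peak` parametrization of the Poisson noise, the `(C, H, W)` feature layout, and the choice of linear CKA (rather than a kernel variant) are all assumptions made here. The encoder step is left out; the features are assumed to come from whatever dense feature map you extract from the CLIP ResNet encoder.

```python
import numpy as np


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise. `img` is a float array in [0, 1];
    `sigma` is on the same scale (assumed convention)."""
    return img + rng.normal(0.0, sigma, size=img.shape)


def add_poisson_noise(img, peak, rng):
    """Simulate Poisson (shot) noise. `peak` is the assumed photon count
    at intensity 1.0; a lower peak gives a noisier image."""
    return rng.poisson(np.clip(img, 0.0, 1.0) * peak) / peak


def dense_cosine_similarity(feat_a, feat_b, eps=1e-8):
    """Per-location cosine similarity between two (C, H, W) feature maps.
    Returns an (H, W) map."""
    num = (feat_a * feat_b).sum(axis=0)
    den = np.linalg.norm(feat_a, axis=0) * np.linalg.norm(feat_b, axis=0) + eps
    return num / den


def linear_cka(feat_a, feat_b):
    """Linear CKA between two (C, H, W) feature maps, treating each
    spatial location as one sample."""
    x = feat_a.reshape(feat_a.shape[0], -1).T  # (H*W, C)
    y = feat_b.reshape(feat_b.shape[0], -1).T
    x = x - x.mean(axis=0, keepdims=True)      # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, "fro") ** 2
    return cross / (np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro"))
```

In use, you would generate `noisy = add_gaussian_noise(clean, sigma, rng)` with `rng = np.random.default_rng(seed)`, pass `clean` and `noisy` through the encoder without the usual CLIP preprocessing, and then call `dense_cosine_similarity` or `linear_cka` on the two resulting feature maps.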

