Zzhy2000 closed 3 months ago
Hi, the analysis of the CLIP image encoder is quite straightforward. First, synthesize noisy images from clean ones (using Gaussian or Poisson noise). Then feed both the noisy and the clean images directly to the CLIP ResNet encoder, without the cropping, resizing, or normalization done in the original CLIP preprocessing. Finally, take the dense features that the CLIP ResNet encoder produces for the noisy images and for the clean images, and compute the similarity between them (e.g., cosine similarity or CKA).
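For anyone who wants to try this, here is a rough sketch of the noise synthesis and the two similarity measures in NumPy. It is only an illustration under assumptions, not the code used for the analysis: the noise levels (`sigma`, `peak`), the function names, and the choice of linear CKA are all placeholders, and `toy_encoder` is a hypothetical stand-in that should be replaced by whatever returns the dense `(C, H, W)` feature map of the CLIP ResNet encoder in your setup.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    # img: float array in [0, 1]; sigma is on the same scale (assumed value).
    return img + rng.normal(0.0, sigma, size=img.shape)

def add_poisson_noise(img, peak, rng):
    # Scale intensities to photon counts, sample, and scale back.
    # `peak` is an assumed parameter controlling the noise strength.
    return rng.poisson(img * peak) / peak

def cosine_similarity_map(feat_a, feat_b, eps=1e-8):
    # feat_*: dense features of shape (C, H, W); returns an (H, W) map.
    num = (feat_a * feat_b).sum(axis=0)
    den = np.linalg.norm(feat_a, axis=0) * np.linalg.norm(feat_b, axis=0) + eps
    return num / den

def linear_cka(feat_a, feat_b):
    # Treat each spatial location as one sample: (C, H, W) -> (H*W, C).
    x = feat_a.reshape(feat_a.shape[0], -1).T
    y = feat_b.reshape(feat_b.shape[0], -1).T
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, "fro") ** 2
    return cross / (np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro"))

def toy_encoder(img, weights):
    # Hypothetical stand-in for the CLIP ResNet encoder: a fixed 1x1
    # projection from 3 input channels to C feature channels.
    return np.einsum("oc,chw->ohw", weights, img)

rng = np.random.default_rng(0)
clean = rng.random((3, 64, 64))            # placeholder for a clean image
weights = rng.normal(size=(8, 3))          # placeholder encoder weights

noisy_gauss = add_gaussian_noise(clean, sigma=0.1, rng=rng)
noisy_poisson = add_poisson_noise(clean, peak=30.0, rng=rng)

feat_clean = toy_encoder(clean, weights)
feat_noisy = toy_encoder(noisy_gauss, weights)

cos_map = cosine_similarity_map(feat_clean, feat_noisy)
cka = linear_cka(feat_clean, feat_noisy)
print(cos_map.mean(), cka)
```

Averaging `cos_map` gives one score per image pair, while the map itself shows where the features diverge under noise. Whether to clip the noisy images back to the valid range before encoding is not specified above, so the sketch leaves them unclipped.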
Thanks for your work! Could you provide more detailed information on this part of the experiment?