About version of ClipVisionModel

tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Apache License 2.0

4.46k stars, 289 forks

About version of ClipVisionModel #371

Closed: Foolbee closed this issue 1 month ago

Foolbee commented 1 month ago

Thanks for your excellent work! I noticed that you trained the adapter module only with CLIP-H-14, not CLIP-L-14. Is there a reason you chose the larger encoder over the smaller one? Or did you try the smaller one and get worse results?

xiaohu2015 commented 1 month ago

I found that CLIP-H-14 works a little better than CLIP-L-14.

Foolbee commented 1 month ago

Thanks! I am wondering whether pretrained adapter weights that use openai/clip-vit-l-14 as the image encoder are available now, since they would be of great benefit to my work.

xiaohu2015 commented 1 month ago

Not available.

Foolbee commented 1 month ago

Got it. Thanks for your reply!