Yuxinn-J / Scenimefy

[ICCV 2023] Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
https://yuxinn-j.github.io/projects/Scenimefy.html

Can Not Find Pretrained CLIP Implementation #14

Open echelon2718 opened 4 days ago

echelon2718 commented 4 days ago

Hey there! Firstly, thank you for publishing this wonderful work. I am fascinated by your research and I'm trying to understand every component of Scenimefy. However, I could not find where the pretrained CLIP code for the PatchNCE loss is implemented (I read your paper, and you mention using a pretrained CLIP model to extract image features). I would greatly appreciate it if you could elaborate on this component based on your paper. Thank you very much!

Yuxinn-J commented 15 hours ago

Hi, thanks for the kind words, I'm glad you liked it. Regarding your question: yes, we use the pretrained CLIP model in the first stage to help preserve content while fine-tuning StyleGAN to generate pseudo paired data. As for the PatchNCE loss, we actually extract features directly from the GAN generator itself, rather than from CLIP. You can find the relevant code here.
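For readers tracing the code, here is a minimal sketch of what a PatchNCE-style loss over generator features can look like (in the spirit of CUT, which this loss follows). The class and function names are hypothetical and simplified; the repository's actual implementation differs in details such as multi-layer sampling and MLP projection heads applied to the sampled features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchNCELoss(nn.Module):
    """InfoNCE over patch features: for each translated-image patch (query),
    the positive is the source-image patch at the same spatial location;
    the other sampled patches act as negatives."""
    def __init__(self, temperature: float = 0.07):
        super().__init__()
        self.temperature = temperature
        self.ce = nn.CrossEntropyLoss()

    def forward(self, feat_q: torch.Tensor, feat_k: torch.Tensor) -> torch.Tensor:
        # feat_q, feat_k: (num_patches, dim), L2-normalised
        logits = feat_q @ feat_k.t() / self.temperature            # (N, N)
        targets = torch.arange(feat_q.shape[0], device=feat_q.device)
        return self.ce(logits, targets)


def sample_patches(feat_map: torch.Tensor, num_patches: int = 256, ids=None):
    """Flatten a (B, C, H, W) feature map from a generator layer and sample
    patch locations. Passing the same `ids` for the source and translated
    images keeps query/positive pairs spatially aligned."""
    b, c, h, w = feat_map.shape
    flat = feat_map.permute(0, 2, 3, 1).reshape(-1, c)             # (B*H*W, C)
    if ids is None:
        ids = torch.randperm(flat.shape[0], device=feat_map.device)[:num_patches]
    return F.normalize(flat[ids], dim=1), ids

# Usage sketch: run the generator's encoder on the source photo x and on the
# translated output G(x), sample features at the same locations, and sum the
# loss over the chosen layers.
# feat_k, ids = sample_patches(encoder_feats_of_x)
# feat_q, _   = sample_patches(encoder_feats_of_Gx, ids=ids)
# loss = PatchNCELoss()(feat_q, feat_k)
```

As in CUT, the point is that the features come from the translation network's own encoder layers, not from an external pretrained model such as CLIP.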

Given the impressive generative capabilities of diffusion models nowadays, an alternative approach could be to use a diffusion model to generate pseudo paired data in the first stage instead of fine-tuning StyleGAN. This might yield better results for supervising GAN training.
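To illustrate that suggestion, a rough sketch of producing pseudo pairs with an off-the-shelf image-to-image diffusion pipeline could look like the following. This is only an illustration, not part of Scenimefy; the model id, prompt, and strength value are placeholders one would need to tune.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Any Stable Diffusion checkpoint works here; an anime-style fine-tune would be
# a natural choice. The model id below is just a placeholder.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def make_pseudo_pair(photo_path: str, strength: float = 0.45) -> Image.Image:
    """Translate a real photo into an anime-styled counterpart.

    A low `strength` keeps the layout of the source photo, so the output can
    serve as the anime half of a pseudo paired sample for supervising the
    second-stage translation network.
    """
    photo = Image.open(photo_path).convert("RGB").resize((512, 512))
    out = pipe(
        prompt="anime scenery, detailed painted background",
        image=photo,
        strength=strength,       # how far the result may drift from the photo
        guidance_scale=7.5,
    )
    return out.images[0]
```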