orpatashnik / StyleCLIP

Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)
MIT License

How to use StyleCLIP on a StyleGAN2 that was trained on a custom, non-FFHQ dataset? #75

Open sarmientoj24 opened 3 years ago

sarmientoj24 commented 3 years ago

Questions:

  1. I cannot seem to find any explicit instructions on how to use/train StyleCLIP on a trained StyleGAN2 (custom dataset, non-human face). Could you elaborate on this process? How do I incorporate the trained SG2?
  2. Do I still need to retrain parts of StyleCLIP for this process, or can I use it as-is?
orpatashnik commented 3 years ago

Hi @sarmientoj24 ,

Which StyleCLIP method are you trying to use?

sarmientoj24 commented 3 years ago

Hi @orpatashnik I am using all three methods. It seems like I need something similar to ArcFace for non-faces in order to get IdentityLoss.

  1. What alternative did you use for non-FFHQ/non-face domains (especially for method 1 and method 2)? Right now, I cannot get it to work on my trained StyleGAN2 model. I can generate images, but they do not really match the description. For example, with "a car with red wheels" there isn't really any difference between the original image and the one edited with StyleCLIP.
  2. How do I train method 2 with non-faces?
  3. How can I start latent optimization from a starting image that I generated with my own StyleGAN2?
orpatashnik commented 3 years ago

Hi @sarmientoj24 ,

Regarding the ID loss - I think that using only the L2 loss should be enough for both the optimization and the mapper. Humans are especially sensitive to human faces, so in that domain we make an extra effort to preserve identity.
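With the identity term dropped, the objective reduces to the CLIP distance plus an L2 penalty keeping the edited latent close to the starting one. A minimal NumPy sketch of that combined objective (the `clip_distance` scalar stands in for the CLIP-based loss computed elsewhere, and the `l2_lambda` default here is an illustrative value, not the repo's exact one):

```python
import numpy as np

def edit_loss(w, w_init, clip_distance, l2_lambda=0.008):
    """CLIP-guided edit objective for non-face domains.

    The ArcFace identity term is simply omitted, as suggested above.
    w, w_init: latent codes, e.g. W+ arrays of shape (1, n_layers, 512).
    clip_distance: scalar CLIP-space distance between the generated
    image and the text prompt (computed elsewhere).
    """
    l2 = float(np.sum((w - w_init) ** 2))  # keep the edit close to the start
    return clip_distance + l2_lambda * l2
```

Minimizing this over `w` trades off matching the text (first term) against staying near the original image (second term); raising `l2_lambda` yields more conservative edits.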

Regarding the lack of observed changes with the optimization - did you try playing with the hyperparameters? This method is sometimes sensitive to them.

Actually, I didn't try the mapper method on other domains, but adapting the code should not be too hard: you need to change the StyleGAN resolution and disable the identity loss. Did any further problems occur?
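The adaptation described above amounts to two settings: the generator output size and a zero identity-loss weight. A hedged configuration sketch of how that might look when invoking the mapper training script (the flag names and script path here are assumptions, not verified against the repo; check the mapper's options file in your checkout for the exact names):

```shell
# Hypothetical invocation; verify flag names against the mapper's
# train options in your checkout before running.
# --stylegan_size: match your generator's output resolution.
# --id_lambda 0:   disables the ArcFace identity loss for non-face domains.
python mapper/scripts/train.py \
  --exp_dir results/car_mapper \
  --description "a car with red wheels" \
  --stylegan_size 512 \
  --id_lambda 0
```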

If the image was generated by StyleGAN you can save the latent code and give it as the "--latent_path" argument.
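Concretely, the optimization script reads the starting latent with `torch.load`, so saving the W+ code at generation time and pointing `--latent_path` at it should work. A sketch, assuming a W+ latent of shape `(1, n_layers, 512)` (how you obtain that latent depends on your own StyleGAN2 fork; the random tensor below is just a stand-in):

```python
import torch

# Hypothetical generation step: however you sample from your StyleGAN2 fork,
# keep the mapped W+ latent around instead of discarding it.
n_layers = 18  # assumption: 18 style layers for a 1024px generator; adjust to yours
w_plus = torch.randn(1, n_layers, 512)  # stand-in for the latent that made your image

# Save it, then pass the file via the "--latent_path" argument mentioned above.
torch.save(w_plus, "my_image_latent.pt")

# Sanity check: the saved code round-trips unchanged.
loaded = torch.load("my_image_latent.pt")
assert torch.equal(loaded, w_plus)
```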

sarmientoj24 commented 3 years ago

[image] I tried "a bus with square wheels" with the default parameters and 300 epochs. Nothing really changed.

sarmientoj24 commented 3 years ago

Also, how do I save the latent code?

49xxy commented 2 years ago

> [image] I tried "a bus with square wheels" with the default parameters and 300 epochs. Nothing really changed.

May I ask whether your problem has been solved? I have also been studying the combination of Clip and non-face generation model recently.