orpatashnik / StyleCLIP

Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)
MIT License

Difficulty producing convincing results. #15

Closed: stolk closed this issue 3 years ago

stolk commented 3 years ago

I got this result with:

python3 main.py --description "Blue Hair" --mode=edit

[image: result of the edit]

Do I need to change something in the invocation to get better results?

orpatashnik commented 3 years ago

Hi @stolk,

Thanks for your interest in our project ☺️

As noted in the paper, using optimization to edit images can sometimes be a bit tricky. Moreover, the current version in the repo does not implement the identity loss, which can help obtain better results.
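(For context, the identity loss in the paper compares face-recognition embeddings of the original and the edited image, so the edit keeps the same person; the paper uses a pretrained ArcFace network for this. Below is a minimal sketch of such a term, with illustrative names rather than this repo's actual API.)

```python
# Minimal sketch of an identity-preservation loss (illustrative, not this repo's code).
# `arcface` is assumed to be a pretrained face-recognition embedder that returns one
# embedding vector per input image.
import torch.nn.functional as F

def identity_loss(arcface, img_original, img_edited):
    # Cosine distance between face embeddings: small when identity is preserved.
    e_orig = F.normalize(arcface(img_original), dim=-1)
    e_edit = F.normalize(arcface(img_edited), dim=-1)
    return 1.0 - (e_orig * e_edit).sum(dim=-1).mean()

# The total optimization objective would then look roughly like:
#   clip_loss + l2_lambda * (w - w_init).pow(2).sum() + id_lambda * identity_loss(...)
```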

We plan to update the optimization code and to upload the code for the mapper and the global directions (which are more stable) within the next few days. The wait will be worth it 😉

Until then, I would try lowering --l2_lambda a bit.

ModMorph commented 3 years ago

The other thing, @stolk, is that the FFHQ model has limited dataset diversity, including limited human diversity (it is not well balanced; when I used CLIP yesterday to find C-list movie stars who are Black or multiracial, they often looked whiter in the retrieved latent). That makes it harder to find in the latents anything the GAN has limited training data for, which is why you got that little slab of blue background instead of blue hair. With a broader training set, extending FFHQ via StyleGAN2-ADA and then using the improved model with StyleCLIP, much, much more is possible (but that is obviously a lot more work unless someone is willing to share an extended model with you). Without going that far, you can also try engineering the description phrase to emphasize the desired result (see the original CLIP project's notes on prompt engineering), but that's hit or miss.
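For example, a more explicit phrasing (hypothetical and untested here) can sometimes steer the edit better than a two-word prompt:

python3 main.py --description "a photo of a person with bright blue hair" --mode=edit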

I think I saw a similar problem trying purple hair yesterday: I got a purple blotch on the face rather than the intended edit. I haven't tried any hyperparameter tuning, so that's a helpful tip, thanks @orpatashnik. Other CLIP+GAN projects I've seen first select the starting seed that most closely matches the prompt before traversing the latents, although that adds to the search time. I have also built conditional GANs that give extra levers for controlling the output, though I haven't seen much of that in CLIP+GAN projects, since in theory it could be redundant (look at WikiArt for an example of how to condition StyleGAN2; the great thing about conditioning is that it helps mix style categories and get closer to the latent neighborhood you're looking for). Cheers!

[image attached]

stolk commented 3 years ago

Until then, I would try lowering --l2_lambda a bit.

Thanks @orpatashnik, with --l2_lambda=0.004 the results are much better!
[image: improved result]
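For anyone landing here later, that is just the original command with the lower L2 weight:

python3 main.py --description "Blue Hair" --mode=edit --l2_lambda=0.004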

orpatashnik commented 3 years ago

@ModMorph - Thanks for your comprehensive comment. I agree that StyleGAN can limit the expressiveness of our method. However, we found that with the driving text it is possible to generate images that are not very common when sampling an arbitrary image from StyleGAN; examples include the purple hair and the unique expression shown in the paper.

@stolk Did you try our new GUI? 😎

stolk commented 3 years ago

Still wrestling with conda.

UnsatisfiableError: The following specifications were found to be incompatible with the existing python installation in your environment:

Specifications:

Your python: python=3.8


orpatashnik commented 3 years ago

It seems your environment uses Python 3.8. Please try creating an environment with Python 3.7.
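For example (the environment name is arbitrary), and then reinstall the project's dependencies inside it:

conda create -n styleclip python=3.7
conda activate styleclip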

orpatashnik commented 3 years ago

Can I close the issue? 😊

stolk commented 3 years ago

Thanks for the tips.