PDillis / stylegan3-fun

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

Render Video of the Internal Representations using gen_video.py #21

Closed nuclearsugar closed 2 years ago

nuclearsugar commented 2 years ago

When using the StyleGAN3 interactive visualization tool, you can checkmark specific nodes to visualize the internal representations of the model. Here is an example:
https://github.com/NVlabs/stylegan3/blob/main/docs/stylegan3-teaser-1920x1006.png
https://nvlabs-fi-cdn.nvidia.com/_web/stylegan3/videos/video_8_internal_activations.mp4

But is it possible to specify and visualize these nodes using the gen_video.py script? I'm using StyleGAN3 within Google Colab and would like to render out a video of the internal representations of a specific sequence of seeds.

Also, thank you for releasing this amazing fork! I've been using it to train on very small datasets (500 to 1500 images), so the added mirrorY attribute has been useful, along with the "stabilize-video" attribute too. Here are some of my projects if you're curious.

PDillis commented 2 years ago

Great to see the repo is of use! Yes, making interpolations with the internal representations is actually the first item on my TODO list at the bottom, but now that you mention it, I have no idea why they didn't release this particular code before. Same with the stabilization: they had it all along and just let us go like madmen trying to find the solution.

Anyways, I'll bump up the internal representation interpolations, as you're not the first to ask for it (also I want it to be done lol). Will update when I have it!

nuclearsugar commented 2 years ago

Thanks very much! Do you have a Patreon?

Indeed, it's strange they didn't share the internal-representation rendering code, since they use it in their own demo videos.

PDillis commented 2 years ago

I don't have a Patreon, but I do have Github Sponsors, if you want to support me in updating this repo :)

https://github.com/sponsors/PDillis

nuclearsugar commented 2 years ago

Awesome, sponsored!

fbarretto commented 2 years ago

> When using the StyleGAN3 interactive visualization tool, you can checkmark specific nodes to visualize the internal representations of the model. Here is an example:
> https://github.com/NVlabs/stylegan3/blob/main/docs/stylegan3-teaser-1920x1006.png
> https://nvlabs-fi-cdn.nvidia.com/_web/stylegan3/videos/video_8_internal_activations.mp4
>
> But is it possible to specify and visualize these nodes using the gen_video.py script? I'm using StyleGAN3 within Google Colab and would like to render out a video of the internal representations of a specific sequence of seeds.
>
> Also, thank you for releasing this amazing fork! I've been using it to train on very small datasets (500 to 1500 images), so the added mirrorY attribute has been useful, along with the "stabilize-video" attribute too. Here are some of my projects if you're curious.

Nice videos, @nuclearsugar! Really enjoyed your loops. Have you generated 1:1 videos and then edited them to the desired aspect ratio, or are they generated at the presented aspect ratio?

@PDillis I've been testing and using your repo for the last week and it is VERY useful. I wonder if there is a Discord channel or something dedicated to this repo, so it would be easier for us to help you improve the code.

Cheers!

nuclearsugar commented 2 years ago

@fbarretto - Thanks! All of the StyleGAN-related video loops are trained and rendered out at a 1:1 aspect ratio. But the compilation edit shown on YouTube is cropped to 16:9, since that makes better use of a typical computer monitor. The torrent file, though, contains the 1:1 2048x2048 60fps videos.

PDillis commented 2 years ago

@nuclearsugar f474b9e (latest push in main) should let you generate images or random videos with the internal representations. An example of basic usage:

python generate.py random-video --network=ffhq1024 --cfg=stylegan3-t --seeds=0 --trunc=0.7 --anchor-latent-space \
    --layer=L11_1044_51 --rgb=True --compress --duration-sec=10.0

Result:

https://user-images.githubusercontent.com/24496178/183295797-60b1e447-5a3d-4f81-b590-b73eba189d71.mp4

If you want to know which layers are available on your model, add the flag `--available-layers`.
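For example, something along these lines (a sketch reusing the flags from the command above; the exact invocation may differ) should print the layer names you can pass to --layer:

python generate.py random-video --network=ffhq1024 --cfg=stylegan3-t --seeds=0 --available-layers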

Another example with --cfg=stylegan3-r and --layer=L11_1044_102:

https://user-images.githubusercontent.com/24496178/183297480-1bf03522-3c0a-45ad-b1f6-79f5178c8cd2.mp4

You can save the images as grayscale (--grayscale=True) or as RGB (--rgb=True). In short, --rgb=True will save the image as RGB using 3 consecutive channels at the selected layer, starting from --base-channel=0 (0-based indexing). Only one seed/image can be generated at a time, and it's a bit slower than normal image synthesis. Generating images (python generate.py images) also lets you save the image as RGBA (--rgba=True), using 4 consecutive channels. Check the rest of the params in the code or via --help (or ask away).
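For example, a sketch of an images call using the flags described above (reusing the layer name from the video example; exact defaults may differ):

python generate.py images --network=ffhq1024 --cfg=stylegan3-t --seeds=0 --trunc=0.7 \
    --layer=L11_1044_51 --rgba=True --base-channel=0

Here --base-channel=0 starts at the first of the 4 consecutive channels mapped to RGBA; increase it to slide along the channels of the selected layer.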

@fbarretto A Discord server sounds interesting! For the moment, I don't have anything but will start making one and then adding the link to the README.

nuclearsugar commented 2 years ago

Wow very exciting! Looks nicely documented too. Thanks for working on this.

I've been trying to get this to work in Google Colab and keep hitting the same issue. I've tried a few different Colab instances just to make sure. I'm embarrassed that it's such a basic error and yet I just cannot overcome it. Does this repo require a specific version of the imageio-ffmpeg library, or am I missing something else?

Prior to running the generate.py script, I execute !pip install imageio-ffmpeg... And yet I'm seeing the following error:

Traceback (most recent call last):
  File "generate.py", line 18, in <module>
    import moviepy.editor
  File "/usr/local/lib/python3.7/dist-packages/moviepy/editor.py", line 26, in <module>
    imageio.plugins.ffmpeg.download()
  File "/usr/local/lib/python3.7/dist-packages/imageio/plugins/ffmpeg.py", line 38, in download
    "imageio.ffmpeg.download() has been deprecated. "
RuntimeError: imageio.ffmpeg.download() has been deprecated. Use 'pip install imageio-ffmpeg' instead.

Below is the Colab Notebook that I've been using, if you'd like to easily try it yourself. https://colab.research.google.com/drive/1wf5geso52gSKvwggwgyWCFhznTtl8omd?usp=sharing

PDillis commented 2 years ago

Going by this thread, the problem comes from imageio itself. For what it's worth, I have imageio 2.9.0 on my Windows laptop and everything works, so you could try that. Otherwise, I just generated the video in the notebook by running

!pip install imageio==2.4.1

Note that if you want to compress the video (adding --compress), you need ffmpeg-python installed, so also run this before generating the video:

!pip install ffmpeg-python

I hadn't checked whether my code runs on Colab notebooks, so thank you for getting me started at least!
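In short, the Colab workaround boils down to running these before generate.py (a sketch; it assumes the repo itself is already set up in the notebook):

!pip install imageio==2.4.1    # older imageio avoids the deprecated imageio.ffmpeg.download() call in moviepy
!pip install ffmpeg-python     # only needed if you add --compress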

nuclearsugar commented 2 years ago

Thanks so much, that fixed it. I'm gonna have so much fun exploring the internal representations of the different models that I've been training over the last 10 months.

Heads up: in my initial testing I rendered out a video and it resulted in a single frozen frame for the entire video. Eventually I realized that setting this option below 10.0 was the culprit: --duration-sec=5.0. Not sure if that's a bug.

nuclearsugar commented 2 years ago

I'm trying to understand, at a conceptual level, what's being visualized here. Is the following correct? We select a single node in the neural network to visualize, pick 3 of its many channels, arbitrarily map those channels to RGB, and then perform a latent vector walk on a single seed.

Would it be possible to do a latent seed walk while visualizing an internal representation node, or is that a heavy request?

Also, I've been exploring the available options and am not seeing much change when adjusting --noise-mode. Mind sharing a little info about it?

PDillis commented 2 years ago

So, two things: random-video generates purely random latents and then smooths them out with a Gaussian blur, resulting in a loop. If you generate a short video, this loop will be quite small, giving the appearance of a 'frozen video' (it really isn't frozen, it's just hard to tell).
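For example (a sketch reusing the flags from the earlier command), giving the latents more time to vary should avoid the frozen look:

python generate.py random-video --network=ffhq1024 --cfg=stylegan3-t --seeds=0 --trunc=0.7 \
    --layer=L11_1044_51 --rgb=True --duration-sec=30.0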

Now, your understanding is correct. The seed in question is simply how you start the random generation of latents I mentioned above, which are then blurred/"averaged". A latent seed walk is possible via sightseeding.py; I reserve "latent walk" for something else in my personal code (basically, a walk between specific vectors rather than seeds, which I use here).

I'll continue updating the code so more parts of it will have the internal representations available, like sightseeding.py above. Hope this helped!

nuclearsugar commented 2 years ago

Very helpful, thanks!