anvoynov / GANLatentDiscovery

The authors' official implementation of Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

How much GPU memory do you need for StyleGANv2 config-f at 1024 resolution? #19

Closed: ChengBinJin closed this issue 3 years ago

ChengBinJin commented 3 years ago

@anvoynov I want to try training with StyleGANv2 on the W space, but it always runs out of memory, even though I tried different parameters. From your readme, it seems that you successfully ran StyleGANv2 config-f at 1024 resolution.

chi0tzp commented 3 years ago

@anvoynov If I may add to @ChengBinJin's question, could you also provide some insight into the number of GPUs used for the BigGAN and ProgGAN experiments? In the paper you state that "all the experiments were performed on the NVIDIA Tesla v100 card", but how many of them ran in parallel? ProgGAN in particular seems to need a lot of VRAM (my quick calculations give about 40 GB for batch_size=10).
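For anyone wanting to reproduce this kind of back-of-the-envelope estimate, here is a minimal sketch (not from the repo) that sums forward-pass activation sizes with hooks. `estimate_activation_memory` is a hypothetical helper; the `(512,)` latent shape is an assumption based on ProgGAN's usual 512-dim latent, and the result ignores gradients and optimizer state, so the real training footprint is typically a few times larger:

```python
import torch

def estimate_activation_memory(model, input_shape, batch_size=10, bytes_per_el=4):
    """Rough forward-pass activation footprint in GiB (FP32 by default).

    Hypothetical helper, not part of the repo. Counts leaf-module outputs
    only; gradients, optimizer state and workspace buffers are ignored.
    """
    total = 0

    def hook(_module, _inputs, output):
        nonlocal total
        outs = output if isinstance(output, (tuple, list)) else (output,)
        for t in outs:
            if torch.is_tensor(t):
                total += t.numel() * bytes_per_el

    handles = [
        m.register_forward_hook(hook)
        for m in model.modules()
        if len(list(m.children())) == 0  # leaf modules only, avoid double counting
    ]
    with torch.no_grad():
        model(torch.randn(batch_size, *input_shape))
    for h in handles:
        h.remove()
    return total / 1024 ** 3

# e.g. estimate_activation_memory(G, (512,), batch_size=10)
# (adjust input_shape to whatever latent shape the generator expects)
```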

ChengBinJin commented 3 years ago

@chi0tzp @anvoynov I tried a V100 with 16 GB for config-f of StyleGANv2 at 1024 resolution, and it always runs out of memory. Therefore, I plan to try 512x512 instead of 1024x1024 and will leave feedback after trying it.

anvoynov commented 3 years ago

We launched it on a single Tesla V100 with 32 GB and batch size 16 (10 for ProgGAN). To reduce memory utilization, consider passing shift_predictor_size=256 to force the generated images to be downscaled before shift prediction.
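For intuition, here is a minimal sketch of what that downscaling amounts to; it is not the repo's exact code, and `G`, `shift_predictor`, `z`, `shifted_z` are placeholder names. The point is that the shift predictor's activations (and their gradients) are computed at 256x256 instead of 1024x1024:

```python
import torch.nn.functional as F

def predict_shift_downscaled(G, shift_predictor, z, shifted_z, size=256):
    # Generate the image pair at full resolution, e.g. [B, 3, 1024, 1024]
    # for StyleGANv2 config-f.
    img = G(z)
    img_shifted = G(shifted_z)
    # Downscale both images before the shift predictor, which is what
    # shift_predictor_size=256 effectively does (assumption: the repo
    # uses a comparable interpolation).
    img = F.interpolate(img, size=(size, size), mode='bilinear',
                        align_corners=False)
    img_shifted = F.interpolate(img_shifted, size=(size, size),
                                mode='bilinear', align_corners=False)
    return shift_predictor(img, img_shifted)
```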

chi0tzp commented 3 years ago

@anvoynov Thanks for letting us know. Personally, I don't have access to a 32 GB V100, only to a pair of 16 GB V100s. I hope this doesn't affect the experiment much (at least in terms of VRAM usage).

chi0tzp commented 3 years ago

Just to let you know (mainly @ChengBinJin), the maximum batch size that I could use for training the ProgGAN model with two 16 GB V100 cards is 6. In the code, the only model that is parallelized across multiple GPUs is the generator G; I don't know what would happen if I parallelized the other models as well (basically the shift predictor). Apparently, this is not an issue when you have 32 GB of VRAM on a single card instead of two cards with 16 GB each. I suggest this issue can be closed now (@ChengBinJin's and @anvoynov's call), but I will get back and let you know if I decide to parallelize the shift predictor as well; a sketch of what that could look like is below.
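A minimal sketch of that idea with torch.nn.DataParallel, mirroring how the repo already wraps G. Whether this actually balances memory across the two 16 GB cards is exactly the open question above, and the variable names are placeholders:

```python
import torch
import torch.nn as nn

if torch.cuda.device_count() > 1:
    G = nn.DataParallel(G)                               # already done in the repo
    shift_predictor = nn.DataParallel(shift_predictor)   # the proposed addition
```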