autonomousvision / stylegan-xl

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
MIT License
961 stars 113 forks

a question about expected training time #36

Closed Youzebin closed 2 years ago

Youzebin commented 2 years ago

Hello, xl-sr. Thank you for your great project. When I train the stem at resolution 16 on a single V100 with batch_size = 8, I get 120 sec/kimg. Is that expected? When I train on the same zip with StyleGAN2-ADA at batch_size = 32, I get 8 sec/kimg. Is StyleGAN-XL really this much slower? Can you answer my question? Thank you very much, I really need your help.

Youzebin commented 2 years ago

Can you release your expected training times? I think this would help us.

Youzebin commented 2 years ago

I have another question. You mention in the paper that it is best to set the batch size to 2048 for resolutions between 16 and 64, but when I set a total batch size of 32 across 2 GPUs with a per-GPU batch size of 16, a single GPU with 16 GB of video memory already runs out of memory. Doesn't the batch size suggested in the paper require far too much video memory?

xl-sr commented 2 years ago

StyleGAN-XL is larger than StyleGAN2/3. If you want to have the same model size, you can pass:

--cbase 16384 --cmax 256 --syn_layers 4

and for superresolution stages:

--head_layers 4
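
Putting these flags together, a stem training command might look like the following. This is only an illustrative sketch: the `train.py` entry point, `--cfg`, `--outdir`, and `--data` values are assumptions based on the repo's usual command-line interface, and the dataset path is a placeholder you would replace with your own.

```shell
# Hypothetical example: train a 16x16 stem with a StyleGAN2/3-comparable
# model size. Paths and --cfg/--data values are placeholders; --cbase,
# --cmax, and --syn_layers are the size-reducing flags described above.
python train.py --outdir=./training-runs --cfg=stylegan3-t \
  --data=./data/dataset16.zip --gpus=1 \
  --batch=256 --batch-gpu=16 \
  --cbase 16384 --cmax 256 --syn_layers 4
```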

Re. batch size: if you want to train with a batch size of 2048 on a single GPU, you do this via:

--batch=2048 --batch-gpu 16

This way you do batch accumulation, i.e., you accumulate gradients 128 times to reach a final batch size of 2048. If you only have a single GPU available, you can also start with a batch size of 256; the results won't be much worse. Also, for smaller or less diverse datasets, you can start directly with much smaller batch sizes.
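
The accumulation arithmetic above can be sketched in a few lines of framework-agnostic Python. This is a toy illustration, not the repo's actual training loop: `micro_batch_grad` is a hypothetical stand-in for a real backward pass, and the point is only that averaging gradients over `batch // batch_gpu` micro-batches is equivalent to one optimizer step on the full batch.

```python
# Toy sketch of gradient accumulation (hypothetical, not StyleGAN-XL code).
batch = 2048                       # effective batch size (--batch)
batch_gpu = 16                     # micro-batch fitting in memory (--batch-gpu)
accum_steps = batch // batch_gpu   # number of accumulation rounds: 128

def micro_batch_grad(step):
    # Stand-in for a real backward pass; returns a toy gradient value.
    return float(step % 4)

grad = 0.0
for step in range(accum_steps):
    grad += micro_batch_grad(step)   # accumulate micro-batch gradients
grad /= accum_steps                  # average -> one effective large-batch step
```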

Youzebin commented 2 years ago

Thank you for your help.