Open gegogi opened 2 years ago
If you used the 'auto' config, it will use 2 mapping layers by default. This can be seen at the beginning of training when you transferred from FFHQ256, as the whole images must've been 'pinkish' and with some weird expressions, as the code was adapting the 8 mapping layers of the pretrained model into the 2 mapping layers of the new one you trained.
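To illustrate the adaptation described above, here is a minimal sketch. It assumes the resume step copies tensors by matching parameter names and silently skips names that the new network does not have, which is my understanding of the repo's behaviour rather than something confirmed in this thread. The dict-based `copy_matching` helper is hypothetical and only stands in for the real tensor copy.

```python
def copy_matching(src, dst):
    """Copy entries from src into dst for names present in both; return the copied names."""
    copied = []
    for name in dst:
        if name in src:
            dst[name] = src[name]
            copied.append(name)
    return copied

# Pretrained mapping network: 8 FC layers (fc0..fc7). Values are placeholders, not tensors.
pretrained = {f"mapping.fc{i}.weight": f"pretrained_fc{i}" for i in range(8)}
# New model built with the 'auto' config: only 2 FC layers.
new_model = {f"mapping.fc{i}.weight": "random_init" for i in range(2)}

copied = copy_matching(pretrained, new_model)
dropped = sorted(set(pretrained) - set(copied))

print(copied)        # ['mapping.fc0.weight', 'mapping.fc1.weight']
print(len(dropped))  # 6
```

Under that assumption, only fc0 and fc1 are reused and the other six pretrained layers are dropped, so the synthesis network initially receives W vectors it was never trained on.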
I also noticed today that the Auto config seems to be hard-coded to have generator layers = 2. I thought the Auto spec values were all dynamically configured, but this seems not to be the case for layers? Wondering why; it seems like a documentation error at least, as I was under the impression that auto was recommended for custom datasets and that all the config values would be dynamically determined if auto was chosen. ... As I am training custom datasets, I have been exclusively using Auto. Now that I have noticed this, I am wondering if my model could benefit from a higher number of mapping layers ... It seems there is no specific override for the layers spec value; I would either have to use one of the non-auto configs or modify a local copy of the code to expose layers specifically, is that correct? I can do that if the extra layers would be of benefit. Any guideline on the benefit / non-benefit of extra mapping layers? Fwiw I am training a custom dataset with resolution 2048x2048.
If I decide to add mapping layers, should I start training all over from scratch rather than resume training from a pkl snapshot created with only 2 layers?
hmm ... so I just tried changing to the "stylegan2" cfg, which specifies 8 mapping layers, but now I get an OOM on Colab Pro+. Not sure if the OOM is because of the extra layers or because of some other side effect of switching from the auto to the stylegan2 config. sigh
  File "/content/stylegan2-ada-pytorch/training/networks.py", line 168, in forward
    x = bias_act.bias_act(x, b, act=self.activation, gain=act_gain, clamp=act_clamp)
  File "/content/stylegan2-ada-pytorch/torch_utils/ops/bias_act.py", line 88, in bias_act
    return _bias_act_cuda(dim=dim, act=act, alpha=alpha, gain=gain, clamp=clamp).apply(x, b)
  File "/content/stylegan2-ada-pytorch/torch_utils/ops/bias_act.py", line 153, in forward
    y = _plugin.bias_act(x, b, _null_tensor, _null_tensor, _null_tensor, 0, dim, spec.cuda_idx, alpha, gain, clamp)
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 15.78 GiB total capacity; 13.95 GiB already allocated; 214.75 MiB free; 14.12 GiB reserved in total by PyTorch)
fwiw, I modified my local copy so that the "Auto" cfg is set to 8 mapping layers, and I did not (yet) get the OOM that I got with cfg="stylegan2", so the extra memory seems to be a side effect of some other value in the "stylegan2" cfg besides the map=8 value ... thankfully
cfg='stylegan2' is a bit of a hard setting, as it also has mb=32, which is a lot more compared to the auto setting. The latter most likely gets you mb=2 due to your image size, unless you modified that part. The original StyleGAN paper tested with different mapping layers (Table 4), and their results are generally better with 8, but it's not always the case (also remember, this is an academic paper and they wish to get the SOTA results, so the difference is actually negligible for regular users). When I tried with 2 mapping layers, the final model was more 'expressive' than the one with 8 mapping layers, but again this could've been a fluke.
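For reference, a rough sketch of how the auto config appears to pick the batch size. The formula below is written from my recollection of the repo's train.py, so treat it as an assumption and check it against your local copy; `auto_minibatch` is a hypothetical helper name, not a function in the repo.

```python
def auto_minibatch(res, gpus=1):
    """Assumed 'auto' heuristic: shrink the total batch as resolution grows, at least 1 image per GPU."""
    return max(min(gpus * min(4096 // res, 32), 64), gpus)

for res in (256, 1024, 2048):
    print(res, auto_minibatch(res, gpus=1))
# 256 16
# 1024 4
# 2048 2
```

If that heuristic holds, a single GPU at 2048x2048 gets mb=2, versus the fixed mb=32 of the 'stylegan2' config, which would explain the OOM independently of the number of mapping layers.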
I've seen Aydao use 4 mapping layers with 1024 neurons each, which, among other things, gives greater capacity to the final model than the vanilla network. You can read a summary of the changes in Gwern's blog. In any case, 2 mapping layers should be fine, so long as you are happy with the final model.
thanks ... I will experiment with various mapping depths. I wonder what you mean by the 2-layer model being more expressive? As in more painterly, less photorealistic perhaps? Anyway, I will see how they compare on my datasets. As for the 4 mapping layers with 1024, I assume you mean instead of the 512 that is here? As in 4 layers of 1024 as opposed to 2-8 of 512?
mapping.fc0    262656    -    [2, 512]    float32
mapping.fc1    262656    -    [2, 512]    float32
@PDillis As an aside, looking at Aydao's page I see he has a mod called stylegan surgery which says it supports non-square images, which is of interest to me. But it seems that must be for TensorFlow, not PyTorch. Are you aware of any stylegan2-ada-pytorch mods with non-square aspect ratio support?
[EDIT] found this mod which may do the non-square images https://pythonrepo.com/repo/eps696-stylegan2ada
Many don't train a rectangular model; instead, what is usually more common is to resize the images to a square, train the model, generate new images, and then just resize these images into the non-square format you had. This way, you avoid having to train a larger model (which is slower), plus the resulting images are still quite good. If you still want a rectangular model, Vadim Epstein's repository is a good resource.
Regarding a more 'expressive' model, remember that GANs tend to better learn the 'mean' image in your dataset. What I meant was that the one with 2 FC layers ended up being able to generate more modes or different types of images compared to an 8 FC one, but again, it might've been a fluke. Your experiments will tell you which model you prefer in the end. If you want to test with 4 FC layers with 1024 neurons each, for example, you do it here:
args.G_kwargs.mapping_kwargs.num_layers = 4
args.G_kwargs.mapping_kwargs.layer_features = 1024
(you can also modify the latent dimension of Z and W, number of layers in W, etc., which mapping_kwargs takes; the full list of kwargs can be found here).
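To get a feel for what these overrides change in size, here is a small sketch that counts mapping-network parameters. It assumes an unconditional model (no label embedding), a 512-dimensional Z and W, and that every layer except the last outputs layer_features while the last outputs the W dimension; `mapping_params` is a hypothetical helper, not part of the repo.

```python
def mapping_params(num_layers, layer_features=512, z_dim=512, w_dim=512):
    """Per-layer parameter counts (weights + biases) of a stack of fully connected layers."""
    dims = [z_dim] + [layer_features] * (num_layers - 1) + [w_dim]
    return [n_in * n_out + n_out for n_in, n_out in zip(dims[:-1], dims[1:])]

print(mapping_params(2))                            # [262656, 262656]
print(sum(mapping_params(8)))                       # 2101248
print(sum(mapping_params(4, layer_features=1024)))  # 3149312
```

The 2-layer result matches the 262656 parameters reported for mapping.fc0 and mapping.fc1 in this thread. In every case the mapping network stays small, so its memory cost should be minor compared with the synthesis network at high resolution.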
very good, thank you Diego
for rectangle - https://github.com/eps696/stylegan2ada non-square aspect ratio support (auto-picked from dataset; resolution must be divisible by 2**n, such as 512x256, 1280x768, etc.)
Describe the bug
I am trying to convert my model to a format which is compatible with the sefa (https://github.com/genforce/sefa) layer naming convention. During the conversion process, I found my model has only 2 layers in the latent space mapping module (z->w). To my knowledge from the paper, this is supposed to be 8, and thus the conversion fails. I made my model by transfer learning, starting from pretrained/transfer-learning-source-nets/ffhq-res256-mirror-paper256-noaug.pkl, which itself has 8 mapping layers as expected. So it is strange that my model has only 2 mapping layers instead of 8 although it is derived from there. You can see there are only two fc* layers below.