NitayGitHub opened this issue 1 week ago

Describe the bug
I wanted to use transfer learning from a 256x256 .pkl, and my dataset contains 256x256 images, yet I got this error.
I believe the model shown in the log above is 512x512. You need to use a 256x256 model instead, such as: https://api.ngc.nvidia.com/v2/models/org/nvidia/team/research/stylegan2/1/files?redirect=true&path=stylegan2-ffhq-256x256.pkl
Also, when training a 256x256 model, be sure to include the following option in your training parameters:
--cbase=16384
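For context on why this flag matters: in the official StyleGAN2/StyleGAN3 code, the per-resolution channel counts of the networks are derived from channel_base (which --cbase sets) as min(channel_base // resolution, channel_max). A minimal sketch of that formula (channel_max=512 is the repo default; the helper below is illustrative, not repo code) shows why a network built with the default --cbase=32768 cannot load weights trained with 16384:

# Sketch: per-resolution channel counts, following the formula used in the
# official StyleGAN2/StyleGAN3 networks: min(channel_base // res, channel_max).
def channel_counts(channel_base, channel_max=512, max_res=256):
    resolutions = [2 ** i for i in range(2, max_res.bit_length())]  # 4, 8, ..., 256
    return {res: min(channel_base // res, channel_max) for res in resolutions}

print(channel_counts(32768))  # default --cbase: {..., 64: 512, 128: 256, 256: 128}
print(channel_counts(16384))  # --cbase=16384:  {..., 64: 256, 128: 128, 256: 64}

The layer shapes differ at every resolution above 32, so copying weights from the 256x256 checkpoint fails unless --cbase matches.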
I ran

!python train.py --outdir=./results --cbase=16384 --snap=10 --img-snap=10 --cfg=stylegan2 --data=./datasets/FH.zip --augpipe=bgc --gpus=2 --metrics=None --gamma=12 --batch=16 --resume='https://api.ngc.nvidia.com/v2/models/org/nvidia/team/research/stylegan2/1/files?redirect=true&path=stylegan2-ffhq-256x256.pkl'

and got the same issue.
I realize that I gave you that URL for the 256x256 model, but it's not a valid download link.
Try the code below (as documented here).
!python train.py --outdir=./results --cbase=16384 --snap=10 --img-snap=10 --cfg=stylegan2 --data=./datasets/FH.zip --augpipe=bgc --gpus=2 --metrics=None --gamma=12 --batch=16 --resume=ffhq256
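For reference, the ffhq256 shorthand works because train.py keeps a small table of known resume specs and expands them into full download URLs before training starts. A minimal sketch of that pattern, modeled on the stylegan2-ada-pytorch train.py (the dict name, the variable names, and the placeholder URL are assumptions, not verbatim code from this repo):

# Sketch: shorthand resume specs are expanded to full URLs inside train.py.
resume_specs = {
    'ffhq256': '<full download URL of the FFHQ 256x256 pickle>',  # placeholder
}
if opts.resume in resume_specs:
    resume_pkl = resume_specs[opts.resume]  # known shorthand
else:
    resume_pkl = opts.resume  # otherwise treated as a raw URL or local path

This is also why pasting the NGC web URL into --resume fails: as noted above, it is not a valid direct download link, while the expanded spec entries point at actual pickles.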
The error comes from the dimensionality of the latent space; at the top of your configuration you have: "G_kwargs": {..., "z_dim": 256, "w_dim": 256, ...}. This is bizarre, as we set up the correct dimensionality here (and it is the one the pre-trained models are using). Perhaps these values are being changed somewhere else; I'll have to look into it, since the train.py file only changes this value for --cfg=stylegan2-ext.
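One quick way to see where the mismatch comes from is to load the checkpoint you are resuming from and print its latent dimensions. A minimal sketch, using the pickle-loading pattern from the official README (run from the repo root so dnnlib/torch_utils are importable; the local filename is a placeholder):

# Sketch: inspect the pretrained pickle's dimensions before fine-tuning.
import pickle

with open('stylegan2-ffhq-256x256.pkl', 'rb') as f:  # placeholder local copy
    G = pickle.load(f)['G_ema']

print(G.z_dim, G.w_dim, G.img_resolution)  # must match your training config

If these values disagree with the "z_dim"/"w_dim" in your training config, resuming will fail no matter what the dataset looks like.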
I changed "z_dim" and "w_dim" to 256, thinking it might help but it didn't. However, I believe the problem was with the dataset I used where for some reason some images were not in 256x256 size. I added if img.size != (256, 256): img = img.resize((256, 256))
and it fixed it. Although torchvision transforms.RandomCrop(size=256) should have made sure all images are in 256x256 resolution, it didn't.
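If other wrong-sized images might still be hiding in the source folder, a small pre-pass can catch them before the dataset archive is built. A minimal sketch using PIL (the folder names and glob pattern are placeholders):

# Sketch: write a copy of every source image at exactly 256x256.
from pathlib import Path
from PIL import Image

src_dir = Path('./datasets/FH_raw')    # hypothetical source folder
dst_dir = Path('./datasets/FH_fixed')  # hypothetical output folder
dst_dir.mkdir(parents=True, exist_ok=True)

for path in src_dir.glob('*.png'):
    with Image.open(path) as img:
        out = img if img.size == (256, 256) else img.resize((256, 256), Image.LANCZOS)
        out.save(dst_dir / path.name)  # img.size is (width, height) in PIL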
Yeah, you need to exactly match the model you are fine-tuning from; otherwise there's no way to use the weights. As for the reshaping of your data: do you mean you used dataset_tool.py and it still resulted in images of different sizes, or do you have another pipeline there?
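For reference, dataset_tool.py can enforce the output resolution while it builds the archive, which avoids this class of problem entirely. A typical invocation (paths are placeholders; the flag is --resolution=256x256 in the StyleGAN3 repo, while stylegan2-ada-pytorch uses --width=256 --height=256, so check your version):

!python dataset_tool.py --source=./raw_images --dest=./datasets/FH.zip --resolution=256x256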
Actually, it seems the real fix was adding --cbase=16384.
Indeed, in my experience --cbase=16384 is required when fine-tuning a 256x256 model; otherwise it will throw an error.
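For anyone who lands here later: --cbase typically just sets channel_base for both the generator and the discriminator in the training config, which is why it has to match the checkpoint you resume from. A one-line sketch of that wiring, modeled on the StyleGAN3 train.py (variable names may differ in forks):

# Sketch: --cbase feeds the networks' channel_base in StyleGAN3-style train.py.
c.G_kwargs.channel_base = c.D_kwargs.channel_base = opts.cbase  # e.g. 16384

With the default 32768, the rebuilt network's layer shapes no longer match the 256x256 checkpoint and resuming fails.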