EnricoBeltramo opened this issue 2 years ago
This is an interesting problem, but not too straightforward I'm afraid. What I think could be done is to start a new network, copy the weights at the resolutions you want, leave the rest (if any) randomly initialized, and then continue training from there. Note that this might not work in StyleGAN3, as both the number of channels and the shapes of layers at the same 'resolution' will change depending on the final output resolution. For example, using FFHQU, the layers and shapes at 256x256 are (python generate.py images --network=ffhqu256 --cfg=stylegan3-r --seeds=0 --available-layers):
Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L5_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L6_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L7_148_724 => Channels: 724 => Size: [148, 148]
Name: L8_148_512 => Channels: 512 => Size: [148, 148]
Name: L9_148_362 => Channels: 362 => Size: [148, 148]
Name: L10_276_256 => Channels: 256 => Size: [276, 276]
Name: L11_276_181 => Channels: 181 => Size: [276, 276]
Name: L12_276_128 => Channels: 128 => Size: [276, 276]
Name: L13_256_128 => Channels: 128 => Size: [256, 256]
Name: L14_256_3 => Channels: 3 => Size: [256, 256]
Name: output => Channels: 3 => Size: [256, 256]
At 1024x1024 resolution, these are (python generate.py images --network=ffhqu1024 --cfg=stylegan3-r --seeds=0 --available-layers):
Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L5_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L6_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L7_276_645 => Channels: 645 => Size: [276, 276]
Name: L8_276_406 => Channels: 406 => Size: [276, 276]
Name: L9_532_256 => Channels: 256 => Size: [532, 532]
Name: L10_1044_161 => Channels: 161 => Size: [1044, 1044]
Name: L11_1044_102 => Channels: 102 => Size: [1044, 1044]
Name: L12_1044_64 => Channels: 64 => Size: [1044, 1044]
Name: L13_1024_64 => Channels: 64 => Size: [1024, 1024]
Name: L14_1024_3 => Channels: 3 => Size: [1024, 1024]
Name: output => Channels: 3 => Size: [1024, 1024]
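Transcribing the two listings into (size, channels) pairs makes the divergence easy to pin down programmatically (a small illustrative check, not part of the repo; the tuples below are copied from the L0..L14 rows above):

```python
# (spatial size, channels) for L0..L14, transcribed from the two listings
ffhqu256  = [(36, 1024), (36, 1024), (36, 1024), (52, 1024), (52, 1024),
             (84, 1024), (84, 1024), (148, 724), (148, 512), (148, 362),
             (276, 256), (276, 181), (276, 128), (256, 128), (256, 3)]
ffhqu1024 = [(36, 1024), (36, 1024), (52, 1024), (52, 1024), (84, 1024),
             (148, 1024), (148, 1024), (276, 645), (276, 406), (532, 256),
             (1044, 161), (1044, 102), (1044, 64), (1024, 64), (1024, 3)]

pairs = list(zip(ffhqu256, ffhqu1024))
# first layer index where the spatial shapes disagree
first_size_diff = next(i for i, (a, b) in enumerate(pairs) if a[0] != b[0])
# first layer index where the channel counts disagree
first_chan_diff = next(i for i, (a, b) in enumerate(pairs) if a[1] != b[1])
print(first_size_diff)  # 2 -> spatial shapes already diverge at L2
print(first_chan_diff)  # 7 -> channel counts diverge at L7 (724 vs 645)
```

So even before the channel counts split at L7, the spatial shapes stop matching at L2, which is why a name-and-shape match between the two configurations recovers so little.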
So the number of channels at layer 7 already differs by 79 from what the smaller model needs, not to mention the spatial shape. So perhaps this is easier to do with --cfg=stylegan2 (as each block is a power of 2 and has a torgb operation), which is why @aydao has it in his repo here. I'd be happy to port it over to PyTorch, but let me know if this is what you want before I delve into it.
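The "copy what matches" step described above could be sketched in plain PyTorch along these lines (a minimal illustration, not the stylegan3 codebase; transfer_matching and the toy Sequential generators are hypothetical stand-ins):

```python
import torch
import torch.nn as nn

def transfer_matching(src_state, dst_model):
    """Copy every tensor from src_state into dst_model wherever both the
    parameter name and the tensor shape agree; anything that does not
    match keeps its fresh random initialization."""
    dst_state = dst_model.state_dict()
    copied = []
    for name, tensor in src_state.items():
        if name in dst_state and dst_state[name].shape == tensor.shape:
            dst_state[name] = tensor
            copied.append(name)
    dst_model.load_state_dict(dst_state)
    return copied

# Toy stand-ins: the first layer agrees between the two "resolutions",
# the second differs in width (as L7 does between ffhqu256 and ffhqu1024).
src = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 4))
dst = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 7))
copied = transfer_matching(src.state_dict(), dst)
print(copied)  # ['0.weight', '0.bias'] -- only the matching layer transfers
```

After the transfer, training would continue on dst as usual; the mismatched layers simply start from scratch.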
Originally I was planning to use StyleGAN3, but I have the same model pretrained for StyleGAN2, so I guess I could use that too! This feature would be very helpful in my opinion, if you can implement it. Thank you!
Is your feature request related to a problem? Please describe. Is it possible to load a pretrained model at a different resolution? I have a model pretrained at 512x512 and I would like to start from it to train a new one at 256x256.
Describe the solution you'd like Automatic recovery of previously trained layers, when they match.
Describe alternatives you've considered Resizing the images, but training at 512 requires too much time.