PDillis / stylegan3-fun

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

Start from pretrained at different resolution #24

Open EnricoBeltramo opened 2 years ago

EnricoBeltramo commented 2 years ago

Is your feature request related to a problem? Please describe. Is it possible to load a pretrained model at a different resolution? I have a model pretrained at 512x512 and I would like to start from it to train a new one at 256x256.

Describe the solution you'd like Automatic reuse of the previously trained layers whenever they match.

Describe alternatives you've considered Resizing the images, but training at 512x512 requires too much time.

PDillis commented 2 years ago

This is an interesting problem, but not too straightforward I'm afraid. What I think could be done is to start a new network, copy the weights of the layers that match at the resolution you want, leave the rest (if any) randomly initialized, and then continue training from there (a rough sketch of this idea is at the end of this comment). Note that this might not work in StyleGAN3, as both the number of channels and the shapes of layers at the same 'resolution' change depending on the final output resolution. For example, using FFHQU, the layers and shapes at 256x256 are (python generate.py images --network=ffhqu256 --cfg=stylegan3-r --seeds=0 --available-layers):

Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L5_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L6_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L7_148_724 => Channels: 724 => Size: [148, 148]
Name: L8_148_512 => Channels: 512 => Size: [148, 148]
Name: L9_148_362 => Channels: 362 => Size: [148, 148]
Name: L10_276_256 => Channels: 256 => Size: [276, 276]
Name: L11_276_181 => Channels: 181 => Size: [276, 276]
Name: L12_276_128 => Channels: 128 => Size: [276, 276]
Name: L13_256_128 => Channels: 128 => Size: [256, 256]
Name: L14_256_3 => Channels: 3 => Size: [256, 256]
Name: output => Channels: 3 => Size: [256, 256]

At 1024x1024 resolution, these are (python generate.py images --network=ffhqu1024 --cfg=stylegan3-r --seeds=0 --available-layers):

Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L5_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L6_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L7_276_645 => Channels: 645 => Size: [276, 276]
Name: L8_276_406 => Channels: 406 => Size: [276, 276]
Name: L9_532_256 => Channels: 256 => Size: [532, 532]
Name: L10_1044_161 => Channels: 161 => Size: [1044, 1044]
Name: L11_1044_102 => Channels: 102 => Size: [1044, 1044]
Name: L12_1044_64 => Channels: 64 => Size: [1044, 1044]
Name: L13_1024_64 => Channels: 64 => Size: [1024, 1024]
Name: L14_1024_3 => Channels: 3 => Size: [1024, 1024]
Name: output => Channels: 3 => Size: [1024, 1024]

So at layer 7 the number of channels already differs by 79 from what the smaller model needs (645 vs. 724), not to mention the shape (276x276 vs. 148x148). Perhaps this is easier to do with --cfg=stylegan2 (as each block sits at a power-of-2 resolution and has a torgb operation), which is why @aydao has it in his repo here. I'd be happy to port it over to PyTorch, but let me know if this is what you want before delving into it.
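For reference, a minimal sketch of the 'copy what matches, leave the rest as initialized' idea, assuming you have already loaded the pretrained generator as G_src and constructed a fresh generator at the new resolution as G_dst (both as PyTorch modules); copy_matching_params is just a hypothetical helper for illustration, not something that exists in this repo:

import torch

def copy_matching_params(src: torch.nn.Module, dst: torch.nn.Module) -> None:
    # Copy every parameter/buffer whose name AND shape match between the two
    # networks; everything else in dst keeps its (random) initialization.
    src_state = src.state_dict()
    dst_state = dst.state_dict()
    copied, skipped = [], []
    for name, dst_tensor in dst_state.items():
        src_tensor = src_state.get(name)
        if src_tensor is not None and src_tensor.shape == dst_tensor.shape:
            dst_state[name] = src_tensor.clone()
            copied.append(name)
        else:
            skipped.append(name)
    dst.load_state_dict(dst_state)
    print(f'Copied {len(copied)} tensors; left {len(skipped)} as initialized.')

# Usage (assuming G_src is the 512x512 pretrained G_ema and G_dst a freshly
# constructed 256x256 generator):
# copy_matching_params(G_src, G_dst)

In stylegan3-r the name/shape check would reject most of the synthesis layers for exactly the reasons shown in the listings above, whereas in stylegan2 whole blocks at matching resolutions should transfer.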

EnricoBeltramo commented 2 years ago

Originally I was planning to use StyleGAN3, but I have the same model pretrained for StyleGAN2, so I guess I could use that too! In my opinion this feature would be very helpful if you can add it. Thank you!