justinpinkney / toonify

How to train with 512x512 or higher #16

Closed: youjin-c closed this issue 2 years ago

youjin-c commented 2 years ago

Hello @justinpinkney, thank you for your awesome project! I am trying out Toonify with some other StyleGAN models, starting with some pre-trained ones. It seems the current blending is done with the 256x256 NASA model. When I tried the 512x512 cartoon model, I got an error like the one below: [image: error screenshot]

Are there any hints or comments on how to fix this?

Best regards, Youjin Chung

justinpinkney commented 2 years ago

Are both models 512x512?

youjin-c commented 2 years ago

No, I don't think so. So that was the problem. Thank you! Let me play with them further and come back. I appreciate your help.

youjin-c commented 2 years ago

Hello @justinpinkney, I am now training a 512x512 model with your StyleGAN2 repo, and I tried to blend one snapshot with an FFHQ 512x512 model. I printed out the conv layers to see what both models look like, and they seem to match. However, I still get the same error as above.

Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Loading... Done.
[('G_synthesis/4x4/Const/const', '4x4', 0, 0), ('G_synthesis/4x4/ToRGB/weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_bias', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/bias', '4x4', 1, 1), ('G_synthesis/8x8/Conv0_up/weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/noise_strength', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv1/weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/noise_strength', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/bias', '8x8', 1, 3), ('G_synthesis/16x16/Conv0_up/weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/noise_strength', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv1/weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/noise_strength', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/bias', '16x16', 1, 5), ('G_synthesis/32x32/Conv0_up/weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_bias', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/noise_strength', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/bias', '32x32', 0, 6), 
('G_synthesis/32x32/Conv1/weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/noise_strength', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/bias', '32x32', 1, 7), ('G_synthesis/64x64/Conv0_up/weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/noise_strength', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv1/weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/noise_strength', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/bias', '64x64', 1, 9), ('G_synthesis/128x128/Conv0_up/weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/noise_strength', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv1/weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_bias', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/noise_strength', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/bias', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_bias', '128x128', 1, 
11), ('G_synthesis/128x128/ToRGB/bias', '128x128', 1, 11), ('G_synthesis/256x256/Conv0_up/weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/noise_strength', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv1/weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/noise_strength', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/bias', '256x256', 1, 13), ('G_synthesis/512x512/Conv0_up/weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/noise_strength', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv1/weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/noise_strength', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/bias', '512x512', 1, 15)]
[('G_synthesis/4x4/Const/const', '4x4', 0, 0), ('G_synthesis/4x4/ToRGB/weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_bias', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/bias', '4x4', 1, 1), ('G_synthesis/8x8/Conv0_up/weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/noise_strength', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv1/weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/noise_strength', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/bias', '8x8', 1, 3), ('G_synthesis/16x16/Conv0_up/weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/noise_strength', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv1/weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/noise_strength', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/bias', '16x16', 1, 5), ('G_synthesis/32x32/Conv0_up/weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_bias', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/noise_strength', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/bias', '32x32', 0, 6), 
('G_synthesis/32x32/Conv1/weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/noise_strength', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/bias', '32x32', 1, 7), ('G_synthesis/64x64/Conv0_up/weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/noise_strength', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv1/weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/noise_strength', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/bias', '64x64', 1, 9), ('G_synthesis/128x128/Conv0_up/weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/noise_strength', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv1/weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_bias', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/noise_strength', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/bias', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_bias', '128x128', 1, 
11), ('G_synthesis/128x128/ToRGB/bias', '128x128', 1, 11), ('G_synthesis/256x256/Conv0_up/weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/noise_strength', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv1/weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/noise_strength', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/bias', '256x256', 1, 13), ('G_synthesis/512x512/Conv0_up/weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/noise_strength', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv1/weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/noise_strength', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/bias', '512x512', 1, 15)]
Blending G_synthesis/4x4/Const/const by 0
Blending G_synthesis/4x4/ToRGB/weight by 0
Blending G_synthesis/4x4/ToRGB/mod_weight by 0
Blending G_synthesis/4x4/ToRGB/mod_bias by 0
Blending G_synthesis/4x4/ToRGB/bias by 0
Blending G_synthesis/8x8/Conv0_up/weight by 0
Blending G_synthesis/8x8/Conv0_up/mod_weight by 0
Blending G_synthesis/8x8/Conv0_up/mod_bias by 0
Blending G_synthesis/8x8/Conv0_up/noise_strength by 0
Blending G_synthesis/8x8/Conv0_up/bias by 0
Blending G_synthesis/8x8/Conv1/weight by 0
Blending G_synthesis/8x8/Conv1/mod_weight by 0
Blending G_synthesis/8x8/Conv1/mod_bias by 0
Blending G_synthesis/8x8/Conv1/noise_strength by 0
Blending G_synthesis/8x8/Conv1/bias by 0
Blending G_synthesis/8x8/ToRGB/weight by 0
Blending G_synthesis/8x8/ToRGB/mod_weight by 0
Blending G_synthesis/8x8/ToRGB/mod_bias by 0
Blending G_synthesis/8x8/ToRGB/bias by 0
Blending G_synthesis/16x16/Conv0_up/weight by 1
Blending G_synthesis/16x16/Conv0_up/mod_weight by 1
Blending G_synthesis/16x16/Conv0_up/mod_bias by 1
Blending G_synthesis/16x16/Conv0_up/noise_strength by 1
Blending G_synthesis/16x16/Conv0_up/bias by 1
Blending G_synthesis/16x16/Conv1/weight by 1
Blending G_synthesis/16x16/Conv1/mod_weight by 1
Blending G_synthesis/16x16/Conv1/mod_bias by 1
Blending G_synthesis/16x16/Conv1/noise_strength by 1
Blending G_synthesis/16x16/Conv1/bias by 1
Blending G_synthesis/16x16/ToRGB/weight by 1
Blending G_synthesis/16x16/ToRGB/mod_weight by 1
Blending G_synthesis/16x16/ToRGB/mod_bias by 1
Blending G_synthesis/16x16/ToRGB/bias by 1
Blending G_synthesis/32x32/Conv0_up/weight by 1
Blending G_synthesis/32x32/Conv0_up/mod_weight by 1
Blending G_synthesis/32x32/Conv0_up/mod_bias by 1
Blending G_synthesis/32x32/Conv0_up/noise_strength by 1
Blending G_synthesis/32x32/Conv0_up/bias by 1
Blending G_synthesis/32x32/Conv1/weight by 1
Blending G_synthesis/32x32/Conv1/mod_weight by 1
Blending G_synthesis/32x32/Conv1/mod_bias by 1
Blending G_synthesis/32x32/Conv1/noise_strength by 1
Blending G_synthesis/32x32/Conv1/bias by 1
Blending G_synthesis/32x32/ToRGB/weight by 1
Blending G_synthesis/32x32/ToRGB/mod_weight by 1
Blending G_synthesis/32x32/ToRGB/mod_bias by 1
Blending G_synthesis/32x32/ToRGB/bias by 1
Blending G_synthesis/64x64/Conv0_up/weight by 1
Blending G_synthesis/64x64/Conv0_up/mod_weight by 1
Blending G_synthesis/64x64/Conv0_up/mod_bias by 1
Blending G_synthesis/64x64/Conv0_up/noise_strength by 1
Blending G_synthesis/64x64/Conv0_up/bias by 1
Blending G_synthesis/64x64/Conv1/weight by 1
Blending G_synthesis/64x64/Conv1/mod_weight by 1
Blending G_synthesis/64x64/Conv1/mod_bias by 1
Blending G_synthesis/64x64/Conv1/noise_strength by 1
Blending G_synthesis/64x64/Conv1/bias by 1
Blending G_synthesis/64x64/ToRGB/weight by 1
Blending G_synthesis/64x64/ToRGB/mod_weight by 1
Blending G_synthesis/64x64/ToRGB/mod_bias by 1
Blending G_synthesis/64x64/ToRGB/bias by 1
Blending G_synthesis/128x128/Conv0_up/weight by 1
Blending G_synthesis/128x128/Conv0_up/mod_weight by 1
Blending G_synthesis/128x128/Conv0_up/mod_bias by 1
Blending G_synthesis/128x128/Conv0_up/noise_strength by 1
Blending G_synthesis/128x128/Conv0_up/bias by 1
Blending G_synthesis/128x128/Conv1/weight by 1
Blending G_synthesis/128x128/Conv1/mod_weight by 1
Blending G_synthesis/128x128/Conv1/mod_bias by 1
Blending G_synthesis/128x128/Conv1/noise_strength by 1
Blending G_synthesis/128x128/Conv1/bias by 1
Blending G_synthesis/128x128/ToRGB/weight by 1
Blending G_synthesis/128x128/ToRGB/mod_weight by 1
Blending G_synthesis/128x128/ToRGB/mod_bias by 1
Blending G_synthesis/128x128/ToRGB/bias by 1
Blending G_synthesis/256x256/Conv0_up/weight by 1
Blending G_synthesis/256x256/Conv0_up/mod_weight by 1
Blending G_synthesis/256x256/Conv0_up/mod_bias by 1
Blending G_synthesis/256x256/Conv0_up/noise_strength by 1
Blending G_synthesis/256x256/Conv0_up/bias by 1
Blending G_synthesis/256x256/Conv1/weight by 1
Blending G_synthesis/256x256/Conv1/mod_weight by 1
Blending G_synthesis/256x256/Conv1/mod_bias by 1
Blending G_synthesis/256x256/Conv1/noise_strength by 1
Blending G_synthesis/256x256/Conv1/bias by 1
Blending G_synthesis/256x256/ToRGB/weight by 1
Blending G_synthesis/256x256/ToRGB/mod_weight by 1
Blending G_synthesis/256x256/ToRGB/mod_bias by 1
Blending G_synthesis/256x256/ToRGB/bias by 1
Blending G_synthesis/512x512/Conv0_up/weight by 1
Blending G_synthesis/512x512/Conv0_up/mod_weight by 1
Blending G_synthesis/512x512/Conv0_up/mod_bias by 1
Blending G_synthesis/512x512/Conv0_up/noise_strength by 1
Blending G_synthesis/512x512/Conv0_up/bias by 1
Blending G_synthesis/512x512/Conv1/weight by 1
Blending G_synthesis/512x512/Conv1/mod_weight by 1
Blending G_synthesis/512x512/Conv1/mod_bias by 1
Blending G_synthesis/512x512/Conv1/noise_strength by 1
Blending G_synthesis/512x512/Conv1/bias by 1
Blending G_synthesis/512x512/ToRGB/weight by 1
Blending G_synthesis/512x512/ToRGB/mod_weight by 1
Blending G_synthesis/512x512/ToRGB/mod_bias by 1
Blending G_synthesis/512x512/ToRGB/bias by 1
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
   1606   try:
-> 1607     c_op = c_api.TF_FinishOperation(op_desc)
   1608   except errors.InvalidArgumentError as e:

InvalidArgumentError: Dimensions must be equal, but are 512 and 256 for 'add_47' (op: 'AddV2') with input shapes: [3,3,512,512], [3,3,512,256].

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
12 frames
/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
   1608   except errors.InvalidArgumentError as e:
   1609     # Convert to ValueError for backwards compatibility.
-> 1610     raise ValueError(str(e))
   1611 
   1612   return c_op

ValueError: Dimensions must be equal, but are 512 and 256 for 'add_47' (op: 'AddV2') with input shapes: [3,3,512,512], [3,3,512,256].

Also, here is the command I am using to train the model, using all 512x512 images.

!python run_training.py --num-gpus=1 --data-dir='/content/dataset' --config=config-e --dataset=low_poly --mirror-augment=true --metric=none --total-kimg=800 --min-h=4 --min-w=4 --res-log2=7 --result-dir='/content/Google/My Drive/stylegan2/results'
Local submit - run_dir: /content/Google/My Drive/stylegan2/results/00002-stylegan2-low_poly-1gpu-config-e
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
Dataset shape = [3, 512, 512]
Dynamic range = [0, 255]
Label size    = 0
Constructing networks...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Loading... Done.

G                             Params    OutputShape         WeightShape     
---                           ---       ---                 ---             
latents_in                    -         (?, 512)            -               
labels_in                     -         (?, 0)              -               
lod                           -         ()                  -               
dlatent_avg                   -         (512,)              -               
G_mapping/latents_in          -         (?, 512)            -               
G_mapping/labels_in           -         (?, 0)              -               
G_mapping/Normalize           -         (?, 512)            -               
G_mapping/Dense0              262656    (?, 512)            (512, 512)      
G_mapping/Dense1              262656    (?, 512)            (512, 512)      
G_mapping/Dense2              262656    (?, 512)            (512, 512)      
G_mapping/Dense3              262656    (?, 512)            (512, 512)      
G_mapping/Dense4              262656    (?, 512)            (512, 512)      
G_mapping/Dense5              262656    (?, 512)            (512, 512)      
G_mapping/Dense6              262656    (?, 512)            (512, 512)      
G_mapping/Dense7              262656    (?, 512)            (512, 512)      
G_mapping/Broadcast           -         (?, 16, 512)        -               
G_mapping/dlatents_out        -         (?, 16, 512)        -               
Truncation/Lerp               -         (?, 16, 512)        -               
G_synthesis/dlatents_in       -         (?, 16, 512)        -               
G_synthesis/4x4/Const         8192      (?, 512, 4, 4)      (1, 512, 4, 4)  
G_synthesis/4x4/Conv          2622465   (?, 512, 4, 4)      (3, 3, 512, 512)
G_synthesis/4x4/ToRGB         264195    (?, 3, 4, 4)        (1, 1, 512, 3)  
G_synthesis/8x8/Conv0_up      2622465   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Conv1         2622465   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Upsample      -         (?, 3, 8, 8)        -               
G_synthesis/8x8/ToRGB         264195    (?, 3, 8, 8)        (1, 1, 512, 3)  
G_synthesis/16x16/Conv0_up    2622465   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Conv1       2622465   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Upsample    -         (?, 3, 16, 16)      -               
G_synthesis/16x16/ToRGB       264195    (?, 3, 16, 16)      (1, 1, 512, 3)  
G_synthesis/32x32/Conv0_up    2622465   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Conv1       2622465   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Upsample    -         (?, 3, 32, 32)      -               
G_synthesis/32x32/ToRGB       264195    (?, 3, 32, 32)      (1, 1, 512, 3)  
G_synthesis/64x64/Conv0_up    1442561   (?, 256, 64, 64)    (3, 3, 512, 256)
G_synthesis/64x64/Conv1       721409    (?, 256, 64, 64)    (3, 3, 256, 256)
G_synthesis/64x64/Upsample    -         (?, 3, 64, 64)      -               
G_synthesis/64x64/ToRGB       132099    (?, 3, 64, 64)      (1, 1, 256, 3)  
G_synthesis/128x128/Conv0_up  426369    (?, 128, 128, 128)  (3, 3, 256, 128)
G_synthesis/128x128/Conv1     213249    (?, 128, 128, 128)  (3, 3, 128, 128)
G_synthesis/128x128/Upsample  -         (?, 3, 128, 128)    -               
G_synthesis/128x128/ToRGB     66051     (?, 3, 128, 128)    (1, 1, 128, 3)  
G_synthesis/256x256/Conv0_up  139457    (?, 64, 256, 256)   (3, 3, 128, 64) 
G_synthesis/256x256/Conv1     69761     (?, 64, 256, 256)   (3, 3, 64, 64)  
G_synthesis/256x256/Upsample  -         (?, 3, 256, 256)    -               
G_synthesis/256x256/ToRGB     33027     (?, 3, 256, 256)    (1, 1, 64, 3)   
G_synthesis/512x512/Conv0_up  51297     (?, 32, 512, 512)   (3, 3, 64, 32)  
G_synthesis/512x512/Conv1     25665     (?, 32, 512, 512)   (3, 3, 32, 32)  
G_synthesis/512x512/Upsample  -         (?, 3, 512, 512)    -               
G_synthesis/512x512/ToRGB     16515     (?, 3, 512, 512)    (1, 1, 32, 3)   
G_synthesis/images_out        -         (?, 3, 512, 512)    -               
G_synthesis/noise0            -         (1, 1, 4, 4)        -               
G_synthesis/noise1            -         (1, 1, 8, 8)        -               
G_synthesis/noise2            -         (1, 1, 8, 8)        -               
G_synthesis/noise3            -         (1, 1, 16, 16)      -               
G_synthesis/noise4            -         (1, 1, 16, 16)      -               
G_synthesis/noise5            -         (1, 1, 32, 32)      -               
G_synthesis/noise6            -         (1, 1, 32, 32)      -               
G_synthesis/noise7            -         (1, 1, 64, 64)      -               
G_synthesis/noise8            -         (1, 1, 64, 64)      -               
G_synthesis/noise9            -         (1, 1, 128, 128)    -               
G_synthesis/noise10           -         (1, 1, 128, 128)    -               
G_synthesis/noise11           -         (1, 1, 256, 256)    -               
G_synthesis/noise12           -         (1, 1, 256, 256)    -               
G_synthesis/noise13           -         (1, 1, 512, 512)    -               
G_synthesis/noise14           -         (1, 1, 512, 512)    -               
images_out                    -         (?, 3, 512, 512)    -               
---                           ---       ---                 ---             
Total                         24860935                                      

D                    Params    OutputShape         WeightShape     
---                  ---       ---                 ---             
images_in            -         (?, 3, 512, 512)    -               
labels_in            -         (?, 0)              -               
512x512/FromRGB      128       (?, 32, 512, 512)   (1, 1, 3, 32)   
512x512/Conv0        9248      (?, 32, 512, 512)   (3, 3, 32, 32)  
512x512/Conv1_down   18496     (?, 64, 256, 256)   (3, 3, 32, 64)  
512x512/Skip         2048      (?, 64, 256, 256)   (1, 1, 32, 64)  
256x256/Conv0        36928     (?, 64, 256, 256)   (3, 3, 64, 64)  
256x256/Conv1_down   73856     (?, 128, 128, 128)  (3, 3, 64, 128) 
256x256/Skip         8192      (?, 128, 128, 128)  (1, 1, 64, 128) 
128x128/Conv0        147584    (?, 128, 128, 128)  (3, 3, 128, 128)
128x128/Conv1_down   295168    (?, 256, 64, 64)    (3, 3, 128, 256)
128x128/Skip         32768     (?, 256, 64, 64)    (1, 1, 128, 256)
64x64/Conv0          590080    (?, 256, 64, 64)    (3, 3, 256, 256)
64x64/Conv1_down     1180160   (?, 512, 32, 32)    (3, 3, 256, 512)
64x64/Skip           131072    (?, 512, 32, 32)    (1, 1, 256, 512)
32x32/Conv0          2359808   (?, 512, 32, 32)    (3, 3, 512, 512)
32x32/Conv1_down     2359808   (?, 512, 16, 16)    (3, 3, 512, 512)
32x32/Skip           262144    (?, 512, 16, 16)    (1, 1, 512, 512)
16x16/Conv0          2359808   (?, 512, 16, 16)    (3, 3, 512, 512)
16x16/Conv1_down     2359808   (?, 512, 8, 8)      (3, 3, 512, 512)
16x16/Skip           262144    (?, 512, 8, 8)      (1, 1, 512, 512)
8x8/Conv0            2359808   (?, 512, 8, 8)      (3, 3, 512, 512)
8x8/Conv1_down       2359808   (?, 512, 4, 4)      (3, 3, 512, 512)
8x8/Skip             262144    (?, 512, 4, 4)      (1, 1, 512, 512)
4x4/MinibatchStddev  -         (?, 513, 4, 4)      -               
4x4/Conv             2364416   (?, 512, 4, 4)      (3, 3, 513, 512)
4x4/Dense0           4194816   (?, 512)            (8192, 512)     
Output               513       (?, 1)              (512, 1)        
scores_out           -         (?, 1)              -               
---                  ---       ---                 ---             
Total                24030753  

Any help or hint would be appreciated. Thank you.
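The name lists printed above only compare layer names and resolutions, not tensor shapes, so two generators can produce identical lists and still fail when their weights are added. A minimal sketch of a shape check that could be run before blending is shown here. The function `find_shape_mismatches` and the two dictionaries are hypothetical and not part of blend_models.py; the example shapes are taken from the training log and the error message above, and attributing the mismatch to 64x64/Conv0_up is an assumption.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    shapes_a: Dict[str, Shape], shapes_b: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, shape_a, shape_b) for every shared variable whose shapes differ."""
    mismatches = []
    for name, shape_a in shapes_a.items():
        shape_b = shapes_b.get(name)
        # Only variables present in both models can be blended at all.
        if shape_b is not None and tuple(shape_a) != tuple(shape_b):
            mismatches.append((name, tuple(shape_a), tuple(shape_b)))
    return mismatches


# Illustrative shapes only. The first dict follows the training log above;
# the second assumes the other model holds the [3,3,512,512] weight from the error.
trained_shapes = {
    "G_synthesis/32x32/Conv1/weight": (3, 3, 512, 512),
    "G_synthesis/64x64/Conv0_up/weight": (3, 3, 512, 256),
}
ffhq_shapes = {
    "G_synthesis/32x32/Conv1/weight": (3, 3, 512, 512),
    "G_synthesis/64x64/Conv0_up/weight": (3, 3, 512, 512),
}

mismatches = find_shape_mismatches(trained_shapes, ffhq_shapes)
for name, shape_a, shape_b in mismatches:
    print(f"{name}: {shape_a} vs {shape_b}")
```

With real networks, the two dictionaries would have to be built from each generator's variables. How to do that depends on the loading code and is left out of this sketch.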

youjin-c commented 2 years ago

This is a bit weird and interesting. I tried to train a 256x256 model instead of 512x512, and blending worked fine.

justinpinkney commented 2 years ago

The number of channels doesn't match: judging by your training log, it's 256 in the model you trained (see 64x64/Conv0_up), but it appears to be 512 in the FFHQ model you used. Which FFHQ model did you use? For this to work, you need to fine-tune (i.e. resume training) from the FFHQ model, not start from scratch.
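A rough sketch of where a channel mismatch like this can come from is given below. It assumes the StyleGAN2 synthesis network sets the channel count at each stage to `fmap_base / 2**stage`, clipped to `[fmap_min, fmap_max]`, and that config-e uses half the `fmap_base` of config-f (`8 << 10` versus `16 << 10`). Both are assumptions about the training code, not something stated in this thread, and the helper names are hypothetical.

```python
def num_feature_maps(stage, fmap_base, fmap_decay=1.0, fmap_min=1, fmap_max=512):
    """Channel count at a synthesis stage: fmap_base / 2**(stage * decay), clipped."""
    raw = int(fmap_base / (2.0 ** (stage * fmap_decay)))
    return max(fmap_min, min(raw, fmap_max))


def channels_by_resolution(fmap_base, max_res_log2=9):
    """Map each resolution (4, 8, ..., 512) to its channel count."""
    # The stage index is res_log2 - 1, so the 4x4 block (res_log2 = 2) is stage 1.
    return {2 ** r: num_feature_maps(r - 1, fmap_base) for r in range(2, max_res_log2 + 1)}


small = channels_by_resolution(8 << 10)   # assumed fmap_base for config-e
large = channels_by_resolution(16 << 10)  # assumed fmap_base for config-f

for res in small:
    flag = "" if small[res] == large[res] else "  <- differs"
    print(f"{res}x{res}: {small[res]} vs {large[res]}{flag}")
```

Under these assumptions both variants have 512 channels up to 32x32 and first diverge at 64x64 (256 versus 512). That is consistent with the training log above, which lists 64x64/Conv0_up with weight shape (3, 3, 512, 256), and with the shapes in the error message.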

youjin-c commented 2 years ago

Hello @justinpinkney! Thank you so much for such a quick reply! I used the 512x512 FFHQ model, ffhq-512-avg-tpurun1.pkl (faces (FFHQ config-f 512x512)), from your pretrained StyleGAN2 zoo. The two lists I printed above suggest that both models are 512x512. They are the return values of the extract_conv_names(model) function in blend_models.py from your StyleGAN2 repo:

model_1_names = extract_conv_names(model_1)  # my trained StyleGAN2 512x512 model
model_2_names = extract_conv_names(model_2)  # ffhq-512-avg-tpurun1.pkl
model_1_names
[('G_synthesis/4x4/Const/const', '4x4', 0, 0), ('G_synthesis/4x4/ToRGB/weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_bias', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/bias', '4x4', 1, 1), ('G_synthesis/8x8/Conv0_up/weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/noise_strength', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv1/weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/noise_strength', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/bias', '8x8', 1, 3), ('G_synthesis/16x16/Conv0_up/weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/noise_strength', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv1/weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/noise_strength', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/bias', '16x16', 1, 5), ('G_synthesis/32x32/Conv0_up/weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_bias', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/noise_strength', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/bias', '32x32', 0, 6), 
('G_synthesis/32x32/Conv1/weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/noise_strength', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/bias', '32x32', 1, 7), ('G_synthesis/64x64/Conv0_up/weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/noise_strength', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv1/weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/noise_strength', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/bias', '64x64', 1, 9), ('G_synthesis/128x128/Conv0_up/weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/noise_strength', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv1/weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_bias', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/noise_strength', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/bias', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_bias', '128x128', 1, 
11), ('G_synthesis/128x128/ToRGB/bias', '128x128', 1, 11), ('G_synthesis/256x256/Conv0_up/weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/noise_strength', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv1/weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/noise_strength', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/bias', '256x256', 1, 13), ('G_synthesis/512x512/Conv0_up/weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/noise_strength', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv1/weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/noise_strength', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/bias', '512x512', 1, 15)]
model_2_names
[('G_synthesis/4x4/Const/const', '4x4', 0, 0), ('G_synthesis/4x4/ToRGB/weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_weight', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/mod_bias', '4x4', 1, 1), ('G_synthesis/4x4/ToRGB/bias', '4x4', 1, 1), ('G_synthesis/8x8/Conv0_up/weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_weight', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/mod_bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/noise_strength', '8x8', 0, 2), ('G_synthesis/8x8/Conv0_up/bias', '8x8', 0, 2), ('G_synthesis/8x8/Conv1/weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/noise_strength', '8x8', 1, 3), ('G_synthesis/8x8/Conv1/bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_weight', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/mod_bias', '8x8', 1, 3), ('G_synthesis/8x8/ToRGB/bias', '8x8', 1, 3), ('G_synthesis/16x16/Conv0_up/weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_weight', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/mod_bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/noise_strength', '16x16', 0, 4), ('G_synthesis/16x16/Conv0_up/bias', '16x16', 0, 4), ('G_synthesis/16x16/Conv1/weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/noise_strength', '16x16', 1, 5), ('G_synthesis/16x16/Conv1/bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_weight', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/mod_bias', '16x16', 1, 5), ('G_synthesis/16x16/ToRGB/bias', '16x16', 1, 5), ('G_synthesis/32x32/Conv0_up/weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_weight', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/mod_bias', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/noise_strength', '32x32', 0, 6), ('G_synthesis/32x32/Conv0_up/bias', '32x32', 0, 6), 
('G_synthesis/32x32/Conv1/weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/noise_strength', '32x32', 1, 7), ('G_synthesis/32x32/Conv1/bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_weight', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/mod_bias', '32x32', 1, 7), ('G_synthesis/32x32/ToRGB/bias', '32x32', 1, 7), ('G_synthesis/64x64/Conv0_up/weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_weight', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/mod_bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/noise_strength', '64x64', 0, 8), ('G_synthesis/64x64/Conv0_up/bias', '64x64', 0, 8), ('G_synthesis/64x64/Conv1/weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/noise_strength', '64x64', 1, 9), ('G_synthesis/64x64/Conv1/bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_weight', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/mod_bias', '64x64', 1, 9), ('G_synthesis/64x64/ToRGB/bias', '64x64', 1, 9), ('G_synthesis/128x128/Conv0_up/weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_weight', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/mod_bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/noise_strength', '128x128', 0, 10), ('G_synthesis/128x128/Conv0_up/bias', '128x128', 0, 10), ('G_synthesis/128x128/Conv1/weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/mod_bias', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/noise_strength', '128x128', 1, 11), ('G_synthesis/128x128/Conv1/bias', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_weight', '128x128', 1, 11), ('G_synthesis/128x128/ToRGB/mod_bias', '128x128', 1, 
11), ('G_synthesis/128x128/ToRGB/bias', '128x128', 1, 11), ('G_synthesis/256x256/Conv0_up/weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_weight', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/mod_bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/noise_strength', '256x256', 0, 12), ('G_synthesis/256x256/Conv0_up/bias', '256x256', 0, 12), ('G_synthesis/256x256/Conv1/weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/noise_strength', '256x256', 1, 13), ('G_synthesis/256x256/Conv1/bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_weight', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/mod_bias', '256x256', 1, 13), ('G_synthesis/256x256/ToRGB/bias', '256x256', 1, 13), ('G_synthesis/512x512/Conv0_up/weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_weight', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/mod_bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/noise_strength', '512x512', 0, 14), ('G_synthesis/512x512/Conv0_up/bias', '512x512', 0, 14), ('G_synthesis/512x512/Conv1/weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/noise_strength', '512x512', 1, 15), ('G_synthesis/512x512/Conv1/bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_weight', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/mod_bias', '512x512', 1, 15), ('G_synthesis/512x512/ToRGB/bias', '512x512', 1, 15)]

So I think I matched the resolution of the two models. But let me double-check to make sure.

I also tried blending a 256x256 custom model with the faces (FFHQ config-e 256x256) model yesterday, and it worked fine.

justinpinkney commented 2 years ago

The model you used is config-f. The one you trained is config-e. They have the same resolution but a different number of channels inside the network, so the layer names match while the weight shapes do not. It's a bit confusing because the channel numbers are also 256 and 512.
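
As a rough illustration of why two 512x512 models can still differ, here is a small sketch of how the channel count per resolution falls out of the feature-map budget. It assumes the defaults as I recall them from the StyleGAN2 reference code (`fmap_base = 16 << 10` for config-f, `8 << 10` for configs a to e, `fmap_max = 512`); the helper `channels_per_resolution` is made up for this example and is not part of either repo.

```python
def nf(stage, fmap_base, fmap_max=512, fmap_min=1):
    """Channel count for a stage: halve the budget each stage, then clamp."""
    return max(fmap_min, min(int(fmap_base / (2.0 ** stage)), fmap_max))


def channels_per_resolution(resolution, fmap_base):
    """Hypothetical helper: map each block ('4x4' ... '512x512') to its channels."""
    res_log2 = resolution.bit_length() - 1  # 512 -> 9
    return {
        f"{2 ** r}x{2 ** r}": nf(r - 1, fmap_base)
        for r in range(2, res_log2 + 1)
    }


# Assumed budgets: config-e uses half the feature maps of config-f.
config_e = channels_per_resolution(512, 8 << 10)
config_f = channels_per_resolution(512, 16 << 10)

# Blocks whose channel counts differ, even though the layer names are identical.
mismatched = [res for res in config_f if config_e[res] != config_f[res]]

for res in config_f:
    print(f"{res:>8}  config-e: {config_e[res]:>3}  config-f: {config_f[res]:>3}")
print("mismatched blocks:", mismatched)
```

Under those assumptions the two configs agree up to 32x32 and diverge from 64x64 upward, which is why printing only the layer names shows no difference. Comparing the variable shapes of the two networks would expose the mismatch.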

youjin-c commented 2 years ago

I got it! So I need to train with --config=config-f to blend with the faces (FFHQ config-f 512x512) model. Thank you so much, @justinpinkney. Things are much clearer now. You are the best!

Best, Youjin