harlanhong / CVPR2022-DaGAN

Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
https://harlanhong.github.io/publications/dagan.html

testing error #27

Closed Ha0Tang closed 1 year ago

Ha0Tang commented 2 years ago

When I run this command:

    CUDA_VISIBLE_DEVICES=0 python demo.py --config config/vox-adv-256.yaml --driving_video ./example_video.mp4 --source_image ./example_image.png --checkpoint ./checkpoints/SPADE_DaGAN_vox_adv_256.pth.tar --relative --adapt_scale --kp_num 15 --generator SPADEDepthAwareGenerator --result_video results/example_out.mp4 --find_best_frame

I got the following error:

    Traceback (most recent call last):
      File "demo.py", line 169, in <module>
        depth_encoder.load_state_dict(filtered_dict_enc)
      File "/home/miniconda3/envs/dagan/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for ResnetEncoder:
        size mismatch for encoder.layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
        size mismatch for encoder.layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
        size mismatch for encoder.layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
        size mismatch for encoder.layer2.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 1, 1]).
        size mismatch for encoder.layer2.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for encoder.layer2.0.downsample.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for encoder.layer2.0.downsample.1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for encoder.layer2.0.downsample.1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for encoder.layer2.1.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
        size mismatch for encoder.layer3.0.conv1.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
        size mismatch for encoder.layer3.0.downsample.0.weight: copying a param with shape torch.Size([1024, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]).
        size mismatch for encoder.layer3.0.downsample.1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
        size mismatch for encoder.layer3.0.downsample.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
        size mismatch for encoder.layer3.0.downsample.1.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
        size mismatch for encoder.layer3.0.downsample.1.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
        size mismatch for encoder.layer3.1.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
        size mismatch for encoder.layer4.0.conv1.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 3, 3]).
        size mismatch for encoder.layer4.0.downsample.0.weight: copying a param with shape torch.Size([2048, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]).
        size mismatch for encoder.layer4.0.downsample.1.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
        size mismatch for encoder.layer4.0.downsample.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
        size mismatch for encoder.layer4.0.downsample.1.running_mean: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
        size mismatch for encoder.layer4.0.downsample.1.running_var: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
        size mismatch for encoder.layer4.1.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).
        size mismatch for encoder.fc.weight: copying a param with shape torch.Size([1000, 2048]) from checkpoint, the shape in current model is torch.Size([1000, 512]).

harlanhong commented 2 years ago

Sorry, my mistake. If you are using the depth_face_model_Voxceleb2_10w depth model, you should set this line https://github.com/harlanhong/CVPR2022-DaGAN/blob/c594a2eb43d9f74fac4f783de26aa7ee3e5cbe35/demo.py#L164 to 50 layers, that is: `depth_encoder = depth.ResnetEncoder(50, False)`
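
For reference, here is a minimal sketch of what the loading section around demo.py#L164 looks like after that change. The checkpoint path is illustrative, and the filtering step is an assumption based on the `filtered_dict_enc` name that appears in the traceback above:

```python
import torch
import depth  # DaGAN's depth-estimation module

# depth_face_model_Voxceleb2_10w was trained with a ResNet-50 backbone,
# so the encoder must be built with 50 layers rather than the default 18:
depth_encoder = depth.ResnetEncoder(50, False)

# Illustrative path; point this at the downloaded depth checkpoint.
loaded_dict_enc = torch.load('depth/models/depth_face_model_Voxceleb2_10w/encoder.pth',
                             map_location='cpu')

# Keep only the keys the encoder actually defines, dropping any extra
# metadata entries stored in the checkpoint:
filtered_dict_enc = {k: v for k, v in loaded_dict_enc.items()
                     if k in depth_encoder.state_dict()}
depth_encoder.load_state_dict(filtered_dict_enc)
depth_encoder.eval()
```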

muxiddin19 commented 2 years ago

> Sorry, my mistake. If you are using the depth_face_model_Voxceleb2_10w depth model, you should set this line
> https://github.com/harlanhong/CVPR2022-DaGAN/blob/c594a2eb43d9f74fac4f783de26aa7ee3e5cbe35/demo.py#L164
> to 50 layers, that is: `depth_encoder = depth.ResnetEncoder(50, False)`

Thanks for your great work! When I ran DepthAwareGenerator with the 18-layer encoder, it ran without error. However, when running SPADEDepthAwareGenerator with the 50-layer encoder, I got an error. I am uploading some captures of the screen. Could you give some tips toward a solution?

harlanhong commented 2 years ago

My mistake. Let me make it clearer.

There are two types of depth checkpoints on the OneDrive:

If you use the first one, you should write the code as: `depth_encoder = depth.ResnetEncoder(18, False)`

If you use the second one, depth_face_model_Voxceleb2_10w, you should use the 50-layer architecture: `depth_encoder = depth.ResnetEncoder(50, False)`
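
If you are not sure which of the two checkpoints you downloaded, one way to check (this helper is not part of the repo, just a suggestion) is to read the backbone width out of the checkpoint itself; as the error above shows, the fc layer has 512 input features for ResNet-18 and 2048 for ResNet-50:

```python
import torch

def guess_resnet_layers(encoder_ckpt_path):
    """Infer the ResNet depth (18 or 50) from a depth-encoder checkpoint."""
    state = torch.load(encoder_ckpt_path, map_location='cpu')
    fc_in_features = state['encoder.fc.weight'].shape[1]  # 512 or 2048
    return {512: 18, 2048: 50}[fc_in_features]

# Usage (path illustrative):
# depth_encoder = depth.ResnetEncoder(guess_resnet_layers('encoder.pth'), False)
```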

Thanks.