inference.py RuntimeError: Error(s) in loading state_dict for Wav2Lip:

(venv) F:\work\kouxing>python inference-.py --checkpoint_path SaveModels/disc_checkpoint_step000003000.pth --face input/0710-0000004.mp4 --audio input/100001.wav --static False
Using cuda for inference.
Reading video frames...
Number of frames available for inference: 155
(80, 6692)
Length of mel chunks: 2005
100%|██████████| 1/1 [00:03<00:00, 3.83s/it]
Load checkpoint from: SaveModels/disc_checkpoint_step000003000.pth
0%| | 0/16 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "F:\work\kouxing\inference-.py", line 301, in <module>
    main()
  File "F:\work\kouxing\inference-.py", line 264, in main
    model = load_model(args.checkpoint_path)
  File "F:\work\kouxing\inference-.py", line 186, in load_model
    model.load_state_dict(new_s)
  File "F:\work\kouxing\venv\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wav2Lip:

Missing key(s) in state_dict:
"face_encoder_blocks.0.0.conv_block.1.weight", "face_encoder_blocks.0.0.conv_block.1.bias", "face_encoder_blocks.0.0.conv_block.1.running_mean", "face_encoder_blocks.0.0.conv_block.1.running_var", "face_encoder_blocks.0.0.act.weight",
"face_encoder_blocks.1.0.conv_block.1.weight", "face_encoder_blocks.1.0.conv_block.1.bias", "face_encoder_blocks.1.0.conv_block.1.running_mean", "face_encoder_blocks.1.0.conv_block.1.running_var", "face_encoder_blocks.1.0.act.weight",
"face_encoder_blocks.1.1.conv_block.1.weight", "face_encoder_blocks.1.1.conv_block.1.bias", "face_encoder_blocks.1.1.conv_block.1.running_mean", "face_encoder_blocks.1.1.conv_block.1.running_var", "face_encoder_blocks.1.1.act.weight",
"face_encoder_blocks.2.0.conv_block.1.weight", "face_encoder_blocks.2.0.conv_block.1.bias", "face_encoder_blocks.2.0.conv_block.1.running_mean", "face_encoder_blocks.2.0.conv_block.1.running_var", "face_encoder_blocks.2.0.act.weight",
"face_encoder_blocks.2.1.conv_block.1.weight", "face_encoder_blocks.2.1.conv_block.1.bias", "face_encoder_blocks.2.1.conv_block.1.running_mean", "face_encoder_blocks.2.1.conv_block.1.running_var", "face_encoder_blocks.2.1.act.weight",
"face_encoder_blocks.2.2.conv_block.0.weight", "face_encoder_blocks.2.2.conv_block.0.bias", "face_encoder_blocks.2.2.conv_block.1.weight", "face_encoder_blocks.2.2.conv_block.1.bias", "face_encoder_blocks.2.2.conv_block.1.running_mean", "face_encoder_blocks.2.2.conv_block.1.running_var", "face_encoder_blocks.2.2.act.weight",
"face_encoder_blocks.3.0.conv_block.1.weight", "face_encoder_blocks.3.0.conv_block.1.bias", "face_encoder_blocks.3.0.conv_block.1.running_mean", "face_encoder_blocks.3.0.conv_block.1.running_var", "face_encoder_blocks.3.0.act.weight",
"face_encoder_blocks.3.1.conv_block.1.weight", "face_encoder_blocks.3.1.conv_block.1.bias", "face_encoder_blocks.3.1.conv_block.1.running_mean", "face_encoder_blocks.3.1.conv_block.1.running_var", "face_encoder_blocks.3.1.act.weight",
"face_encoder_blocks.3.2.conv_block.0.weight", "face_encoder_blocks.3.2.conv_block.0.bias", "face_encoder_blocks.3.2.conv_block.1.weight", "face_encoder_blocks.3.2.conv_block.1.bias", "face_encoder_blocks.3.2.conv_block.1.running_mean", "face_encoder_blocks.3.2.conv_block.1.running_var", "face_encoder_blocks.3.2.act.weight",
"face_encoder_blocks.3.3.conv_block.0.weight", "face_encoder_blocks.3.3.conv_block.0.bias", "face_encoder_blocks.3.3.conv_block.1.weight", "face_encoder_blocks.3.3.conv_block.1.bias", "face_encoder_blocks.3.3.conv_block.1.running_mean", "face_encoder_blocks.3.3.conv_block.1.running_var", "face_encoder_blocks.3.3.act.weight",
"face_encoder_blocks.4.0.conv_block.1.weight", "face_encoder_blocks.4.0.conv_block.1.bias", "face_encoder_blocks.4.0.conv_block.1.running_mean", "face_encoder_blocks.4.0.conv_block.1.running_var", "face_encoder_blocks.4.0.act.weight",
"face_encoder_blocks.4.1.conv_block.1.weight", "face_encoder_blocks.4.1.conv_block.1.bias", "face_encoder_blocks.4.1.conv_block.1.running_mean", "face_encoder_blocks.4.1.conv_block.1.running_var", "face_encoder_blocks.4.1.act.weight",
"face_encoder_blocks.4.2.conv_block.0.weight", "face_encoder_blocks.4.2.conv_block.0.bias", "face_encoder_blocks.4.2.conv_block.1.weight", "face_encoder_blocks.4.2.conv_block.1.bias", "face_encoder_blocks.4.2.conv_block.1.running_mean", "face_encoder_blocks.4.2.conv_block.1.running_var", "face_encoder_blocks.4.2.act.weight",
"face_encoder_blocks.5.0.conv_block.1.weight", "face_encoder_blocks.5.0.conv_block.1.bias", "face_encoder_blocks.5.0.conv_block.1.running_mean", "face_encoder_blocks.5.0.conv_block.1.running_var", "face_encoder_blocks.5.0.act.weight",
"face_encoder_blocks.5.1.conv_block.1.weight", "face_encoder_blocks.5.1.conv_block.1.bias", "face_encoder_blocks.5.1.conv_block.1.running_mean", "face_encoder_blocks.5.1.conv_block.1.running_var", "face_encoder_blocks.5.1.act.weight",
"face_encoder_blocks.5.2.conv_block.0.weight", "face_encoder_blocks.5.2.conv_block.0.bias", "face_encoder_blocks.5.2.conv_block.1.weight", "face_encoder_blocks.5.2.conv_block.1.bias", "face_encoder_blocks.5.2.conv_block.1.running_mean", "face_encoder_blocks.5.2.conv_block.1.running_var", "face_encoder_blocks.5.2.act.weight",
"face_encoder_blocks.6.0.conv_block.1.weight", "face_encoder_blocks.6.0.conv_block.1.bias", "face_encoder_blocks.6.0.conv_block.1.running_mean", "face_encoder_blocks.6.0.conv_block.1.running_var", "face_encoder_blocks.6.0.act.weight",
"face_encoder_blocks.6.1.conv_block.1.weight", "face_encoder_blocks.6.1.conv_block.1.bias", "face_encoder_blocks.6.1.conv_block.1.running_mean", "face_encoder_blocks.6.1.conv_block.1.running_var", "face_encoder_blocks.6.1.act.weight",
"face_encoder_blocks.7.0.conv_block.1.weight", "face_encoder_blocks.7.0.conv_block.1.bias", "face_encoder_blocks.7.0.conv_block.1.running_mean", "face_encoder_blocks.7.0.conv_block.1.running_var", "face_encoder_blocks.7.0.act.weight",
"face_encoder_blocks.7.1.conv_block.1.weight", "face_encoder_blocks.7.1.conv_block.1.bias", "face_encoder_blocks.7.1.conv_block.1.running_mean", "face_encoder_blocks.7.1.conv_block.1.running_var", "face_encoder_blocks.7.1.act.weight",
"face_encoder_blocks.8.0.conv_block.1.weight", "face_encoder_blocks.8.0.conv_block.1.bias", "face_encoder_blocks.8.0.conv_block.1.running_mean", "face_encoder_blocks.8.0.conv_block.1.running_var", "face_encoder_blocks.8.0.act.weight",
"face_encoder_blocks.8.1.conv_block.1.weight", "face_encoder_blocks.8.1.conv_block.1.bias", "face_encoder_blocks.8.1.conv_block.1.running_mean", "face_encoder_blocks.8.1.conv_block.1.running_var", "face_encoder_blocks.8.1.act.weight",
"audio_encoder.0.conv_block.0.weight", "audio_encoder.0.conv_block.0.bias", "audio_encoder.0.conv_block.1.weight", "audio_encoder.0.conv_block.1.bias", "audio_encoder.0.conv_block.1.running_mean", "audio_encoder.0.conv_block.1.running_var", "audio_encoder.0.act.weight",
"audio_encoder.1.conv_block.0.weight", "audio_encoder.1.conv_block.0.bias", "audio_encoder.1.conv_block.1.weight", "audio_encoder.1.conv_block.1.bias", "audio_encoder.1.conv_block.1.running_mean", "audio_encoder.1.conv_block.1.running_var", "audio_encoder.1.act.weight",
"audio_encoder.2.conv_block.0.weight", "audio_encoder.2.conv_block.0.bias", "audio_encoder.2.conv_block.1.weight", "audio_encoder.2.conv_block.1.bias", "audio_encoder.2.conv_block.1.running_mean", "audio_encoder.2.conv_block.1.running_var", "audio_encoder.2.act.weight",
"audio_encoder.3.conv_block.0.weight", "audio_encoder.3.conv_block.0.bias", "audio_encoder.3.conv_block.1.weight", "audio_encoder.3.conv_block.1.bias", "audio_encoder.3.conv_block.1.running_mean", "audio_encoder.3.conv_block.1.running_var", "audio_encoder.3.act.weight",
"audio_encoder.4.conv_block.0.weight", "audio_encoder.4.conv_block.0.bias", "audio_encoder.4.conv_block.1.weight", "audio_encoder.4.conv_block.1.bias", "audio_encoder.4.conv_block.1.running_mean", "audio_encoder.4.conv_block.1.running_var", "audio_encoder.4.act.weight",
"audio_encoder.5.conv_block.0.weight", "audio_encoder.5.conv_block.0.bias", "audio_encoder.5.conv_block.1.weight", "audio_encoder.5.conv_block.1.bias", "audio_encoder.5.conv_block.1.running_mean", "audio_encoder.5.conv_block.1.running_var", "audio_encoder.5.act.weight",
"audio_encoder.6.conv_block.0.weight", "audio_encoder.6.conv_block.0.bias", "audio_encoder.6.conv_block.1.weight", "audio_encoder.6.conv_block.1.bias", "audio_encoder.6.conv_block.1.running_mean", "audio_encoder.6.conv_block.1.running_var", "audio_encoder.6.act.weight",
"audio_encoder.7.conv_block.0.weight", "audio_encoder.7.conv_block.0.bias", "audio_encoder.7.conv_block.1.weight", "audio_encoder.7.conv_block.1.bias", "audio_encoder.7.conv_block.1.running_mean", "audio_encoder.7.conv_block.1.running_var", "audio_encoder.7.act.weight",
"audio_encoder.8.conv_block.0.weight", "audio_encoder.8.conv_block.0.bias", "audio_encoder.8.conv_block.1.weight", "audio_encoder.8.conv_block.1.bias", "audio_encoder.8.conv_block.1.running_mean", "audio_encoder.8.conv_block.1.running_var", "audio_encoder.8.act.weight",
"audio_encoder.9.conv_block.0.weight", "audio_encoder.9.conv_block.0.bias", "audio_encoder.9.conv_block.1.weight", "audio_encoder.9.conv_block.1.bias", "audio_encoder.9.conv_block.1.running_mean", "audio_encoder.9.conv_block.1.running_var", "audio_encoder.9.act.weight",
"audio_encoder.10.conv_block.0.weight", "audio_encoder.10.conv_block.0.bias", "audio_encoder.10.conv_block.1.weight", "audio_encoder.10.conv_block.1.bias", "audio_encoder.10.conv_block.1.running_mean", "audio_encoder.10.conv_block.1.running_var", "audio_encoder.10.act.weight",
"audio_encoder.11.conv_block.0.weight", "audio_encoder.11.conv_block.0.bias", "audio_encoder.11.conv_block.1.weight", "audio_encoder.11.conv_block.1.bias", "audio_encoder.11.conv_block.1.running_mean", "audio_encoder.11.conv_block.1.running_var", "audio_encoder.11.act.weight",
"audio_encoder.12.conv_block.0.weight", "audio_encoder.12.conv_block.0.bias", "audio_encoder.12.conv_block.1.weight", "audio_encoder.12.conv_block.1.bias", "audio_encoder.12.conv_block.1.running_mean", "audio_encoder.12.conv_block.1.running_var", "audio_encoder.12.act.weight",
"face_decoder_blocks.0.0.conv_block.0.weight", "face_decoder_blocks.0.0.conv_block.0.bias", "face_decoder_blocks.0.0.conv_block.1.weight", "face_decoder_blocks.0.0.conv_block.1.bias", "face_decoder_blocks.0.0.conv_block.1.running_mean", "face_decoder_blocks.0.0.conv_block.1.running_var", "face_decoder_blocks.0.0.act.weight",
"face_decoder_blocks.1.0.conv_block.0.weight", "face_decoder_blocks.1.0.conv_block.0.bias", "face_decoder_blocks.1.0.conv_block.1.weight", "face_decoder_blocks.1.0.conv_block.1.bias", "face_decoder_blocks.1.0.conv_block.1.running_mean", "face_decoder_blocks.1.0.conv_block.1.running_var", "face_decoder_blocks.1.0.act.weight",
"face_decoder_blocks.1.1.conv_block.0.weight", "face_decoder_blocks.1.1.conv_block.0.bias", "face_decoder_blocks.1.1.conv_block.1.weight", "face_decoder_blocks.1.1.conv_block.1.bias", "face_decoder_blocks.1.1.conv_block.1.running_mean", "face_decoder_blocks.1.1.conv_block.1.running_var", "face_decoder_blocks.1.1.act.weight",
"face_decoder_blocks.2.0.conv_block.0.weight", "face_decoder_blocks.2.0.conv_block.0.bias", "face_decoder_blocks.2.0.conv_block.1.weight", "face_decoder_blocks.2.0.conv_block.1.bias", "face_decoder_blocks.2.0.conv_block.1.running_mean", "face_decoder_blocks.2.0.conv_block.1.running_var", "face_decoder_blocks.2.0.act.weight",
"face_decoder_blocks.2.1.conv_block.0.weight", "face_decoder_blocks.2.1.conv_block.0.bias", "face_decoder_blocks.2.1.conv_block.1.weight", "face_decoder_blocks.2.1.conv_block.1.bias", "face_decoder_blocks.2.1.conv_block.1.running_mean", "face_decoder_blocks.2.1.conv_block.1.running_var", "face_decoder_blocks.2.1.act.weight",
"face_decoder_blocks.2.2.conv_block.0.weight", "face_decoder_blocks.2.2.conv_block.0.bias", "face_decoder_blocks.2.2.conv_block.1.weight", "face_decoder_blocks.2.2.conv_block.1.bias", "face_decoder_blocks.2.2.conv_block.1.running_mean", "face_decoder_blocks.2.2.conv_block.1.running_var", "face_decoder_blocks.2.2.act.weight",
"face_decoder_blocks.3.0.conv_block.0.weight", "face_decoder_blocks.3.0.conv_block.0.bias", "face_decoder_blocks.3.0.conv_block.1.weight", "face_decoder_blocks.3.0.conv_block.1.bias", "face_decoder_blocks.3.0.conv_block.1.running_mean", "face_decoder_blocks.3.0.conv_block.1.running_var", "face_decoder_blocks.3.0.act.weight",
"face_decoder_blocks.3.1.conv_block.0.weight", "face_decoder_blocks.3.1.conv_block.0.bias", "face_decoder_blocks.3.1.conv_block.1.weight", "face_decoder_blocks.3.1.conv_block.1.bias", "face_decoder_blocks.3.1.conv_block.1.running_mean", "face_decoder_blocks.3.1.conv_block.1.running_var", "face_decoder_blocks.3.1.act.weight",
"face_decoder_blocks.3.2.conv_block.0.weight", "face_decoder_blocks.3.2.conv_block.0.bias", "face_decoder_blocks.3.2.conv_block.1.weight", "face_decoder_blocks.3.2.conv_block.1.bias", "face_decoder_blocks.3.2.conv_block.1.running_mean", "face_decoder_blocks.3.2.conv_block.1.running_var", "face_decoder_blocks.3.2.act.weight",
"face_decoder_blocks.4.0.conv_block.0.weight", "face_decoder_blocks.4.0.conv_block.0.bias", "face_decoder_blocks.4.0.conv_block.1.weight", "face_decoder_blocks.4.0.conv_block.1.bias", "face_decoder_blocks.4.0.conv_block.1.running_mean", "face_decoder_blocks.4.0.conv_block.1.running_var", "face_decoder_blocks.4.0.act.weight",
"face_decoder_blocks.4.1.conv_block.0.weight", "face_decoder_blocks.4.1.conv_block.0.bias", "face_decoder_blocks.4.1.conv_block.1.weight", "face_decoder_blocks.4.1.conv_block.1.bias", "face_decoder_blocks.4.1.conv_block.1.running_mean", "face_decoder_blocks.4.1.conv_block.1.running_var", "face_decoder_blocks.4.1.act.weight",
"face_decoder_blocks.4.2.conv_block.0.weight", "face_decoder_blocks.4.2.conv_block.0.bias", "face_decoder_blocks.4.2.conv_block.1.weight", "face_decoder_blocks.4.2.conv_block.1.bias", "face_decoder_blocks.4.2.conv_block.1.running_mean", "face_decoder_blocks.4.2.conv_block.1.running_var", "face_decoder_blocks.4.2.act.weight",
"face_decoder_blocks.5.0.conv_block.0.weight", "face_decoder_blocks.5.0.conv_block.0.bias", "face_decoder_blocks.5.0.conv_block.1.weight", "face_decoder_blocks.5.0.conv_block.1.bias", "face_decoder_blocks.5.0.conv_block.1.running_mean", "face_decoder_blocks.5.0.conv_block.1.running_var", "face_decoder_blocks.5.0.act.weight",
"face_decoder_blocks.5.1.conv_block.0.weight", "face_decoder_blocks.5.1.conv_block.0.bias", "face_decoder_blocks.5.1.conv_block.1.weight", "face_decoder_blocks.5.1.conv_block.1.bias", "face_decoder_blocks.5.1.conv_block.1.running_mean", "face_decoder_blocks.5.1.conv_block.1.running_var", "face_decoder_blocks.5.1.act.weight",
"face_decoder_blocks.5.2.conv_block.0.weight", "face_decoder_blocks.5.2.conv_block.0.bias", "face_decoder_blocks.5.2.conv_block.1.weight", "face_decoder_blocks.5.2.conv_block.1.bias", "face_decoder_blocks.5.2.conv_block.1.running_mean", "face_decoder_blocks.5.2.conv_block.1.running_var", "face_decoder_blocks.5.2.act.weight",
"face_decoder_blocks.6.0.conv_block.0.weight", "face_decoder_blocks.6.0.conv_block.0.bias", "face_decoder_blocks.6.0.conv_block.1.weight", "face_decoder_blocks.6.0.conv_block.1.bias", "face_decoder_blocks.6.0.conv_block.1.running_mean", "face_decoder_blocks.6.0.conv_block.1.running_var", "face_decoder_blocks.6.0.act.weight",
"face_decoder_blocks.6.1.conv_block.0.weight", "face_decoder_blocks.6.1.conv_block.0.bias", "face_decoder_blocks.6.1.conv_block.1.weight", "face_decoder_blocks.6.1.conv_block.1.bias", "face_decoder_blocks.6.1.conv_block.1.running_mean", "face_decoder_blocks.6.1.conv_block.1.running_var", "face_decoder_blocks.6.1.act.weight",
"face_decoder_blocks.6.2.conv_block.0.weight", "face_decoder_blocks.6.2.conv_block.0.bias", "face_decoder_blocks.6.2.conv_block.1.weight", "face_decoder_blocks.6.2.conv_block.1.bias", "face_decoder_blocks.6.2.conv_block.1.running_mean", "face_decoder_blocks.6.2.conv_block.1.running_var", "face_decoder_blocks.6.2.act.weight",
"face_decoder_blocks.7.0.conv_block.0.weight", "face_decoder_blocks.7.0.conv_block.0.bias", "face_decoder_blocks.7.0.conv_block.1.weight", "face_decoder_blocks.7.0.conv_block.1.bias", "face_decoder_blocks.7.0.conv_block.1.running_mean", "face_decoder_blocks.7.0.conv_block.1.running_var", "face_decoder_blocks.7.0.act.weight",
"face_decoder_blocks.7.1.conv_block.0.weight", "face_decoder_blocks.7.1.conv_block.0.bias", "face_decoder_blocks.7.1.conv_block.1.weight", "face_decoder_blocks.7.1.conv_block.1.bias", "face_decoder_blocks.7.1.conv_block.1.running_mean", "face_decoder_blocks.7.1.conv_block.1.running_var", "face_decoder_blocks.7.1.act.weight",
"face_decoder_blocks.7.2.conv_block.0.weight", "face_decoder_blocks.7.2.conv_block.0.bias", "face_decoder_blocks.7.2.conv_block.1.weight", "face_decoder_blocks.7.2.conv_block.1.bias", "face_decoder_blocks.7.2.conv_block.1.running_mean", "face_decoder_blocks.7.2.conv_block.1.running_var", "face_decoder_blocks.7.2.act.weight",
"face_decoder_blocks.8.0.conv_block.0.weight", "face_decoder_blocks.8.0.conv_block.0.bias", "face_decoder_blocks.8.0.conv_block.1.weight", "face_decoder_blocks.8.0.conv_block.1.bias", "face_decoder_blocks.8.0.conv_block.1.running_mean", "face_decoder_blocks.8.0.conv_block.1.running_var", "face_decoder_blocks.8.0.act.weight",
"face_decoder_blocks.8.1.conv_block.0.weight", "face_decoder_blocks.8.1.conv_block.0.bias", "face_decoder_blocks.8.1.conv_block.1.weight", "face_decoder_blocks.8.1.conv_block.1.bias", "face_decoder_blocks.8.1.conv_block.1.running_mean", "face_decoder_blocks.8.1.conv_block.1.running_var", "face_decoder_blocks.8.1.act.weight",
"face_decoder_blocks.8.2.conv_block.0.weight", "face_decoder_blocks.8.2.conv_block.0.bias", "face_decoder_blocks.8.2.conv_block.1.weight", "face_decoder_blocks.8.2.conv_block.1.bias", "face_decoder_blocks.8.2.conv_block.1.running_mean", "face_decoder_blocks.8.2.conv_block.1.running_var", "face_decoder_blocks.8.2.act.weight",
"output_block.0.conv_block.0.weight", "output_block.0.conv_block.0.bias", "output_block.0.conv_block.1.weight", "output_block.0.conv_block.1.bias", "output_block.0.conv_block.1.running_mean", "output_block.0.conv_block.1.running_var", "output_block.0.act.weight", "output_block.1.weight", "output_block.1.bias".

Unexpected key(s) in state_dict: "binary_pred.0.weight", "binary_pred.0.bias".

size mismatch for face_encoder_blocks.0.0.conv_block.0.weight: copying a param with shape torch.Size([32, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([16, 6, 7, 7]).
size mismatch for face_encoder_blocks.0.0.conv_block.0.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for face_encoder_blocks.1.0.conv_block.0.weight: copying a param with shape torch.Size([64, 32, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 16, 5, 5]).
size mismatch for face_encoder_blocks.1.0.conv_block.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.1.1.conv_block.0.weight: copying a param with shape torch.Size([64, 64, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.1.1.conv_block.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.2.0.conv_block.0.weight: copying a param with shape torch.Size([128, 64, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.2.0.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.2.1.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.2.1.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.3.0.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for face_encoder_blocks.3.0.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for face_encoder_blocks.3.1.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for face_encoder_blocks.3.1.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for face_encoder_blocks.4.0.conv_block.0.weight: copying a param with shape torch.Size([256, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for face_encoder_blocks.4.0.conv_block.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for face_encoder_blocks.4.1.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for face_encoder_blocks.4.1.conv_block.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for face_encoder_blocks.5.0.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
size mismatch for face_encoder_blocks.5.1.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
size mismatch for face_encoder_blocks.7.1.conv_block.0.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).

How can I solve this problem?
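
One way to narrow down an error like this is to compare the checkpoint's parameter names and shapes against the model's before calling load_state_dict. The sketch below is only an illustration in plain Python: `diff_state_dicts` is a hypothetical helper, not part of this repository or of PyTorch, and it works on name-to-shape dicts that, with PyTorch, could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. Two shapes are copied from the log above; the values marked as placeholders are assumptions, because the log does not show them. The `disc_` file name and the unexpected `binary_pred` keys may indicate that the file is a discriminator checkpoint rather than a generator one, but that is a guess from the log, not something confirmed here.

```python
def diff_state_dicts(ckpt_shapes, model_shapes):
    """Compare two {param_name: shape} mappings.

    Returns (missing, unexpected, mismatched), mirroring the three
    sections of the load_state_dict error message.
    """
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    # Keys the checkpoint provides but the model does not have.
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    # Keys present on both sides whose shapes differ.
    mismatched = {
        k: (ckpt_shapes[k], model_shapes[k])
        for k in ckpt_shapes
        if k in model_shapes and ckpt_shapes[k] != model_shapes[k]
    }
    return missing, unexpected, mismatched


# Toy input: the two conv_block.0 shapes come from the log above.
ckpt_shapes = {
    "face_encoder_blocks.0.0.conv_block.0.weight": (32, 3, 7, 7),
    "face_encoder_blocks.0.0.conv_block.0.bias": (32,),
    "binary_pred.0.weight": (1,),  # placeholder shape, not shown in the log
}
model_shapes = {
    "face_encoder_blocks.0.0.conv_block.0.weight": (16, 6, 7, 7),
    "face_encoder_blocks.0.0.conv_block.0.bias": (16,),
    "face_encoder_blocks.0.0.conv_block.1.weight": (16,),  # placeholder shape
}

missing, unexpected, mismatched = diff_state_dicts(ckpt_shapes, model_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")
```

If nearly every key shows up as missing or mismatched, as in the log above, the checkpoint and the model class are probably different architectures, and passing strict=False to load_state_dict would not help, because it only tolerates missing and unexpected keys, not shape mismatches.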

primepake / wav2lip_288x288

inference.py RuntimeError: Error(s) in loading state_dict for Wav2Lip: #61