(venv) F:\work\kouxing>python inference-.py --checkpoint_path SaveModels/disc_checkpoint_step000003000.pth --face input/0710-0000004.mp4 --audio input/100001.wav --static False
Using cuda for inference.
Reading video frames...
Number of frames available for inference: 155
(80, 6692)
Length of mel chunks: 2005
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.83s/it]
Load checkpoint from: SaveModels/disc_checkpoint_step000003000.pth
0%| | 0/16 [00:04<?, ?it/s]
Traceback (most recent call last):
File "F:\work\kouxing\inference-.py", line 301, in <module>
main()
File "F:\work\kouxing\inference-.py", line 264, in main
model = load_model(args.checkpoint_path)
File "F:\work\kouxing\inference-.py", line 186, in load_model
model.load_state_dict(new_s)
File "F:\work\kouxing\venv\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wav2Lip:
Missing key(s) in state_dict: "face_encoder_blocks.0.0.conv_block.1.weight", "face_encoder_blocks.0.0.conv_block.1.bias", "face_encoder_blocks.0
.0.conv_block.1.running_mean", "face_encoder_blocks.0.0.conv_block.1.running_var", "face_encoder_blocks.0.0.act.weight", "face_encoder_blocks.1.0.conv_b
lock.1.weight", "face_encoder_blocks.1.0.conv_block.1.bias", "face_encoder_blocks.1.0.conv_block.1.running_mean", "face_encoder_blocks.1.0.conv_block.1.
running_var", "face_encoder_blocks.1.0.act.weight", "face_encoder_blocks.1.1.conv_block.1.weight", "face_encoder_blocks.1.1.conv_block.1.bias", "face_en
coder_blocks.1.1.conv_block.1.running_mean", "face_encoder_blocks.1.1.conv_block.1.running_var", "face_encoder_blocks.1.1.act.weight", "face_encoder_blo
cks.2.0.conv_block.1.weight", "face_encoder_blocks.2.0.conv_block.1.bias", "face_encoder_blocks.2.0.conv_block.1.running_mean", "face_encoder_blocks.2.0
.conv_block.1.running_var", "face_encoder_blocks.2.0.act.weight", "face_encoder_blocks.2.1.conv_block.1.weight", "face_encoder_blocks.2.1.conv_block.1.b
ias", "face_encoder_blocks.2.1.conv_block.1.running_mean", "face_encoder_blocks.2.1.conv_block.1.running_var", "face_encoder_blocks.2.1.act.weight", "fa
ce_encoder_blocks.2.2.conv_block.0.weight", "face_encoder_blocks.2.2.conv_block.0.bias", "face_encoder_blocks.2.2.conv_block.1.weight", "face_encoder_bl
ocks.2.2.conv_block.1.bias", "face_encoder_blocks.2.2.conv_block.1.running_mean", "face_encoder_blocks.2.2.conv_block.1.running_var", "face_encoder_bloc
ks.2.2.act.weight", "face_encoder_blocks.3.0.conv_block.1.weight", "face_encoder_blocks.3.0.conv_block.1.bias", "face_encoder_blocks.3.0.conv_block.1.ru
nning_mean", "face_encoder_blocks.3.0.conv_block.1.running_var", "face_encoder_blocks.3.0.act.weight", "face_encoder_blocks.3.1.conv_block.1.weight", "f
ace_encoder_blocks.3.1.conv_block.1.bias", "face_encoder_blocks.3.1.conv_block.1.running_mean", "face_encoder_blocks.3.1.conv_block.1.running_var", "fac
e_encoder_blocks.3.1.act.weight", "face_encoder_blocks.3.2.conv_block.0.weight", "face_encoder_blocks.3.2.conv_block.0.bias", "face_encoder_blocks.3.2.c
onv_block.1.weight", "face_encoder_blocks.3.2.conv_block.1.bias", "face_encoder_blocks.3.2.conv_block.1.running_mean", "face_encoder_blocks.3.2.conv_blo
ck.1.running_var", "face_encoder_blocks.3.2.act.weight", "face_encoder_blocks.3.3.conv_block.0.weight", "face_encoder_blocks.3.3.conv_block.0.bias", "fa
ce_encoder_blocks.3.3.conv_block.1.weight", "face_encoder_blocks.3.3.conv_block.1.bias", "face_encoder_blocks.3.3.conv_block.1.running_mean", "face_enco
der_blocks.3.3.conv_block.1.running_var", "face_encoder_blocks.3.3.act.weight", "face_encoder_blocks.4.0.conv_block.1.weight", "face_encoder_blocks.4.0.
conv_block.1.bias", "face_encoder_blocks.4.0.conv_block.1.running_mean", "face_encoder_blocks.4.0.conv_block.1.running_var", "face_encoder_blocks.4.0.ac
t.weight", "face_encoder_blocks.4.1.conv_block.1.weight", "face_encoder_blocks.4.1.conv_block.1.bias", "face_encoder_blocks.4.1.conv_block.1.running_mea
n", "face_encoder_blocks.4.1.conv_block.1.running_var", "face_encoder_blocks.4.1.act.weight", "face_encoder_blocks.4.2.conv_block.0.weight", "face_encod
er_blocks.4.2.conv_block.0.bias", "face_encoder_blocks.4.2.conv_block.1.weight", "face_encoder_blocks.4.2.conv_block.1.bias", "face_encoder_blocks.4.2.c
onv_block.1.running_mean", "face_encoder_blocks.4.2.conv_block.1.running_var", "face_encoder_blocks.4.2.act.weight", "face_encoder_blocks.5.0.conv_block
.1.weight", "face_encoder_blocks.5.0.conv_block.1.bias", "face_encoder_blocks.5.0.conv_block.1.running_mean", "face_encoder_blocks.5.0.conv_block.1.runn
ing_var", "face_encoder_blocks.5.0.act.weight", "face_encoder_blocks.5.1.conv_block.1.weight", "face_encoder_blocks.5.1.conv_block.1.bias", "face_encode
r_blocks.5.1.conv_block.1.running_mean", "face_encoder_blocks.5.1.conv_block.1.running_var", "face_encoder_blocks.5.1.act.weight", "face_encoder_blocks.
5.2.conv_block.0.weight", "face_encoder_blocks.5.2.conv_block.0.bias", "face_encoder_blocks.5.2.conv_block.1.weight", "face_encoder_blocks.5.2.conv_bloc
k.1.bias", "face_encoder_blocks.5.2.conv_block.1.running_mean", "face_encoder_blocks.5.2.conv_block.1.running_var", "face_encoder_blocks.5.2.act.weight"
, "face_encoder_blocks.6.0.conv_block.1.weight", "face_encoder_blocks.6.0.conv_block.1.bias", "face_encoder_blocks.6.0.conv_block.1.running_mean", "face
_encoder_blocks.6.0.conv_block.1.running_var", "face_encoder_blocks.6.0.act.weight", "face_encoder_blocks.6.1.conv_block.1.weight", "face_encoder_blocks
.6.1.conv_block.1.bias", "face_encoder_blocks.6.1.conv_block.1.running_mean", "face_encoder_blocks.6.1.conv_block.1.running_var", "face_encoder_blocks.6
.1.act.weight", "face_encoder_blocks.7.0.conv_block.1.weight", "face_encoder_blocks.7.0.conv_block.1.bias", "face_encoder_blocks.7.0.conv_block.1.runnin
g_mean", "face_encoder_blocks.7.0.conv_block.1.running_var", "face_encoder_blocks.7.0.act.weight", "face_encoder_blocks.7.1.conv_block.1.weight", "face_
encoder_blocks.7.1.conv_block.1.bias", "face_encoder_blocks.7.1.conv_block.1.running_mean", "face_encoder_blocks.7.1.conv_block.1.running_var", "face_en
coder_blocks.7.1.act.weight", "face_encoder_blocks.8.0.conv_block.1.weight", "face_encoder_blocks.8.0.conv_block.1.bias", "face_encoder_blocks.8.0.conv_
block.1.running_mean", "face_encoder_blocks.8.0.conv_block.1.running_var", "face_encoder_blocks.8.0.act.weight", "face_encoder_blocks.8.1.conv_block.1.w
eight", "face_encoder_blocks.8.1.conv_block.1.bias", "face_encoder_blocks.8.1.conv_block.1.running_mean", "face_encoder_blocks.8.1.conv_block.1.running_
var", "face_encoder_blocks.8.1.act.weight", "audio_encoder.0.conv_block.0.weight", "audio_encoder.0.conv_block.0.bias", "audio_encoder.0.conv_block.1.we
ight", "audio_encoder.0.conv_block.1.bias", "audio_encoder.0.conv_block.1.running_mean", "audio_encoder.0.conv_block.1.running_var", "audio_encoder.0.ac
t.weight", "audio_encoder.1.conv_block.0.weight", "audio_encoder.1.conv_block.0.bias", "audio_encoder.1.conv_block.1.weight", "audio_encoder.1.conv_bloc
k.1.bias", "audio_encoder.1.conv_block.1.running_mean", "audio_encoder.1.conv_block.1.running_var", "audio_encoder.1.act.weight", "audio_encoder.2.conv_
block.0.weight", "audio_encoder.2.conv_block.0.bias", "audio_encoder.2.conv_block.1.weight", "audio_encoder.2.conv_block.1.bias", "audio_encoder.2.conv_
block.1.running_mean", "audio_encoder.2.conv_block.1.running_var", "audio_encoder.2.act.weight", "audio_encoder.3.conv_block.0.weight", "audio_encoder.3
.conv_block.0.bias", "audio_encoder.3.conv_block.1.weight", "audio_encoder.3.conv_block.1.bias", "audio_encoder.3.conv_block.1.running_mean", "audio_enc
oder.3.conv_block.1.running_var", "audio_encoder.3.act.weight", "audio_encoder.4.conv_block.0.weight", "audio_encoder.4.conv_block.0.bias", "audio_encod
er.4.conv_block.1.weight", "audio_encoder.4.conv_block.1.bias", "audio_encoder.4.conv_block.1.running_mean", "audio_encoder.4.conv_block.1.running_var",
"audio_encoder.4.act.weight", "audio_encoder.5.conv_block.0.weight", "audio_encoder.5.conv_block.0.bias", "audio_encoder.5.conv_block.1.weight", "audio
_encoder.5.conv_block.1.bias", "audio_encoder.5.conv_block.1.running_mean", "audio_encoder.5.conv_block.1.running_var", "audio_encoder.5.act.weight", "a
udio_encoder.6.conv_block.0.weight", "audio_encoder.6.conv_block.0.bias", "audio_encoder.6.conv_block.1.weight", "audio_encoder.6.conv_block.1.bias", "a
udio_encoder.6.conv_block.1.running_mean", "audio_encoder.6.conv_block.1.running_var", "audio_encoder.6.act.weight", "audio_encoder.7.conv_block.0.weigh
t", "audio_encoder.7.conv_block.0.bias", "audio_encoder.7.conv_block.1.weight", "audio_encoder.7.conv_block.1.bias", "audio_encoder.7.conv_block.1.runni
ng_mean", "audio_encoder.7.conv_block.1.running_var", "audio_encoder.7.act.weight", "audio_encoder.8.conv_block.0.weight", "audio_encoder.8.conv_block.0
.bias", "audio_encoder.8.conv_block.1.weight", "audio_encoder.8.conv_block.1.bias", "audio_encoder.8.conv_block.1.running_mean", "audio_encoder.8.conv_b
lock.1.running_var", "audio_encoder.8.act.weight", "audio_encoder.9.conv_block.0.weight", "audio_encoder.9.conv_block.0.bias", "audio_encoder.9.conv_blo
ck.1.weight", "audio_encoder.9.conv_block.1.bias", "audio_encoder.9.conv_block.1.running_mean", "audio_encoder.9.conv_block.1.running_var", "audio_encod
er.9.act.weight", "audio_encoder.10.conv_block.0.weight", "audio_encoder.10.conv_block.0.bias", "audio_encoder.10.conv_block.1.weight", "audio_encoder.1
0.conv_block.1.bias", "audio_encoder.10.conv_block.1.running_mean", "audio_encoder.10.conv_block.1.running_var", "audio_encoder.10.act.weight", "audio_e
ncoder.11.conv_block.0.weight", "audio_encoder.11.conv_block.0.bias", "audio_encoder.11.conv_block.1.weight", "audio_encoder.11.conv_block.1.bias", "aud
io_encoder.11.conv_block.1.running_mean", "audio_encoder.11.conv_block.1.running_var", "audio_encoder.11.act.weight", "audio_encoder.12.conv_block.0.wei
ght", "audio_encoder.12.conv_block.0.bias", "audio_encoder.12.conv_block.1.weight", "audio_encoder.12.conv_block.1.bias", "audio_encoder.12.conv_block.1
.running_mean", "audio_encoder.12.conv_block.1.running_var", "audio_encoder.12.act.weight", "face_decoder_blocks.0.0.conv_block.0.weight", "face_decoder
_blocks.0.0.conv_block.0.bias", "face_decoder_blocks.0.0.conv_block.1.weight", "face_decoder_blocks.0.0.conv_block.1.bias", "face_decoder_blocks.0.0.con
v_block.1.running_mean", "face_decoder_blocks.0.0.conv_block.1.running_var", "face_decoder_blocks.0.0.act.weight", "face_decoder_blocks.1.0.conv_block.0
.weight", "face_decoder_blocks.1.0.conv_block.0.bias", "face_decoder_blocks.1.0.conv_block.1.weight", "face_decoder_blocks.1.0.conv_block.1.bias", "face
_decoder_blocks.1.0.conv_block.1.running_mean", "face_decoder_blocks.1.0.conv_block.1.running_var", "face_decoder_blocks.1.0.act.weight", "face_decoder_
blocks.1.1.conv_block.0.weight", "face_decoder_blocks.1.1.conv_block.0.bias", "face_decoder_blocks.1.1.conv_block.1.weight", "face_decoder_blocks.1.1.co
nv_block.1.bias", "face_decoder_blocks.1.1.conv_block.1.running_mean", "face_decoder_blocks.1.1.conv_block.1.running_var", "face_decoder_blocks.1.1.act.
weight", "face_decoder_blocks.2.0.conv_block.0.weight", "face_decoder_blocks.2.0.conv_block.0.bias", "face_decoder_blocks.2.0.conv_block.1.weight", "fac
e_decoder_blocks.2.0.conv_block.1.bias", "face_decoder_blocks.2.0.conv_block.1.running_mean", "face_decoder_blocks.2.0.conv_block.1.running_var", "face_
decoder_blocks.2.0.act.weight", "face_decoder_blocks.2.1.conv_block.0.weight", "face_decoder_blocks.2.1.conv_block.0.bias", "face_decoder_blocks.2.1.con
v_block.1.weight", "face_decoder_blocks.2.1.conv_block.1.bias", "face_decoder_blocks.2.1.conv_block.1.running_mean", "face_decoder_blocks.2.1.conv_block
.1.running_var", "face_decoder_blocks.2.1.act.weight", "face_decoder_blocks.2.2.conv_block.0.weight", "face_decoder_blocks.2.2.conv_block.0.bias", "face
_decoder_blocks.2.2.conv_block.1.weight", "face_decoder_blocks.2.2.conv_block.1.bias", "face_decoder_blocks.2.2.conv_block.1.running_mean", "face_decode
r_blocks.2.2.conv_block.1.running_var", "face_decoder_blocks.2.2.act.weight", "face_decoder_blocks.3.0.conv_block.0.weight", "face_decoder_blocks.3.0.co
nv_block.0.bias", "face_decoder_blocks.3.0.conv_block.1.weight", "face_decoder_blocks.3.0.conv_block.1.bias", "face_decoder_blocks.3.0.conv_block.1.runn
ing_mean", "face_decoder_blocks.3.0.conv_block.1.running_var", "face_decoder_blocks.3.0.act.weight", "face_decoder_blocks.3.1.conv_block.0.weight", "fac
e_decoder_blocks.3.1.conv_block.0.bias", "face_decoder_blocks.3.1.conv_block.1.weight", "face_decoder_blocks.3.1.conv_block.1.bias", "face_decoder_block
s.3.1.conv_block.1.running_mean", "face_decoder_blocks.3.1.conv_block.1.running_var", "face_decoder_blocks.3.1.act.weight", "face_decoder_blocks.3.2.con
v_block.0.weight", "face_decoder_blocks.3.2.conv_block.0.bias", "face_decoder_blocks.3.2.conv_block.1.weight", "face_decoder_blocks.3.2.conv_block.1.bia
s", "face_decoder_blocks.3.2.conv_block.1.running_mean", "face_decoder_blocks.3.2.conv_block.1.running_var", "face_decoder_blocks.3.2.act.weight", "face
_decoder_blocks.4.0.conv_block.0.weight", "face_decoder_blocks.4.0.conv_block.0.bias", "face_decoder_blocks.4.0.conv_block.1.weight", "face_decoder_bloc
ks.4.0.conv_block.1.bias", "face_decoder_blocks.4.0.conv_block.1.running_mean", "face_decoder_blocks.4.0.conv_block.1.running_var", "face_decoder_blocks
.4.0.act.weight", "face_decoder_blocks.4.1.conv_block.0.weight", "face_decoder_blocks.4.1.conv_block.0.bias", "face_decoder_blocks.4.1.conv_block.1.weig
ht", "face_decoder_blocks.4.1.conv_block.1.bias", "face_decoder_blocks.4.1.conv_block.1.running_mean", "face_decoder_blocks.4.1.conv_block.1.running_var
", "face_decoder_blocks.4.1.act.weight", "face_decoder_blocks.4.2.conv_block.0.weight", "face_decoder_blocks.4.2.conv_block.0.bias", "face_decoder_block
s.4.2.conv_block.1.weight", "face_decoder_blocks.4.2.conv_block.1.bias", "face_decoder_blocks.4.2.conv_block.1.running_mean", "face_decoder_blocks.4.2.c
onv_block.1.running_var", "face_decoder_blocks.4.2.act.weight", "face_decoder_blocks.5.0.conv_block.0.weight", "face_decoder_blocks.5.0.conv_block.0.bia
s", "face_decoder_blocks.5.0.conv_block.1.weight", "face_decoder_blocks.5.0.conv_block.1.bias", "face_decoder_blocks.5.0.conv_block.1.running_mean", "fa
ce_decoder_blocks.5.0.conv_block.1.running_var", "face_decoder_blocks.5.0.act.weight", "face_decoder_blocks.5.1.conv_block.0.weight", "face_decoder_bloc
ks.5.1.conv_block.0.bias", "face_decoder_blocks.5.1.conv_block.1.weight", "face_decoder_blocks.5.1.conv_block.1.bias", "face_decoder_blocks.5.1.conv_blo
ck.1.running_mean", "face_decoder_blocks.5.1.conv_block.1.running_var", "face_decoder_blocks.5.1.act.weight", "face_decoder_blocks.5.2.conv_block.0.weig
ht", "face_decoder_blocks.5.2.conv_block.0.bias", "face_decoder_blocks.5.2.conv_block.1.weight", "face_decoder_blocks.5.2.conv_block.1.bias", "face_deco
der_blocks.5.2.conv_block.1.running_mean", "face_decoder_blocks.5.2.conv_block.1.running_var", "face_decoder_blocks.5.2.act.weight", "face_decoder_block
s.6.0.conv_block.0.weight", "face_decoder_blocks.6.0.conv_block.0.bias", "face_decoder_blocks.6.0.conv_block.1.weight", "face_decoder_blocks.6.0.conv_bl
ock.1.bias", "face_decoder_blocks.6.0.conv_block.1.running_mean", "face_decoder_blocks.6.0.conv_block.1.running_var", "face_decoder_blocks.6.0.act.weigh
t", "face_decoder_blocks.6.1.conv_block.0.weight", "face_decoder_blocks.6.1.conv_block.0.bias", "face_decoder_blocks.6.1.conv_block.1.weight", "face_dec
oder_blocks.6.1.conv_block.1.bias", "face_decoder_blocks.6.1.conv_block.1.running_mean", "face_decoder_blocks.6.1.conv_block.1.running_var", "face_decod
er_blocks.6.1.act.weight", "face_decoder_blocks.6.2.conv_block.0.weight", "face_decoder_blocks.6.2.conv_block.0.bias", "face_decoder_blocks.6.2.conv_blo
ck.1.weight", "face_decoder_blocks.6.2.conv_block.1.bias", "face_decoder_blocks.6.2.conv_block.1.running_mean", "face_decoder_blocks.6.2.conv_block.1.ru
nning_var", "face_decoder_blocks.6.2.act.weight", "face_decoder_blocks.7.0.conv_block.0.weight", "face_decoder_blocks.7.0.conv_block.0.bias", "face_deco
der_blocks.7.0.conv_block.1.weight", "face_decoder_blocks.7.0.conv_block.1.bias", "face_decoder_blocks.7.0.conv_block.1.running_mean", "face_decoder_blo
cks.7.0.conv_block.1.running_var", "face_decoder_blocks.7.0.act.weight", "face_decoder_blocks.7.1.conv_block.0.weight", "face_decoder_blocks.7.1.conv_bl
ock.0.bias", "face_decoder_blocks.7.1.conv_block.1.weight", "face_decoder_blocks.7.1.conv_block.1.bias", "face_decoder_blocks.7.1.conv_block.1.running_m
ean", "face_decoder_blocks.7.1.conv_block.1.running_var", "face_decoder_blocks.7.1.act.weight", "face_decoder_blocks.7.2.conv_block.0.weight", "face_dec
oder_blocks.7.2.conv_block.0.bias", "face_decoder_blocks.7.2.conv_block.1.weight", "face_decoder_blocks.7.2.conv_block.1.bias", "face_decoder_blocks.7.2
.conv_block.1.running_mean", "face_decoder_blocks.7.2.conv_block.1.running_var", "face_decoder_blocks.7.2.act.weight", "face_decoder_blocks.8.0.conv_blo
ck.0.weight", "face_decoder_blocks.8.0.conv_block.0.bias", "face_decoder_blocks.8.0.conv_block.1.weight", "face_decoder_blocks.8.0.conv_block.1.bias", "
face_decoder_blocks.8.0.conv_block.1.running_mean", "face_decoder_blocks.8.0.conv_block.1.running_var", "face_decoder_blocks.8.0.act.weight", "face_deco
der_blocks.8.1.conv_block.0.weight", "face_decoder_blocks.8.1.conv_block.0.bias", "face_decoder_blocks.8.1.conv_block.1.weight", "face_decoder_blocks.8.
1.conv_block.1.bias", "face_decoder_blocks.8.1.conv_block.1.running_mean", "face_decoder_blocks.8.1.conv_block.1.running_var", "face_decoder_blocks.8.1.
act.weight", "face_decoder_blocks.8.2.conv_block.0.weight", "face_decoder_blocks.8.2.conv_block.0.bias", "face_decoder_blocks.8.2.conv_block.1.weight",
"face_decoder_blocks.8.2.conv_block.1.bias", "face_decoder_blocks.8.2.conv_block.1.running_mean", "face_decoder_blocks.8.2.conv_block.1.running_var", "f
ace_decoder_blocks.8.2.act.weight", "output_block.0.conv_block.0.weight", "output_block.0.conv_block.0.bias", "output_block.0.conv_block.1.weight", "out
put_block.0.conv_block.1.bias", "output_block.0.conv_block.1.running_mean", "output_block.0.conv_block.1.running_var", "output_block.0.act.weight", "output_block.1.weight", "output_block.1.bias".
Unexpected key(s) in state_dict: "binary_pred.0.weight", "binary_pred.0.bias".
size mismatch for face_encoder_blocks.0.0.conv_block.0.weight: copying a param with shape torch.Size([32, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([16, 6, 7, 7]).
size mismatch for face_encoder_blocks.0.0.conv_block.0.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for face_encoder_blocks.1.0.conv_block.0.weight: copying a param with shape torch.Size([64, 32, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 16, 5, 5]).
size mismatch for face_encoder_blocks.1.0.conv_block.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.1.1.conv_block.0.weight: copying a param with shape torch.Size([64, 64, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.1.1.conv_block.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.2.0.conv_block.0.weight: copying a param with shape torch.Size([128, 64, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.2.0.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.2.1.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.2.1.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.3.0.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for face_encoder_blocks.3.0.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for face_encoder_blocks.3.1.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for face_encoder_blocks.3.1.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for face_encoder_blocks.4.0.conv_block.0.weight: copying a param with shape torch.Size([256, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for face_encoder_blocks.4.0.conv_block.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for face_encoder_blocks.4.1.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for face_encoder_blocks.4.1.conv_block.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for face_encoder_blocks.5.0.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
size mismatch for face_encoder_blocks.5.1.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
size mismatch for face_encoder_blocks.7.1.conv_block.0.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).
How can I solve this problem?
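One way to narrow this down, as a minimal sketch rather than a confirmed fix: the `disc_` prefix in the checkpoint filename, the unexpected `binary_pred.*` keys, and the 3-channel first convolution in the checkpoint (the model expects 6 channels) hint that this file may hold a discriminator's weights rather than the Wav2Lip generator's. That is an inference from the log, not something the log states. The helper below, with the hypothetical name `diff_state_dicts`, reproduces the three groups that `load_state_dict` reports from plain `{name: shape}` mappings. The example shapes are copied from the error message above; `None` marks shapes the log does not show.

```python
def diff_state_dicts(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns three sorted lists: keys the model needs but the checkpoint
    lacks, keys the checkpoint has but the model does not, and shared keys
    whose shapes differ.
    """
    ckpt_keys = set(checkpoint_shapes)
    model_keys = set(model_shapes)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    mismatched = sorted(
        key for key in ckpt_keys & model_keys
        if checkpoint_shapes[key] != model_shapes[key]
    )
    return missing, unexpected, mismatched


# A few entries taken from the error message; None = shape not shown in the log.
checkpoint_shapes = {
    "face_encoder_blocks.0.0.conv_block.0.weight": (32, 3, 7, 7),
    "face_encoder_blocks.0.0.conv_block.0.bias": (32,),
    "binary_pred.0.weight": None,
    "binary_pred.0.bias": None,
}
model_shapes = {
    "face_encoder_blocks.0.0.conv_block.0.weight": (16, 6, 7, 7),
    "face_encoder_blocks.0.0.conv_block.0.bias": (16,),
    "face_encoder_blocks.0.0.conv_block.1.weight": None,
    "output_block.1.weight": None,
}

missing, unexpected, mismatched = diff_state_dicts(checkpoint_shapes, model_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

With PyTorch, the two mappings could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, once from the loaded checkpoint and once from `model.state_dict()`. Where the weights sit inside the checkpoint file (for example under a `state_dict` entry) is an assumption to check against `load_model` in `inference-.py`. If nearly every key lands in one of the three lists, as it does in the log, the checkpoint and the model are probably different architectures, and pointing `--checkpoint_path` at a generator checkpoint would be the next thing to try.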
(venv) F:\work\kouxing>python inference-.py --checkpoint_path SaveModels/disc_checkpoint_step000003000.pth --face input/0710-0000004.mp4 --audio input/100001.wav --static False Using cuda for inference. Reading video frames... Number of frames available for inference: 155 (80, 6692) Length of mel chunks: 2005 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.83s/it] Load checkpoint from: SaveModels/disc_checkpoint_step000003000.pth███████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.83s/it] 0%| | 0/16 [00:04<?, ?it/s] Traceback (most recent call last): File "F:\work\kouxing\inference-.py", line 301, in
main()
File "F:\work\kouxing\inference-.py", line 264, in main
model = load_model(args.checkpoint_path)
File "F:\work\kouxing\inference-.py", line 186, in load_model
model.load_state_dict(new_s)
File "F:\work\kouxing\venv\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wav2Lip:
Missing key(s) in state_dict: "face_encoder_blocks.0.0.conv_block.1.weight", "face_encoder_blocks.0.0.conv_block.1.bias", "face_encoder_blocks.0
.0.conv_block.1.running_mean", "face_encoder_blocks.0.0.conv_block.1.running_var", "face_encoder_blocks.0.0.act.weight", "face_encoder_blocks.1.0.conv_b
lock.1.weight", "face_encoder_blocks.1.0.conv_block.1.bias", "face_encoder_blocks.1.0.conv_block.1.running_mean", "face_encoder_blocks.1.0.conv_block.1.
running_var", "face_encoder_blocks.1.0.act.weight", "face_encoder_blocks.1.1.conv_block.1.weight", "face_encoder_blocks.1.1.conv_block.1.bias", "face_en
coder_blocks.1.1.conv_block.1.running_mean", "face_encoder_blocks.1.1.conv_block.1.running_var", "face_encoder_blocks.1.1.act.weight", "face_encoder_blo
cks.2.0.conv_block.1.weight", "face_encoder_blocks.2.0.conv_block.1.bias", "face_encoder_blocks.2.0.conv_block.1.running_mean", "face_encoder_blocks.2.0
.conv_block.1.running_var", "face_encoder_blocks.2.0.act.weight", "face_encoder_blocks.2.1.conv_block.1.weight", "face_encoder_blocks.2.1.conv_block.1.b
ias", "face_encoder_blocks.2.1.conv_block.1.running_mean", "face_encoder_blocks.2.1.conv_block.1.running_var", "face_encoder_blocks.2.1.act.weight", "fa
ce_encoder_blocks.2.2.conv_block.0.weight", "face_encoder_blocks.2.2.conv_block.0.bias", "face_encoder_blocks.2.2.conv_block.1.weight", "face_encoder_bl
ocks.2.2.conv_block.1.bias", "face_encoder_blocks.2.2.conv_block.1.running_mean", "face_encoder_blocks.2.2.conv_block.1.running_var", "face_encoder_bloc
ks.2.2.act.weight", "face_encoder_blocks.3.0.conv_block.1.weight", "face_encoder_blocks.3.0.conv_block.1.bias", "face_encoder_blocks.3.0.conv_block.1.ru
nning_mean", "face_encoder_blocks.3.0.conv_block.1.running_var", "face_encoder_blocks.3.0.act.weight", "face_encoder_blocks.3.1.conv_block.1.weight", "f
ace_encoder_blocks.3.1.conv_block.1.bias", "face_encoder_blocks.3.1.conv_block.1.running_mean", "face_encoder_blocks.3.1.conv_block.1.running_var", "fac
e_encoder_blocks.3.1.act.weight", "face_encoder_blocks.3.2.conv_block.0.weight", "face_encoder_blocks.3.2.conv_block.0.bias", "face_encoder_blocks.3.2.c
onv_block.1.weight", "face_encoder_blocks.3.2.conv_block.1.bias", "face_encoder_blocks.3.2.conv_block.1.running_mean", "face_encoder_blocks.3.2.conv_blo
ck.1.running_var", "face_encoder_blocks.3.2.act.weight", "face_encoder_blocks.3.3.conv_block.0.weight", "face_encoder_blocks.3.3.conv_block.0.bias", "fa
ce_encoder_blocks.3.3.conv_block.1.weight", "face_encoder_blocks.3.3.conv_block.1.bias", "face_encoder_blocks.3.3.conv_block.1.running_mean", "face_enco
der_blocks.3.3.conv_block.1.running_var", "face_encoder_blocks.3.3.act.weight", "face_encoder_blocks.4.0.conv_block.1.weight", "face_encoder_blocks.4.0.
conv_block.1.bias", "face_encoder_blocks.4.0.conv_block.1.running_mean", "face_encoder_blocks.4.0.conv_block.1.running_var", "face_encoder_blocks.4.0.ac
t.weight", "face_encoder_blocks.4.1.conv_block.1.weight", "face_encoder_blocks.4.1.conv_block.1.bias", "face_encoder_blocks.4.1.conv_block.1.running_mea
n", "face_encoder_blocks.4.1.conv_block.1.running_var", "face_encoder_blocks.4.1.act.weight", "face_encoder_blocks.4.2.conv_block.0.weight", "face_encod
er_blocks.4.2.conv_block.0.bias", "face_encoder_blocks.4.2.conv_block.1.weight", "face_encoder_blocks.4.2.conv_block.1.bias", "face_encoder_blocks.4.2.c
onv_block.1.running_mean", "face_encoder_blocks.4.2.conv_block.1.running_var", "face_encoder_blocks.4.2.act.weight", "face_encoder_blocks.5.0.conv_block
.1.weight", "face_encoder_blocks.5.0.conv_block.1.bias", "face_encoder_blocks.5.0.conv_block.1.running_mean", "face_encoder_blocks.5.0.conv_block.1.runn
ing_var", "face_encoder_blocks.5.0.act.weight", "face_encoder_blocks.5.1.conv_block.1.weight", "face_encoder_blocks.5.1.conv_block.1.bias", "face_encode
r_blocks.5.1.conv_block.1.running_mean", "face_encoder_blocks.5.1.conv_block.1.running_var", "face_encoder_blocks.5.1.act.weight", "face_encoder_blocks.
5.2.conv_block.0.weight", "face_encoder_blocks.5.2.conv_block.0.bias", "face_encoder_blocks.5.2.conv_block.1.weight", "face_encoder_blocks.5.2.conv_bloc
k.1.bias", "face_encoder_blocks.5.2.conv_block.1.running_mean", "face_encoder_blocks.5.2.conv_block.1.running_var", "face_encoder_blocks.5.2.act.weight"
, "face_encoder_blocks.6.0.conv_block.1.weight", "face_encoder_blocks.6.0.conv_block.1.bias", "face_encoder_blocks.6.0.conv_block.1.running_mean", "face
_encoder_blocks.6.0.conv_block.1.running_var", "face_encoder_blocks.6.0.act.weight", "face_encoder_blocks.6.1.conv_block.1.weight", "face_encoder_blocks
.6.1.conv_block.1.bias", "face_encoder_blocks.6.1.conv_block.1.running_mean", "face_encoder_blocks.6.1.conv_block.1.running_var", "face_encoder_blocks.6
.1.act.weight", "face_encoder_blocks.7.0.conv_block.1.weight", "face_encoder_blocks.7.0.conv_block.1.bias", "face_encoder_blocks.7.0.conv_block.1.runnin
g_mean", "face_encoder_blocks.7.0.conv_block.1.running_var", "face_encoder_blocks.7.0.act.weight", "face_encoder_blocks.7.1.convblock.1.weight", "face
encoder_blocks.7.1.conv_block.1.bias", "face_encoder_blocks.7.1.conv_block.1.running_mean", "face_encoder_blocks.7.1.conv_block.1.running_var", "face_en
coder_blocks.7.1.act.weight", "face_encoder_blocks.8.0.conv_block.1.weight", "face_encoder_blocks.8.0.conv_block.1.bias", "face_encoderblocks.8.0.conv
block.1.running_mean", "face_encoder_blocks.8.0.conv_block.1.running_var", "face_encoder_blocks.8.0.act.weight", "face_encoder_blocks.8.1.conv_block.1.w
eight", "face_encoder_blocks.8.1.conv_block.1.bias", "face_encoder_blocks.8.1.conv_block.1.running_mean", "face_encoder_blocks.8.1.convblock.1.running
var", "face_encoder_blocks.8.1.act.weight", "audio_encoder.0.conv_block.0.weight", "audio_encoder.0.conv_block.0.bias", "audio_encoder.0.conv_block.1.we
ight", "audio_encoder.0.conv_block.1.bias", "audio_encoder.0.conv_block.1.running_mean", "audio_encoder.0.conv_block.1.running_var", "audio_encoder.0.ac
t.weight", "audio_encoder.1.conv_block.0.weight", "audio_encoder.1.conv_block.0.bias", "audio_encoder.1.conv_block.1.weight", "audio_encoder.1.conv_bloc
k.1.bias", "audio_encoder.1.conv_block.1.running_mean", "audio_encoder.1.conv_block.1.running_var", "audio_encoder.1.act.weight", "audioencoder.2.conv
block.0.weight", "audio_encoder.2.conv_block.0.bias", "audio_encoder.2.conv_block.1.weight", "audio_encoder.2.conv_block.1.bias", "audioencoder.2.conv
block.1.running_mean", "audio_encoder.2.conv_block.1.running_var", "audio_encoder.2.act.weight", "audio_encoder.3.conv_block.0.weight", "audio_encoder.3
.conv_block.0.bias", "audio_encoder.3.conv_block.1.weight", "audio_encoder.3.conv_block.1.bias", "audio_encoder.3.conv_block.1.running_mean", "audio_enc
oder.3.conv_block.1.running_var", "audio_encoder.3.act.weight", "audio_encoder.4.conv_block.0.weight", "audio_encoder.4.conv_block.0.bias", "audio_encod
er.4.conv_block.1.weight", "audio_encoder.4.conv_block.1.bias", "audio_encoder.4.conv_block.1.running_mean", "audio_encoder.4.conv_block.1.running_var",
"audio_encoder.4.act.weight", "audio_encoder.5.conv_block.0.weight", "audio_encoder.5.conv_block.0.bias", "audio_encoder.5.conv_block.1.weight", "audio
_encoder.5.conv_block.1.bias", "audio_encoder.5.conv_block.1.running_mean", "audio_encoder.5.conv_block.1.running_var", "audio_encoder.5.act.weight", "a
udio_encoder.6.conv_block.0.weight", "audio_encoder.6.conv_block.0.bias", "audio_encoder.6.conv_block.1.weight", "audio_encoder.6.conv_block.1.bias", "a
udio_encoder.6.conv_block.1.running_mean", "audio_encoder.6.conv_block.1.running_var", "audio_encoder.6.act.weight", "audio_encoder.7.conv_block.0.weigh
t", "audio_encoder.7.conv_block.0.bias", "audio_encoder.7.conv_block.1.weight", "audio_encoder.7.conv_block.1.bias", "audio_encoder.7.conv_block.1.runni
ng_mean", "audio_encoder.7.conv_block.1.running_var", "audio_encoder.7.act.weight", "audio_encoder.8.conv_block.0.weight", "audio_encoder.8.conv_block.0
.bias", "audio_encoder.8.conv_block.1.weight", "audio_encoder.8.conv_block.1.bias", "audio_encoder.8.conv_block.1.running_mean", "audio_encoder.8.conv_b
lock.1.running_var", "audio_encoder.8.act.weight", "audio_encoder.9.conv_block.0.weight", "audio_encoder.9.conv_block.0.bias", "audio_encoder.9.conv_blo
ck.1.weight", "audio_encoder.9.conv_block.1.bias", "audio_encoder.9.conv_block.1.running_mean", "audio_encoder.9.conv_block.1.running_var", "audio_encod
er.9.act.weight", "audio_encoder.10.conv_block.0.weight", "audio_encoder.10.conv_block.0.bias", "audio_encoder.10.conv_block.1.weight", "audio_encoder.1
0.conv_block.1.bias", "audio_encoder.10.conv_block.1.running_mean", "audio_encoder.10.conv_block.1.running_var", "audio_encoder.10.act.weight", "audio_e
ncoder.11.conv_block.0.weight", "audio_encoder.11.conv_block.0.bias", "audio_encoder.11.conv_block.1.weight", "audio_encoder.11.conv_block.1.bias", "aud
io_encoder.11.conv_block.1.running_mean", "audio_encoder.11.conv_block.1.running_var", "audio_encoder.11.act.weight", "audio_encoder.12.conv_block.0.wei
ght", "audio_encoder.12.conv_block.0.bias", "audio_encoder.12.conv_block.1.weight", "audio_encoder.12.conv_block.1.bias", "audio_encoder.12.conv_block.1
.running_mean", "audio_encoder.12.conv_block.1.running_var", "audio_encoder.12.act.weight", "face_decoder_blocks.0.0.conv_block.0.weight", "face_decoder
_blocks.0.0.conv_block.0.bias", "face_decoder_blocks.0.0.conv_block.1.weight", "face_decoder_blocks.0.0.conv_block.1.bias", "face_decoder_blocks.0.0.con
v_block.1.running_mean", "face_decoder_blocks.0.0.conv_block.1.running_var", "face_decoder_blocks.0.0.act.weight", "face_decoder_blocks.1.0.conv_block.0
.weight", "face_decoder_blocks.1.0.conv_block.0.bias", "face_decoder_blocks.1.0.conv_block.1.weight", "face_decoder_blocks.1.0.conv_block.1.bias", "face
_decoder_blocks.1.0.conv_block.1.running_mean", "face_decoder_blocks.1.0.conv_block.1.running_var", "face_decoder_blocks.1.0.act.weight", "face_decoder_
blocks.1.1.conv_block.0.weight", "face_decoder_blocks.1.1.conv_block.0.bias", "face_decoder_blocks.1.1.conv_block.1.weight", "face_decoder_blocks.1.1.co
nv_block.1.bias", "face_decoder_blocks.1.1.conv_block.1.running_mean", "face_decoder_blocks.1.1.conv_block.1.running_var", "face_decoder_blocks.1.1.act.
weight", "face_decoder_blocks.2.0.conv_block.0.weight", "face_decoder_blocks.2.0.conv_block.0.bias", "face_decoder_blocks.2.0.conv_block.1.weight", "fac
e_decoder_blocks.2.0.conv_block.1.bias", "face_decoder_blocks.2.0.conv_block.1.running_mean", "face_decoder_blocks.2.0.conv_block.1.running_var", "face_
decoder_blocks.2.0.act.weight", "face_decoder_blocks.2.1.conv_block.0.weight", "face_decoder_blocks.2.1.conv_block.0.bias", "face_decoder_blocks.2.1.con
v_block.1.weight", "face_decoder_blocks.2.1.conv_block.1.bias", "face_decoder_blocks.2.1.conv_block.1.running_mean", "face_decoder_blocks.2.1.conv_block
.1.running_var", "face_decoder_blocks.2.1.act.weight", "face_decoder_blocks.2.2.conv_block.0.weight", "face_decoder_blocks.2.2.conv_block.0.bias", "face
_decoder_blocks.2.2.conv_block.1.weight", "face_decoder_blocks.2.2.conv_block.1.bias", "face_decoder_blocks.2.2.conv_block.1.running_mean", "face_decode
r_blocks.2.2.conv_block.1.running_var", "face_decoder_blocks.2.2.act.weight", "face_decoder_blocks.3.0.conv_block.0.weight", "face_decoder_blocks.3.0.co
nv_block.0.bias", "face_decoder_blocks.3.0.conv_block.1.weight", "face_decoder_blocks.3.0.conv_block.1.bias", "face_decoder_blocks.3.0.conv_block.1.runn
ing_mean", "face_decoder_blocks.3.0.conv_block.1.running_var", "face_decoder_blocks.3.0.act.weight", "face_decoder_blocks.3.1.conv_block.0.weight", "fac
e_decoder_blocks.3.1.conv_block.0.bias", "face_decoder_blocks.3.1.conv_block.1.weight", "face_decoder_blocks.3.1.conv_block.1.bias", "face_decoder_block
s.3.1.conv_block.1.running_mean", "face_decoder_blocks.3.1.conv_block.1.running_var", "face_decoder_blocks.3.1.act.weight", "face_decoder_blocks.3.2.con
v_block.0.weight", "face_decoder_blocks.3.2.conv_block.0.bias", "face_decoder_blocks.3.2.conv_block.1.weight", "face_decoder_blocks.3.2.conv_block.1.bia
s", "face_decoder_blocks.3.2.conv_block.1.running_mean", "face_decoder_blocks.3.2.conv_block.1.running_var", "face_decoder_blocks.3.2.act.weight", "face
_decoder_blocks.4.0.conv_block.0.weight", "face_decoder_blocks.4.0.conv_block.0.bias", "face_decoder_blocks.4.0.conv_block.1.weight", "face_decoder_bloc
ks.4.0.conv_block.1.bias", "face_decoder_blocks.4.0.conv_block.1.running_mean", "face_decoder_blocks.4.0.conv_block.1.running_var", "face_decoder_blocks
.4.0.act.weight", "face_decoder_blocks.4.1.conv_block.0.weight", "face_decoder_blocks.4.1.conv_block.0.bias", "face_decoder_blocks.4.1.conv_block.1.weig
ht", "face_decoder_blocks.4.1.conv_block.1.bias", "face_decoder_blocks.4.1.conv_block.1.running_mean", "face_decoder_blocks.4.1.conv_block.1.running_var
", "face_decoder_blocks.4.1.act.weight", "face_decoder_blocks.4.2.conv_block.0.weight", "face_decoder_blocks.4.2.conv_block.0.bias", "face_decoder_block
s.4.2.conv_block.1.weight", "face_decoder_blocks.4.2.conv_block.1.bias", "face_decoder_blocks.4.2.conv_block.1.running_mean", "face_decoder_blocks.4.2.c
onv_block.1.running_var", "face_decoder_blocks.4.2.act.weight", "face_decoder_blocks.5.0.conv_block.0.weight", "face_decoder_blocks.5.0.conv_block.0.bia
s", "face_decoder_blocks.5.0.conv_block.1.weight", "face_decoder_blocks.5.0.conv_block.1.bias", "face_decoder_blocks.5.0.conv_block.1.running_mean", "fa
ce_decoder_blocks.5.0.conv_block.1.running_var", "face_decoder_blocks.5.0.act.weight", "face_decoder_blocks.5.1.conv_block.0.weight", "face_decoder_bloc
ks.5.1.conv_block.0.bias", "face_decoder_blocks.5.1.conv_block.1.weight", "face_decoder_blocks.5.1.conv_block.1.bias", "face_decoder_blocks.5.1.conv_blo
ck.1.running_mean", "face_decoder_blocks.5.1.conv_block.1.running_var", "face_decoder_blocks.5.1.act.weight", "face_decoder_blocks.5.2.conv_block.0.weig
ht", "face_decoder_blocks.5.2.conv_block.0.bias", "face_decoder_blocks.5.2.conv_block.1.weight", "face_decoder_blocks.5.2.conv_block.1.bias", "face_deco
der_blocks.5.2.conv_block.1.running_mean", "face_decoder_blocks.5.2.conv_block.1.running_var", "face_decoder_blocks.5.2.act.weight", "face_decoder_block
s.6.0.conv_block.0.weight", "face_decoder_blocks.6.0.conv_block.0.bias", "face_decoder_blocks.6.0.conv_block.1.weight", "face_decoder_blocks.6.0.conv_bl
ock.1.bias", "face_decoder_blocks.6.0.conv_block.1.running_mean", "face_decoder_blocks.6.0.conv_block.1.running_var", "face_decoder_blocks.6.0.act.weigh
t", "face_decoder_blocks.6.1.conv_block.0.weight", "face_decoder_blocks.6.1.conv_block.0.bias", "face_decoder_blocks.6.1.conv_block.1.weight", "face_dec
oder_blocks.6.1.conv_block.1.bias", "face_decoder_blocks.6.1.conv_block.1.running_mean", "face_decoder_blocks.6.1.conv_block.1.running_var", "face_decod
er_blocks.6.1.act.weight", "face_decoder_blocks.6.2.conv_block.0.weight", "face_decoder_blocks.6.2.conv_block.0.bias", "face_decoder_blocks.6.2.conv_blo
ck.1.weight", "face_decoder_blocks.6.2.conv_block.1.bias", "face_decoder_blocks.6.2.conv_block.1.running_mean", "face_decoder_blocks.6.2.conv_block.1.ru
nning_var", "face_decoder_blocks.6.2.act.weight", "face_decoder_blocks.7.0.conv_block.0.weight", "face_decoder_blocks.7.0.conv_block.0.bias", "face_deco
der_blocks.7.0.conv_block.1.weight", "face_decoder_blocks.7.0.conv_block.1.bias", "face_decoder_blocks.7.0.conv_block.1.running_mean", "face_decoder_blo
cks.7.0.conv_block.1.running_var", "face_decoder_blocks.7.0.act.weight", "face_decoder_blocks.7.1.conv_block.0.weight", "face_decoder_blocks.7.1.conv_bl
ock.0.bias", "face_decoder_blocks.7.1.conv_block.1.weight", "face_decoder_blocks.7.1.conv_block.1.bias", "face_decoder_blocks.7.1.conv_block.1.running_m
ean", "face_decoder_blocks.7.1.conv_block.1.running_var", "face_decoder_blocks.7.1.act.weight", "face_decoder_blocks.7.2.conv_block.0.weight", "face_dec
oder_blocks.7.2.conv_block.0.bias", "face_decoder_blocks.7.2.conv_block.1.weight", "face_decoder_blocks.7.2.conv_block.1.bias", "face_decoder_blocks.7.2
.conv_block.1.running_mean", "face_decoder_blocks.7.2.conv_block.1.running_var", "face_decoder_blocks.7.2.act.weight", "face_decoder_blocks.8.0.conv_blo
ck.0.weight", "face_decoder_blocks.8.0.conv_block.0.bias", "face_decoder_blocks.8.0.conv_block.1.weight", "face_decoder_blocks.8.0.conv_block.1.bias", "
face_decoder_blocks.8.0.conv_block.1.running_mean", "face_decoder_blocks.8.0.conv_block.1.running_var", "face_decoder_blocks.8.0.act.weight", "face_deco
der_blocks.8.1.conv_block.0.weight", "face_decoder_blocks.8.1.conv_block.0.bias", "face_decoder_blocks.8.1.conv_block.1.weight", "face_decoder_blocks.8.
1.conv_block.1.bias", "face_decoder_blocks.8.1.conv_block.1.running_mean", "face_decoder_blocks.8.1.conv_block.1.running_var", "face_decoder_blocks.8.1.
act.weight", "face_decoder_blocks.8.2.conv_block.0.weight", "face_decoder_blocks.8.2.conv_block.0.bias", "face_decoder_blocks.8.2.conv_block.1.weight",
"face_decoder_blocks.8.2.conv_block.1.bias", "face_decoder_blocks.8.2.conv_block.1.running_mean", "face_decoder_blocks.8.2.conv_block.1.running_var", "f
ace_decoder_blocks.8.2.act.weight", "output_block.0.conv_block.0.weight", "output_block.0.conv_block.0.bias", "output_block.0.conv_block.1.weight", "out
put_block.0.conv_block.1.bias", "output_block.0.conv_block.1.running_mean", "output_block.0.conv_block.1.running_var", "output_block.0.act.weight", "output_block.1.weight", "output_block.1.bias".
Unexpected key(s) in state_dict: "binary_pred.0.weight", "binary_pred.0.bias".
size mismatch for face_encoder_blocks.0.0.conv_block.0.weight: copying a param with shape torch.Size([32, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([16, 6, 7, 7]).
size mismatch for face_encoder_blocks.0.0.conv_block.0.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for face_encoder_blocks.1.0.conv_block.0.weight: copying a param with shape torch.Size([64, 32, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 16, 5, 5]).
size mismatch for face_encoder_blocks.1.0.conv_block.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.1.1.conv_block.0.weight: copying a param with shape torch.Size([64, 64, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.1.1.conv_block.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.2.0.conv_block.0.weight: copying a param with shape torch.Size([128, 64, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.2.0.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.2.1.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for face_encoder_blocks.2.1.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for face_encoder_blocks.3.0.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for face_encoder_blocks.3.0.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for face_encoder_blocks.3.1.conv_block.0.weight: copying a param with shape torch.Size([128, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for face_encoder_blocks.3.1.conv_block.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for face_encoder_blocks.4.0.conv_block.0.weight: copying a param with shape torch.Size([256, 128, 5, 5]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for face_encoder_blocks.4.0.conv_block.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for face_encoder_blocks.4.1.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for face_encoder_blocks.4.1.conv_block.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for face_encoder_blocks.5.0.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
size mismatch for face_encoder_blocks.5.1.conv_block.0.weight: copying a param with shape torch.Size([256, 256, 5, 5]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
size mismatch for face_encoder_blocks.7.1.conv_block.0.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).
How can I fix this problem?