```
RuntimeError: Error(s) in loading state_dict for CLIP:
Missing key(s) in state_dict: "visual.transformer.resblocks.32.ln_1.weight", "visual.transformer.resblocks.32.ln_1.bias", "visual.transformer.resblocks.32.attn.in_proj_weight", "visual.transformer.resblocks.32.attn.in_proj_bias", "visual.transformer.resblocks.32.attn.out_proj.weight", "visual.transformer.resblocks.32.attn.out_proj.bias",
......
"transformer.resblocks.31.mlp.c_proj.weight", "transformer.resblocks.31.mlp.c_proj.bias".
size mismatch for positional_embedding: copying a param with shape torch.Size([77, 1024]) from checkpoint, the shape in current model is torch.Size([77, 1280]).
size mismatch for text_projection: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
......
```
When I ran the example code in README.md, I ran into a strange problem.
The error occurred at the following line, File "ControlNet-XS\sgm\modules\encoders\modules.py", line 428:
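The line in question is the OpenCLIP loading call. Roughly like this (a sketch assuming the stock sgm `FrozenOpenCLIPEmbedder2`; the actual arch and version strings come from the config):

```python
import torch
import open_clip

# sketch of the call around modules.py line 428 (FrozenOpenCLIPEmbedder2);
# arch/version below are OpenCLIP defaults, not necessarily what the
# ControlNet-XS config passes in
arch = "ViT-H-14"
version = "laion2b_s32b_b79k"

# downloads the pretrained weights on first use, which is where the
# original "strange problem" appeared for me
model, _, _ = open_clip.create_model_and_transforms(
    arch, device=torch.device("cpu"), pretrained=version
)
```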
To work around it, I downloaded the laion/CLIP-ViT-H-14-laion2B-s32B-b79K model manually, put it in a local directory, and loaded the model from there:
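Roughly like this (a sketch; the checkpoint path is a placeholder for wherever I saved the downloaded files — open_clip also accepts a local file path for `pretrained`):

```python
import torch
import open_clip

# placeholder path: wherever I saved the manually downloaded
# CLIP-ViT-H-14-laion2B-s32B-b79K checkpoint
local_ckpt = "./checkpoints/CLIP-ViT-H-14-laion2B-s32B-b79K/open_clip_pytorch_model.bin"

# open_clip accepts a local file path for `pretrained`, so nothing
# is downloaded here
model, _, _ = open_clip.create_model_and_transforms(
    "ViT-H-14", device=torch.device("cpu"), pretrained=local_ckpt
)
```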
Then I hit the error pasted at the top of this post, which I could not fix.
I wonder whether the problem is with the model file itself or with something else.
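In case it helps, here is a quick sanity check to see where the checkpoint and the freshly built model disagree (my own sketch, not code from the repo; replace the arch string and the path with whatever the config actually uses):

```python
import torch
import open_clip

# build the bare architecture (random init, no download); replace the
# arch string with whatever the ControlNet-XS config actually passes
model, _, _ = open_clip.create_model_and_transforms("ViT-H-14")
expected = model.state_dict()

# placeholder path to the manually downloaded checkpoint
ckpt = torch.load("open_clip_pytorch_model.bin", map_location="cpu")
ckpt = ckpt.get("state_dict", ckpt)  # some checkpoints nest the weights

missing = [k for k in expected if k not in ckpt]
mismatched = [
    (k, tuple(ckpt[k].shape), tuple(expected[k].shape))
    for k in expected
    if k in ckpt and ckpt[k].shape != expected[k].shape
]
print(f"{len(missing)} missing keys, {len(mismatched)} shape mismatches")
for k, have, want in mismatched[:5]:
    print(f"  {k}: checkpoint {have} vs model {want}")
```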