connorzl opened this issue 9 months ago
Hi, after following the setup instructions, I tested the model by running the following command:
PYTHONPATH=".":$PYTHONPATH python tools/visualize.py configs/finemogen/finemogen_t2m.py logs/finemogen/finemogen_t2m/latest.pth --text "a person is running quickly" --motion_length 120 --out "test.gif"
I received the following warning:
load checkpoint from local path: logs/finemogen/finemogen_t2m/latest.pth
The model and loaded state dict do not match exactly
missing keys in source state_dict: model.clip.positional_embedding, model.clip.text_projection, model.clip.logit_scale, model.clip.visual.class_embedding, model.clip.visual.positional_embedding, model.clip.visual.proj, model.clip.visual.conv1.weight, model.clip.visual.ln_pre.weight, model.clip.visual.ln_pre.bias, model.clip.visual.transformer.resblocks.0.attn.in_proj_weight, model.clip.visual.transformer.resblocks.0.attn.in_proj_bias, model.clip.visual.transformer.resblocks.0.attn.out_proj.weight, model.clip.visual.transformer.resblocks.0.attn.out_proj.bias, model.clip.visual.transformer.resblocks.0.ln_1.weight, model.clip.visual.transformer.resblocks.0.ln_1.bias, model.clip.visual.transformer.resblocks.0.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.0.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.0.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.0.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.0.ln_2.weight, model.clip.visual.transformer.resblocks.0.ln_2.bias, model.clip.visual.transformer.resblocks.1.attn.in_proj_weight, model.clip.visual.transformer.resblocks.1.attn.in_proj_bias, model.clip.visual.transformer.resblocks.1.attn.out_proj.weight, model.clip.visual.transformer.resblocks.1.attn.out_proj.bias, model.clip.visual.transformer.resblocks.1.ln_1.weight, model.clip.visual.transformer.resblocks.1.ln_1.bias, model.clip.visual.transformer.resblocks.1.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.1.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.1.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.1.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.1.ln_2.weight, model.clip.visual.transformer.resblocks.1.ln_2.bias, model.clip.visual.transformer.resblocks.2.attn.in_proj_weight, model.clip.visual.transformer.resblocks.2.attn.in_proj_bias, model.clip.visual.transformer.resblocks.2.attn.out_proj.weight, model.clip.visual.transformer.resblocks.2.attn.out_proj.bias, model.clip.visual.transformer.resblocks.2.ln_1.weight, model.clip.visual.transformer.resblocks.2.ln_1.bias, model.clip.visual.transformer.resblocks.2.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.2.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.2.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.2.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.2.ln_2.weight, model.clip.visual.transformer.resblocks.2.ln_2.bias, model.clip.visual.transformer.resblocks.3.attn.in_proj_weight, model.clip.visual.transformer.resblocks.3.attn.in_proj_bias, model.clip.visual.transformer.resblocks.3.attn.out_proj.weight, model.clip.visual.transformer.resblocks.3.attn.out_proj.bias, model.clip.visual.transformer.resblocks.3.ln_1.weight, model.clip.visual.transformer.resblocks.3.ln_1.bias, model.clip.visual.transformer.resblocks.3.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.3.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.3.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.3.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.3.ln_2.weight, model.clip.visual.transformer.resblocks.3.ln_2.bias, model.clip.visual.transformer.resblocks.4.attn.in_proj_weight, model.clip.visual.transformer.resblocks.4.attn.in_proj_bias, model.clip.visual.transformer.resblocks.4.attn.out_proj.weight, model.clip.visual.transformer.resblocks.4.attn.out_proj.bias, model.clip.visual.transformer.resblocks.4.ln_1.weight, model.clip.visual.transformer.resblocks.4.ln_1.bias, model.clip.visual.transformer.resblocks.4.mlp.c_fc.weight, 
model.clip.visual.transformer.resblocks.4.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.4.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.4.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.4.ln_2.weight, model.clip.visual.transformer.resblocks.4.ln_2.bias, model.clip.visual.transformer.resblocks.5.attn.in_proj_weight, model.clip.visual.transformer.resblocks.5.attn.in_proj_bias, model.clip.visual.transformer.resblocks.5.attn.out_proj.weight, model.clip.visual.transformer.resblocks.5.attn.out_proj.bias, model.clip.visual.transformer.resblocks.5.ln_1.weight, model.clip.visual.transformer.resblocks.5.ln_1.bias, model.clip.visual.transformer.resblocks.5.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.5.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.5.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.5.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.5.ln_2.weight, model.clip.visual.transformer.resblocks.5.ln_2.bias, model.clip.visual.transformer.resblocks.6.attn.in_proj_weight, model.clip.visual.transformer.resblocks.6.attn.in_proj_bias, model.clip.visual.transformer.resblocks.6.attn.out_proj.weight, model.clip.visual.transformer.resblocks.6.attn.out_proj.bias, model.clip.visual.transformer.resblocks.6.ln_1.weight, model.clip.visual.transformer.resblocks.6.ln_1.bias, model.clip.visual.transformer.resblocks.6.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.6.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.6.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.6.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.6.ln_2.weight, model.clip.visual.transformer.resblocks.6.ln_2.bias, model.clip.visual.transformer.resblocks.7.attn.in_proj_weight, model.clip.visual.transformer.resblocks.7.attn.in_proj_bias, model.clip.visual.transformer.resblocks.7.attn.out_proj.weight, model.clip.visual.transformer.resblocks.7.attn.out_proj.bias, model.clip.visual.transformer.resblocks.7.ln_1.weight, model.clip.visual.transformer.resblocks.7.ln_1.bias, model.clip.visual.transformer.resblocks.7.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.7.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.7.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.7.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.7.ln_2.weight, model.clip.visual.transformer.resblocks.7.ln_2.bias, model.clip.visual.transformer.resblocks.8.attn.in_proj_weight, model.clip.visual.transformer.resblocks.8.attn.in_proj_bias, model.clip.visual.transformer.resblocks.8.attn.out_proj.weight, model.clip.visual.transformer.resblocks.8.attn.out_proj.bias, model.clip.visual.transformer.resblocks.8.ln_1.weight, model.clip.visual.transformer.resblocks.8.ln_1.bias, model.clip.visual.transformer.resblocks.8.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.8.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.8.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.8.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.8.ln_2.weight, model.clip.visual.transformer.resblocks.8.ln_2.bias, model.clip.visual.transformer.resblocks.9.attn.in_proj_weight, model.clip.visual.transformer.resblocks.9.attn.in_proj_bias, model.clip.visual.transformer.resblocks.9.attn.out_proj.weight, model.clip.visual.transformer.resblocks.9.attn.out_proj.bias, model.clip.visual.transformer.resblocks.9.ln_1.weight, model.clip.visual.transformer.resblocks.9.ln_1.bias, model.clip.visual.transformer.resblocks.9.mlp.c_fc.weight, 
model.clip.visual.transformer.resblocks.9.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.9.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.9.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.9.ln_2.weight, model.clip.visual.transformer.resblocks.9.ln_2.bias, model.clip.visual.transformer.resblocks.10.attn.in_proj_weight, model.clip.visual.transformer.resblocks.10.attn.in_proj_bias, model.clip.visual.transformer.resblocks.10.attn.out_proj.weight, model.clip.visual.transformer.resblocks.10.attn.out_proj.bias, model.clip.visual.transformer.resblocks.10.ln_1.weight, model.clip.visual.transformer.resblocks.10.ln_1.bias, model.clip.visual.transformer.resblocks.10.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.10.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.10.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.10.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.10.ln_2.weight, model.clip.visual.transformer.resblocks.10.ln_2.bias, model.clip.visual.transformer.resblocks.11.attn.in_proj_weight, model.clip.visual.transformer.resblocks.11.attn.in_proj_bias, model.clip.visual.transformer.resblocks.11.attn.out_proj.weight, model.clip.visual.transformer.resblocks.11.attn.out_proj.bias, model.clip.visual.transformer.resblocks.11.ln_1.weight, model.clip.visual.transformer.resblocks.11.ln_1.bias, model.clip.visual.transformer.resblocks.11.mlp.c_fc.weight, model.clip.visual.transformer.resblocks.11.mlp.c_fc.bias, model.clip.visual.transformer.resblocks.11.mlp.c_proj.weight, model.clip.visual.transformer.resblocks.11.mlp.c_proj.bias, model.clip.visual.transformer.resblocks.11.ln_2.weight, model.clip.visual.transformer.resblocks.11.ln_2.bias, model.clip.visual.ln_post.weight, model.clip.visual.ln_post.bias, model.clip.transformer.resblocks.0.attn.in_proj_weight, model.clip.transformer.resblocks.0.attn.in_proj_bias, model.clip.transformer.resblocks.0.attn.out_proj.weight, model.clip.transformer.resblocks.0.attn.out_proj.bias, model.clip.transformer.resblocks.0.ln_1.weight, model.clip.transformer.resblocks.0.ln_1.bias, model.clip.transformer.resblocks.0.mlp.c_fc.weight, model.clip.transformer.resblocks.0.mlp.c_fc.bias, model.clip.transformer.resblocks.0.mlp.c_proj.weight, model.clip.transformer.resblocks.0.mlp.c_proj.bias, model.clip.transformer.resblocks.0.ln_2.weight, model.clip.transformer.resblocks.0.ln_2.bias, model.clip.transformer.resblocks.1.attn.in_proj_weight, model.clip.transformer.resblocks.1.attn.in_proj_bias, model.clip.transformer.resblocks.1.attn.out_proj.weight, model.clip.transformer.resblocks.1.attn.out_proj.bias, model.clip.transformer.resblocks.1.ln_1.weight, model.clip.transformer.resblocks.1.ln_1.bias, model.clip.transformer.resblocks.1.mlp.c_fc.weight, model.clip.transformer.resblocks.1.mlp.c_fc.bias, model.clip.transformer.resblocks.1.mlp.c_proj.weight, model.clip.transformer.resblocks.1.mlp.c_proj.bias, model.clip.transformer.resblocks.1.ln_2.weight, model.clip.transformer.resblocks.1.ln_2.bias, model.clip.transformer.resblocks.2.attn.in_proj_weight, model.clip.transformer.resblocks.2.attn.in_proj_bias, model.clip.transformer.resblocks.2.attn.out_proj.weight, model.clip.transformer.resblocks.2.attn.out_proj.bias, model.clip.transformer.resblocks.2.ln_1.weight, model.clip.transformer.resblocks.2.ln_1.bias, model.clip.transformer.resblocks.2.mlp.c_fc.weight, model.clip.transformer.resblocks.2.mlp.c_fc.bias, model.clip.transformer.resblocks.2.mlp.c_proj.weight, model.clip.transformer.resblocks.2.mlp.c_proj.bias, 
model.clip.transformer.resblocks.2.ln_2.weight, model.clip.transformer.resblocks.2.ln_2.bias, model.clip.transformer.resblocks.3.attn.in_proj_weight, model.clip.transformer.resblocks.3.attn.in_proj_bias, model.clip.transformer.resblocks.3.attn.out_proj.weight, model.clip.transformer.resblocks.3.attn.out_proj.bias, model.clip.transformer.resblocks.3.ln_1.weight, model.clip.transformer.resblocks.3.ln_1.bias, model.clip.transformer.resblocks.3.mlp.c_fc.weight, model.clip.transformer.resblocks.3.mlp.c_fc.bias, model.clip.transformer.resblocks.3.mlp.c_proj.weight, model.clip.transformer.resblocks.3.mlp.c_proj.bias, model.clip.transformer.resblocks.3.ln_2.weight, model.clip.transformer.resblocks.3.ln_2.bias, model.clip.transformer.resblocks.4.attn.in_proj_weight, model.clip.transformer.resblocks.4.attn.in_proj_bias, model.clip.transformer.resblocks.4.attn.out_proj.weight, model.clip.transformer.resblocks.4.attn.out_proj.bias, model.clip.transformer.resblocks.4.ln_1.weight, model.clip.transformer.resblocks.4.ln_1.bias, model.clip.transformer.resblocks.4.mlp.c_fc.weight, model.clip.transformer.resblocks.4.mlp.c_fc.bias, model.clip.transformer.resblocks.4.mlp.c_proj.weight, model.clip.transformer.resblocks.4.mlp.c_proj.bias, model.clip.transformer.resblocks.4.ln_2.weight, model.clip.transformer.resblocks.4.ln_2.bias, model.clip.transformer.resblocks.5.attn.in_proj_weight, model.clip.transformer.resblocks.5.attn.in_proj_bias, model.clip.transformer.resblocks.5.attn.out_proj.weight, model.clip.transformer.resblocks.5.attn.out_proj.bias, model.clip.transformer.resblocks.5.ln_1.weight, model.clip.transformer.resblocks.5.ln_1.bias, model.clip.transformer.resblocks.5.mlp.c_fc.weight, model.clip.transformer.resblocks.5.mlp.c_fc.bias, model.clip.transformer.resblocks.5.mlp.c_proj.weight, model.clip.transformer.resblocks.5.mlp.c_proj.bias, model.clip.transformer.resblocks.5.ln_2.weight, model.clip.transformer.resblocks.5.ln_2.bias, model.clip.transformer.resblocks.6.attn.in_proj_weight, model.clip.transformer.resblocks.6.attn.in_proj_bias, model.clip.transformer.resblocks.6.attn.out_proj.weight, model.clip.transformer.resblocks.6.attn.out_proj.bias, model.clip.transformer.resblocks.6.ln_1.weight, model.clip.transformer.resblocks.6.ln_1.bias, model.clip.transformer.resblocks.6.mlp.c_fc.weight, model.clip.transformer.resblocks.6.mlp.c_fc.bias, model.clip.transformer.resblocks.6.mlp.c_proj.weight, model.clip.transformer.resblocks.6.mlp.c_proj.bias, model.clip.transformer.resblocks.6.ln_2.weight, model.clip.transformer.resblocks.6.ln_2.bias, model.clip.transformer.resblocks.7.attn.in_proj_weight, model.clip.transformer.resblocks.7.attn.in_proj_bias, model.clip.transformer.resblocks.7.attn.out_proj.weight, model.clip.transformer.resblocks.7.attn.out_proj.bias, model.clip.transformer.resblocks.7.ln_1.weight, model.clip.transformer.resblocks.7.ln_1.bias, model.clip.transformer.resblocks.7.mlp.c_fc.weight, model.clip.transformer.resblocks.7.mlp.c_fc.bias, model.clip.transformer.resblocks.7.mlp.c_proj.weight, model.clip.transformer.resblocks.7.mlp.c_proj.bias, model.clip.transformer.resblocks.7.ln_2.weight, model.clip.transformer.resblocks.7.ln_2.bias, model.clip.transformer.resblocks.8.attn.in_proj_weight, model.clip.transformer.resblocks.8.attn.in_proj_bias, model.clip.transformer.resblocks.8.attn.out_proj.weight, model.clip.transformer.resblocks.8.attn.out_proj.bias, model.clip.transformer.resblocks.8.ln_1.weight, model.clip.transformer.resblocks.8.ln_1.bias, model.clip.transformer.resblocks.8.mlp.c_fc.weight, 
model.clip.transformer.resblocks.8.mlp.c_fc.bias, model.clip.transformer.resblocks.8.mlp.c_proj.weight, model.clip.transformer.resblocks.8.mlp.c_proj.bias, model.clip.transformer.resblocks.8.ln_2.weight, model.clip.transformer.resblocks.8.ln_2.bias, model.clip.transformer.resblocks.9.attn.in_proj_weight, model.clip.transformer.resblocks.9.attn.in_proj_bias, model.clip.transformer.resblocks.9.attn.out_proj.weight, model.clip.transformer.resblocks.9.attn.out_proj.bias, model.clip.transformer.resblocks.9.ln_1.weight, model.clip.transformer.resblocks.9.ln_1.bias, model.clip.transformer.resblocks.9.mlp.c_fc.weight, model.clip.transformer.resblocks.9.mlp.c_fc.bias, model.clip.transformer.resblocks.9.mlp.c_proj.weight, model.clip.transformer.resblocks.9.mlp.c_proj.bias, model.clip.transformer.resblocks.9.ln_2.weight, model.clip.transformer.resblocks.9.ln_2.bias, model.clip.transformer.resblocks.10.attn.in_proj_weight, model.clip.transformer.resblocks.10.attn.in_proj_bias, model.clip.transformer.resblocks.10.attn.out_proj.weight, model.clip.transformer.resblocks.10.attn.out_proj.bias, model.clip.transformer.resblocks.10.ln_1.weight, model.clip.transformer.resblocks.10.ln_1.bias, model.clip.transformer.resblocks.10.mlp.c_fc.weight, model.clip.transformer.resblocks.10.mlp.c_fc.bias, model.clip.transformer.resblocks.10.mlp.c_proj.weight, model.clip.transformer.resblocks.10.mlp.c_proj.bias, model.clip.transformer.resblocks.10.ln_2.weight, model.clip.transformer.resblocks.10.ln_2.bias, model.clip.transformer.resblocks.11.attn.in_proj_weight, model.clip.transformer.resblocks.11.attn.in_proj_bias, model.clip.transformer.resblocks.11.attn.out_proj.weight, model.clip.transformer.resblocks.11.attn.out_proj.bias, model.clip.transformer.resblocks.11.ln_1.weight, model.clip.transformer.resblocks.11.ln_1.bias, model.clip.transformer.resblocks.11.mlp.c_fc.weight, model.clip.transformer.resblocks.11.mlp.c_fc.bias, model.clip.transformer.resblocks.11.mlp.c_proj.weight, model.clip.transformer.resblocks.11.mlp.c_proj.bias, model.clip.transformer.resblocks.11.ln_2.weight, model.clip.transformer.resblocks.11.ln_2.bias, model.clip.token_embedding.weight, model.clip.ln_final.weight, model.clip.ln_final.bias
I also had to comment out two lines in the plotting code to get the command to run; otherwise I would get plotting errors.
However, the output seems to be a blank GIF with only the text caption.
You can ignore this warning. In the current implementation, we don't store the frozen CLIP weights during training, in order to reduce the size of the checkpoint file. Therefore, when loading the state_dict, PyTorch cannot find the corresponding weights. Instead, we load this part of the weights during initialization.
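For context, here is a minimal sketch of the pattern described above. The helper names and the ViT-B/32 variant are assumptions for illustration, not the repo's actual code:

```python
import clip  # OpenAI CLIP package
import torch


def save_checkpoint_without_clip(model, path):
    # Drop the frozen CLIP weights so the checkpoint stays small; these
    # are exactly the "missing keys" reported in the warning above.
    state = {k: v for k, v in model.state_dict().items()
             if not k.startswith("clip.")}
    torch.save({"state_dict": state}, path)


def attach_frozen_clip(model, device="cpu"):
    # Hypothetical init-time step: restore the frozen encoder from the
    # official CLIP release, so it never has to live in the checkpoint.
    clip_model, _ = clip.load("ViT-B/32", device=device)
    model.clip = clip_model.float()
    for p in model.clip.parameters():
        p.requires_grad = False
    return model
```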
Could you share the detailed log of the plotting error? It seems that you cannot simply remove those two lines.
Hi @connorzl ! I faced the same problem when testing the model. Have you solved the problem? Many thanks.
Hi, this issue comes from the matplotlib version; you can run `pip install matplotlib==3.4.3`. I solved it that way.
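For what it's worth, a guess at the underlying incompatibility: many motion-visualization scripts clear each animation frame with `ax.lines = []`, and in matplotlib >= 3.5 `Axes.lines` became a read-only artist list, so that assignment raises an AttributeError. Below is a minimal sketch of a version-agnostic alternative (whether this repo's plotting code uses that idiom is an assumption):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display required
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])

# Old idiom (works on matplotlib <= 3.4.x only):
#   ax.lines = []
#   ax.collections = []

# Version-agnostic replacement that clears the same artists:
for artist in list(ax.lines) + list(ax.collections):
    artist.remove()
```

Pinning `matplotlib==3.4.3` as suggested above avoids the problem without touching the code.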