Closed: MikeQ-hash closed this issue 1 year ago
Hi @MikeQ-hash, thank you for using mmf :)
We are currently working to consolidate all model interfaces. Do you mind sharing what your model folder contains under $DATA_DIR (typically ~/.cache/torch/mmf/data/models/visual_bert.pretrained.hateful_memes)?
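One quick way to report what that folder holds is to list it from Python. This is only a minimal sketch: `list_model_folder` and `DEFAULT_MODEL_DIR` are hypothetical names that are not part of MMF, and the path is simply the typical location quoted above, so adjust it if your $DATA_DIR differs.

```python
from pathlib import Path

# Assumption: the typical cache location mentioned above; change it if
# your $DATA_DIR points somewhere else.
DEFAULT_MODEL_DIR = "~/.cache/torch/mmf/data/models/visual_bert.pretrained.hateful_memes"


def list_model_folder(folder):
    """Return the sorted relative paths of all files found under `folder`.

    An empty list is returned when the folder does not exist, so a missing
    download is easy to tell apart from a crash.
    """
    root = Path(folder).expanduser()
    if not root.is_dir():
        return []
    # Walk the folder recursively and keep regular files only.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

Calling `list_model_folder(DEFAULT_MODEL_DIR)` and pasting the result into the thread should be enough to show whether a model file and a config are both present.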
In that folder, you'd need both the model file and also a config - for example:
Hi @ytsheng, thank you for your reply. I have both of these files available. I am just not certain how to properly call them in the Python code. I am trying to do this with MMBT first because it is easier to check directly from an image (VisualBERT requires additional pre-processing). My code is based on the posted notebook:
from mmf.utils.env import setup_imports
setup_imports()
import matplotlib.pyplot as plt
import requests
import torch
from PIL import Image
from mmf.common.registry import registry
import pdb
from mmf.models.mmbt import MMBT
#from mmf.models.visual_bert import VisualBERT
filename = 'mmbt_stuff/save/mmbt_final.pth'
model = MMBT.from_pretrained("mmbt.hateful_memes.images")
checkpoint = torch.load(filename)
model.load_state_dict(checkpoint)  # this call raises the RuntimeError shown below
# optimizer.load_state_dict(checkpoint)  # `optimizer` is never defined, and it is not needed for inference
image_url = "https://i.imgur.com/tEcsk5q.jpg"
text = "Something"
output = model.classify(image_url, text)
plt.imshow(Image.open(requests.get(image_url, stream=True).raw))
plt.axis("off")
plt.show()
hateful = "Yes" if output["label"] == 1 else "No"
print("Hateful as per the model?", hateful)
print(f"Model's confidence: {output['confidence'] * 100:.3f}%")
This is the error message:
Missing keys ['model.bert.mmbt.transformer.embeddings.position_ids'] in the checkpoint.
If this is not your checkpoint, please open up an issue on MMF GitHub.
Unexpected keys if any: []
Traceback (most recent call last):
File "final.py", line 15, in <module>
model.load_state_dict(checkpoint)
File "/home/michael/miniconda3/envs/hackathon2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for MMBTGridHMInterface:
Missing key(s) in state_dict: "model.model.bert.mmbt.transformer.embeddings.position_ids", "model.model.bert.mmbt.transformer.embeddings.word_embeddings.weight", "model.model.bert.mmbt.transformer.embeddings.position_embeddings.weight", "model.model.bert.mmbt.transformer.embeddings.token_type_embeddings.weight", "model.model.bert.mmbt.transformer.embeddings.LayerNorm.weight", "model.model.bert.mmbt.transformer.embeddings.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.0.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.0.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.self.key.weight", 
"model.model.bert.mmbt.transformer.encoder.layer.1.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.1.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.1.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.intermediate.dense.bias", 
"model.model.bert.mmbt.transformer.encoder.layer.2.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.2.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.2.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.3.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.3.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.self.value.weight", 
"model.model.bert.mmbt.transformer.encoder.layer.4.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.4.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.4.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.5.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.output.dense.bias", 
"model.model.bert.mmbt.transformer.encoder.layer.5.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.5.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.6.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.6.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.output.dense.weight", 
"model.model.bert.mmbt.transformer.encoder.layer.7.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.7.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.7.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.8.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.8.output.LayerNorm.bias", 
"model.model.bert.mmbt.transformer.encoder.layer.9.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.9.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.9.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.attention.output.LayerNorm.weight", 
"model.model.bert.mmbt.transformer.encoder.layer.10.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.10.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.10.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.self.query.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.self.query.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.self.key.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.self.key.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.self.value.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.self.value.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.attention.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.intermediate.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.intermediate.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.output.dense.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.output.dense.bias", "model.model.bert.mmbt.transformer.encoder.layer.11.output.LayerNorm.weight", "model.model.bert.mmbt.transformer.encoder.layer.11.output.LayerNorm.bias", "model.model.bert.mmbt.transformer.pooler.dense.weight", "model.model.bert.mmbt.transformer.pooler.dense.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.0.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.0.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.running_mean", 
"model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.0.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.running_mean", 
"model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.running_var", 
"model.model.bert.mmbt.modal_encoder.encoder.model.5.5.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.bias", 
"model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.0.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.running_mean", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.running_var", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.6.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.bias", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.conv1.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.running_mean", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.running_var", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.15.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.bias", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.conv2.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.running_mean", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.running_var", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.25.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.bias", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.conv3.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.running_mean", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.weight", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.running_var", 
"model.model.bert.mmbt.modal_encoder.encoder.model.6.34.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.bias", 
"model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.0.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.running_var", 
"model.model.bert.mmbt.modal_encoder.encoder.model.7.1.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.conv1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.conv2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.running_var", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.conv3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.weight", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.bias", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.running_mean", "model.model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.running_var", "model.model.bert.mmbt.modal_encoder.proj_embeddings.weight", "model.model.bert.mmbt.modal_encoder.proj_embeddings.bias", "model.model.bert.mmbt.modal_encoder.position_embeddings.weight", "model.model.bert.mmbt.modal_encoder.token_type_embeddings.weight", "model.model.bert.mmbt.modal_encoder.word_embeddings.weight", "model.model.bert.mmbt.modal_encoder.LayerNorm.weight", "model.model.bert.mmbt.modal_encoder.LayerNorm.bias", "model.model.classifier.0.dense.weight", "model.model.classifier.0.dense.bias", "model.model.classifier.0.LayerNorm.weight", "model.model.classifier.0.LayerNorm.bias", 
"model.model.classifier.1.weight", "model.model.classifier.1.bias".
Unexpected key(s) in state_dict: "model.bert.mmbt.transformer.embeddings.position_ids", "model.bert.mmbt.transformer.embeddings.word_embeddings.weight", "model.bert.mmbt.transformer.embeddings.position_embeddings.weight", "model.bert.mmbt.transformer.embeddings.token_type_embeddings.weight", "model.bert.mmbt.transformer.embeddings.LayerNorm.weight", "model.bert.mmbt.transformer.embeddings.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.0.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.0.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.0.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.0.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.0.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.0.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.0.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.0.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.0.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.0.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.0.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.0.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.0.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.0.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.0.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.0.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.1.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.1.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.1.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.1.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.1.attention.self.value.weight", 
"model.bert.mmbt.transformer.encoder.layer.1.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.1.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.1.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.1.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.1.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.1.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.1.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.1.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.1.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.1.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.1.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.2.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.2.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.2.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.2.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.2.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.2.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.2.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.2.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.2.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.2.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.2.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.2.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.2.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.2.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.2.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.2.output.LayerNorm.bias", 
"model.bert.mmbt.transformer.encoder.layer.3.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.3.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.3.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.3.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.3.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.3.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.3.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.3.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.3.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.3.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.3.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.3.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.3.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.3.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.3.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.3.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.4.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.4.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.4.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.4.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.4.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.4.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.4.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.4.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.4.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.4.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.4.intermediate.dense.weight", 
"model.bert.mmbt.transformer.encoder.layer.4.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.4.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.4.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.4.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.4.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.5.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.5.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.5.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.5.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.5.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.5.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.5.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.5.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.5.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.5.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.5.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.5.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.5.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.5.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.5.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.5.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.6.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.6.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.6.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.6.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.6.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.6.attention.self.value.bias", 
"model.bert.mmbt.transformer.encoder.layer.6.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.6.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.6.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.6.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.6.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.6.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.6.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.6.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.6.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.6.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.7.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.7.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.7.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.7.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.7.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.7.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.7.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.7.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.7.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.7.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.7.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.7.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.7.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.7.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.7.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.7.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.8.attention.self.query.weight", 
"model.bert.mmbt.transformer.encoder.layer.8.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.8.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.8.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.8.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.8.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.8.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.8.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.8.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.8.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.8.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.8.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.8.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.8.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.8.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.8.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.9.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.9.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.9.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.9.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.9.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.9.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.9.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.9.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.9.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.9.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.9.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.9.intermediate.dense.bias", 
"model.bert.mmbt.transformer.encoder.layer.9.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.9.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.9.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.9.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.10.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.10.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.10.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.10.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.10.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.10.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.10.attention.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.10.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.10.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.10.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.10.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.10.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.10.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.10.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.10.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.10.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.11.attention.self.query.weight", "model.bert.mmbt.transformer.encoder.layer.11.attention.self.query.bias", "model.bert.mmbt.transformer.encoder.layer.11.attention.self.key.weight", "model.bert.mmbt.transformer.encoder.layer.11.attention.self.key.bias", "model.bert.mmbt.transformer.encoder.layer.11.attention.self.value.weight", "model.bert.mmbt.transformer.encoder.layer.11.attention.self.value.bias", "model.bert.mmbt.transformer.encoder.layer.11.attention.output.dense.weight", 
"model.bert.mmbt.transformer.encoder.layer.11.attention.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.11.attention.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.11.attention.output.LayerNorm.bias", "model.bert.mmbt.transformer.encoder.layer.11.intermediate.dense.weight", "model.bert.mmbt.transformer.encoder.layer.11.intermediate.dense.bias", "model.bert.mmbt.transformer.encoder.layer.11.output.dense.weight", "model.bert.mmbt.transformer.encoder.layer.11.output.dense.bias", "model.bert.mmbt.transformer.encoder.layer.11.output.LayerNorm.weight", "model.bert.mmbt.transformer.encoder.layer.11.output.LayerNorm.bias", "model.bert.mmbt.transformer.pooler.dense.weight", "model.bert.mmbt.transformer.pooler.dense.bias", "model.bert.mmbt.modal_encoder.encoder.model.0.weight", "model.bert.mmbt.modal_encoder.encoder.model.1.weight", "model.bert.mmbt.modal_encoder.encoder.model.1.bias", "model.bert.mmbt.modal_encoder.encoder.model.1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.0.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.0.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.0.conv3.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.0.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.0.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.0.downsample.1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.1.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.1.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.1.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.1.bn3.num_batches_tracked", 
"model.bert.mmbt.modal_encoder.encoder.model.4.2.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.2.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.4.2.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.4.2.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.0.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.0.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn2.num_batches_tracked", 
"model.bert.mmbt.modal_encoder.encoder.model.5.0.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.0.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.0.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.0.downsample.1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.1.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.1.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.1.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.5.1.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.2.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.2.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.2.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.2.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.3.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.3.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.5.3.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.3.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.3.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.4.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.4.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.4.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.4.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.5.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.5.5.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.5.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.5.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.5.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.6.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.6.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.6.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.5.6.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.7.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.7.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.5.7.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.5.7.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.0.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.0.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.6.0.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.0.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.0.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.0.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.0.downsample.1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.1.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.1.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.1.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.1.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.2.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.2.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.2.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.2.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.3.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.3.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.3.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.3.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.4.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.4.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.4.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.4.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.5.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.5.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.5.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.5.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.6.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.6.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.6.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.6.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.7.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.7.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.7.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.7.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.8.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.8.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.8.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.8.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.9.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.9.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.9.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.9.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.10.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.10.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.10.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.10.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.11.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.11.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.11.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.bias", 
"model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.11.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.12.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.12.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.12.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.12.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.13.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.13.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.13.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.13.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.14.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.14.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.14.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.14.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.15.conv1.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.15.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.15.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.15.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.16.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.16.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn2.num_batches_tracked", 
"model.bert.mmbt.modal_encoder.encoder.model.6.16.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.16.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.17.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.17.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.17.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.17.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.18.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.6.18.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.18.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.18.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.18.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.19.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.19.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.19.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.19.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.20.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.20.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.20.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.20.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.21.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.21.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.bias", 
"model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.21.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.21.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.22.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.22.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.22.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.22.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.23.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.23.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.23.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.23.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.24.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.24.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.24.conv3.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.24.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.25.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.25.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.25.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.25.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.26.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn1.num_batches_tracked", 
"model.bert.mmbt.modal_encoder.encoder.model.6.26.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.26.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.26.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.27.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.27.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.27.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.6.27.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.28.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.28.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.28.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.28.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.29.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.29.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.running_mean", 
"model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.29.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.29.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.30.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.30.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.30.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.30.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.31.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.bias", 
"model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.31.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.31.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.31.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.32.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.32.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.32.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.32.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.33.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.33.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.33.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.33.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.34.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.34.conv2.weight", 
"model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.34.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.34.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.35.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.35.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.6.35.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.6.35.bn3.num_batches_tracked", 
"model.bert.mmbt.modal_encoder.encoder.model.7.0.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.0.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.0.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.0.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.0.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.0.downsample.1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.1.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.7.1.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.1.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.1.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.1.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.2.conv1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn1.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.2.conv2.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.running_var", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn2.num_batches_tracked", "model.bert.mmbt.modal_encoder.encoder.model.7.2.conv3.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.weight", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.bias", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.running_mean", "model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.running_var", 
"model.bert.mmbt.modal_encoder.encoder.model.7.2.bn3.num_batches_tracked", "model.bert.mmbt.modal_encoder.proj_embeddings.weight", "model.bert.mmbt.modal_encoder.proj_embeddings.bias", "model.bert.mmbt.modal_encoder.position_embeddings.weight", "model.bert.mmbt.modal_encoder.token_type_embeddings.weight", "model.bert.mmbt.modal_encoder.word_embeddings.weight", "model.bert.mmbt.modal_encoder.LayerNorm.weight", "model.bert.mmbt.modal_encoder.LayerNorm.bias", "model.classifier.0.dense.weight", "model.classifier.0.dense.bias", "model.classifier.0.LayerNorm.weight", "model.classifier.0.LayerNorm.bias", "model.classifier.1.weight", "model.classifier.1.bias".
Can you try calling
model.model.load_state_dict(checkpoint)
instead of
model.load_state_dict(checkpoint)
in your code? Here checkpoint is the state dict returned by torch.load("your_path"), not the path itself.
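The keys listed in the error all start with `model.`, which suggests the checkpoint stores the same names without that prefix. A minimal sketch of that diagnosis in plain Python follows; `diff_keys` and the shortened key names are hypothetical illustrations, not part of MMF or PyTorch.

```python
def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected), similar to what a strict state-dict load reports."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # expected by the model, absent from the checkpoint
    unexpected = sorted(ckpt_keys - model_keys)   # present in the checkpoint, unknown to the model
    return missing, unexpected


# Hypothetical key names, shortened from the ones in the error message above.
wrapper_keys = ["model.classifier.1.weight", "model.classifier.1.bias"]
ckpt_keys = ["classifier.1.weight", "classifier.1.bias"]

missing, unexpected = diff_keys(wrapper_keys, ckpt_keys)
print(missing)     # every wrapper key is reported missing
print(unexpected)  # every checkpoint key is reported unexpected

# Comparing against the inner module's view (prefix stripped) gives a clean match.
inner_keys = [k[len("model."):] for k in wrapper_keys]
print(diff_keys(inner_keys, ckpt_keys))
```

If the real key sets differ only by that prefix, loading into the inner module should line the names up.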
Hi all,
I managed to make it work (I double-checked the performance on the test set).
Here is the code for MMBT (VisualBERT requires its own pre-processing):
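filename_pth = "./save/mmbt_final.pth"  # path to your fine-tuned checkpoint
model = MMBT("./save/config.yaml").from_pretrained("mmbt.hateful_memes.images")  # note your config path
ckpt = torch.load(filename_pth)
own_state = model.state_dict()
loaded = 0
for name, param in ckpt.items():
    # Checkpoint keys lack the "model." prefix used by the wrapper's state dict.
    name = "model." + name
    if name not in own_state:
        print("fail", name)
        continue
    own_state[name].copy_(param)  # copy the fine-tuned weights in place
    loaded += 1
    print("success")
print(loaded)  # number of tensors copied
filename = "something.jpg"  # placeholder: local path to your meme image
text = "something"  # placeholder: the text on the meme
img = Image.open(filename)  # needs `from PIL import Image`, as in the earlier snippet
output = model.classify(filename, text)
plt.imshow(img)
plt.axis("off")
plt.show()
hateful = "Yes" if output["label"] == 1 else "No"
print("Hateful as per the model?", hateful)
print(f"Model's confidence: {output['confidence'] * 100:.3f}%")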
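The renaming loop can also be written as a small helper that returns what was matched and what was skipped instead of printing. This is only a sketch under assumptions: `remap_with_prefix` is a hypothetical name, and the values are plain strings standing in for tensors.

```python
def remap_with_prefix(ckpt, own_keys, prefix="model."):
    """Prefix checkpoint keys and split them into loadable and skipped entries."""
    own_keys = set(own_keys)
    remapped, skipped = {}, []
    for name, value in ckpt.items():
        new_name = prefix + name
        if new_name in own_keys:
            remapped[new_name] = value
        else:
            skipped.append(name)
    return remapped, skipped


# Hypothetical stand-ins: real state dicts map names to tensors, not strings.
ckpt = {"classifier.1.weight": "W", "classifier.1.bias": "b", "optimizer.step": 3}
own_keys = ["model.classifier.1.weight", "model.classifier.1.bias"]

remapped, skipped = remap_with_prefix(ckpt, own_keys)
print(sorted(remapped))  # keys that now match the wrapper's names
print(skipped)           # checkpoint entries with no counterpart in the model
```

With real tensors, `remapped` could then be passed to `model.load_state_dict(remapped, strict=False)`, which also reports any keys the model still expects.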
❓ Questions and Help
Hi, I have further trained the VisualBERT model on Hateful Memes via the command line, as shown in the documentation. However, I am having difficulty understanding how to load that model from Python code.
For example:
model = MMBT.from_pretrained("mmbt.hateful_memes.images")
This works, but:
model = VisualBERT.from_pretrained('visual_bert.pretrained.hateful_memes')
This does not work. In addition, I tried adding the trained file 'visual_bert_final.pth', but I am uncertain how to load it.
I have not found any documentation on this. The error message I get is:
Traceback (most recent call last):
File "try1.py", line 13, in <module>
model = VisualBERT.from_pretrained('visual_bert.pretrained.hateful_memes')
File "/home/michael/hackathon/mmf/mmf/models/base_model.py", line 223, in from_pretrained
output = load_pretrained_model(model_name_or_path, *args, **kwargs)
File "/home/michael/hackathon/mmf/mmf/utils/checkpoint.py", line 117, in load_pretrained_model
return _load_pretrained_model(model_name_or_path_or_checkpoint, *args, **kwargs)
File "/home/michael/hackathon/mmf/mmf/utils/checkpoint.py", line 71, in _load_pretrained_model
download_path = download_pretrained_model(model_name_or_path, *args, **kwargs)
File "/home/michael/hackathon/mmf/mmf/utils/download.py", line 352, in download_pretrained_model
if "version" not in model_config or "resources" not in model_config:
TypeError: argument of type 'NoneType' is not iterable
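The last frame fails on `"version" not in model_config`, so `model_config` must have been `None` at that point. One plausible cause (an assumption, not confirmed in this thread) is that the requested key is not registered in the model zoo. The sketch below reproduces that failure mode in plain Python; `ZOO`, `lookup`, and `check_entry` are hypothetical stand-ins, not MMF code.

```python
# Hypothetical stand-in for a model zoo: a nested mapping addressed by dotted keys.
ZOO = {"mmbt": {"hateful_memes": {"images": {"version": "1.0", "resources": []}}}}


def lookup(zoo, dotted_key):
    """Walk a dotted key through nested dicts; return None if any part is missing."""
    node = zoo
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_entry(zoo, dotted_key):
    """Same membership test as in the traceback, with an explicit None guard first."""
    entry = lookup(zoo, dotted_key)
    if entry is None:
        raise KeyError(f"{dotted_key!r} is not registered in the zoo")
    return "version" in entry and "resources" in entry


print(check_entry(ZOO, "mmbt.hateful_memes.images"))  # entry exists

missing = lookup(ZOO, "visual_bert.pretrained.hateful_memes")
try:
    "version" not in missing  # membership test on None
except TypeError as err:
    print("TypeError:", err)
```

A guard like the one in `check_entry` turns the opaque `TypeError` into a message that names the missing key.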