adalisan opened 5 years ago
I'm having a similar issue:

```
06/29/2020 20:52:32 - INFO - __main__ - device: cuda n_gpu: 4, distributed training: False, 16-bits training: False
06/29/2020 20:52:32 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/ubuntu/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
06/29/2020 20:52:33 - INFO - vilbert.task_utils - Loading RetrievalFlickr30k Dataset with batch size 1
06/29/2020 20:52:37 - INFO - vilbert.vilbert - loading archive file save/bert_base_6_layer_6_connect/pytorch_model_9.bin
06/29/2020 20:52:37 - INFO - vilbert.vilbert - Model config {
  "attention_probs_dropout_prob": 0.1,
  "bi_attention_type": 1,
  "bi_hidden_size": 1024,
  "bi_intermediate_size": 1024,
  "bi_num_attention_heads": 8,
  "fast_mode": true,
  "fixed_t_layer": 0,
  "fixed_v_layer": 0,
  "fusion_method": "mul",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "in_batch_pairs": false,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "intra_gate": false,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooling_method": "mul",
  "predict_feature": false,
  "t_biattention_id": [6, 7, 8, 9, 10, 11],
  "type_vocab_size": 2,
  "v_attention_probs_dropout_prob": 0.1,
  "v_biattention_id": [0, 1, 2, 3, 4, 5],
  "v_feature_size": 2048,
  "v_hidden_act": "gelu",
  "v_hidden_dropout_prob": 0.1,
  "v_hidden_size": 1024,
  "v_initializer_range": 0.02,
  "v_intermediate_size": 1024,
  "v_num_attention_heads": 8,
  "v_num_hidden_layers": 6,
  "v_target_size": 1601,
  "vocab_size": 30522,
  "with_coattention": true
}
model's option for predict_feature is False
06/29/2020 20:52:44 - INFO - vilbert.vilbert - Weights from pretrained model not used in BertForMultiModalPreTraining: ['bert.encoder.v_layer.6.attention.self.query.weight', 'bert.encoder.v_layer.6.attention.self.query.bias', 'bert.encoder.v_layer.6.attention.self.key.weight', 'bert.encoder.v_layer.6.attention.self.key.bias', 'bert.encoder.v_layer.6.attention.self.value.weight', 'bert.encoder.v_layer.6.attention.self.value.bias', 'bert.encoder.v_layer.6.attention.output.dense.weight', 'bert.encoder.v_layer.6.attention.output.dense.bias', 'bert.encoder.v_layer.6.attention.output.LayerNorm.weight', 'bert.encoder.v_layer.6.attention.output.LayerNorm.bias', 'bert.encoder.v_layer.6.intermediate.dense.weight', 'bert.encoder.v_layer.6.intermediate.dense.bias', 'bert.encoder.v_layer.6.output.dense.weight', 'bert.encoder.v_layer.6.output.dense.bias', 'bert.encoder.v_layer.6.output.LayerNorm.weight', 'bert.encoder.v_layer.6.output.LayerNorm.bias', 'bert.encoder.v_layer.7.attention.self.query.weight', 'bert.encoder.v_layer.7.attention.self.query.bias', 'bert.encoder.v_layer.7.attention.self.key.weight', 'bert.encoder.v_layer.7.attention.self.key.bias', 'bert.encoder.v_layer.7.attention.self.value.weight', 'bert.encoder.v_layer.7.attention.self.value.bias', 'bert.encoder.v_layer.7.attention.output.dense.weight', 'bert.encoder.v_layer.7.attention.output.dense.bias', 'bert.encoder.v_layer.7.attention.output.LayerNorm.weight', 'bert.encoder.v_layer.7.attention.output.LayerNorm.bias', 'bert.encoder.v_layer.7.intermediate.dense.weight', 'bert.encoder.v_layer.7.intermediate.dense.bias', 'bert.encoder.v_layer.7.output.dense.weight', 'bert.encoder.v_layer.7.output.dense.bias', 'bert.encoder.v_layer.7.output.LayerNorm.weight', 'bert.encoder.v_layer.7.output.LayerNorm.bias', 'bert.encoder.c_layer.6.biattention.query1.weight', 'bert.encoder.c_layer.6.biattention.query1.bias', 'bert.encoder.c_layer.6.biattention.key1.weight', 'bert.encoder.c_layer.6.biattention.key1.bias', 'bert.encoder.c_layer.6.biattention.value1.weight', 'bert.encoder.c_layer.6.biattention.value1.bias', 'bert.encoder.c_layer.6.biattention.query2.weight', 'bert.encoder.c_layer.6.biattention.query2.bias', 'bert.encoder.c_layer.6.biattention.key2.weight', 'bert.encoder.c_layer.6.biattention.key2.bias', 'bert.encoder.c_layer.6.biattention.value2.weight', 'bert.encoder.c_layer.6.biattention.value2.bias', 'bert.encoder.c_layer.6.biOutput.dense1.weight', 'bert.encoder.c_layer.6.biOutput.dense1.bias', 'bert.encoder.c_layer.6.biOutput.LayerNorm1.weight', 'bert.encoder.c_layer.6.biOutput.LayerNorm1.bias', 'bert.encoder.c_layer.6.biOutput.q_dense1.weight', 'bert.encoder.c_layer.6.biOutput.q_dense1.bias', 'bert.encoder.c_layer.6.biOutput.dense2.weight', 'bert.encoder.c_layer.6.biOutput.dense2.bias', 'bert.encoder.c_layer.6.biOutput.LayerNorm2.weight', 'bert.encoder.c_layer.6.biOutput.LayerNorm2.bias', 'bert.encoder.c_layer.6.biOutput.q_dense2.weight', 'bert.encoder.c_layer.6.biOutput.q_dense2.bias', 'bert.encoder.c_layer.6.v_intermediate.dense.weight', 'bert.encoder.c_layer.6.v_intermediate.dense.bias', 'bert.encoder.c_layer.6.v_output.dense.weight', 'bert.encoder.c_layer.6.v_output.dense.bias', 'bert.encoder.c_layer.6.v_output.LayerNorm.weight', 'bert.encoder.c_layer.6.v_output.LayerNorm.bias', 'bert.encoder.c_layer.6.t_intermediate.dense.weight', 'bert.encoder.c_layer.6.t_intermediate.dense.bias', 'bert.encoder.c_layer.6.t_output.dense.weight', 'bert.encoder.c_layer.6.t_output.dense.bias', 'bert.encoder.c_layer.6.t_output.LayerNorm.weight', 
'bert.encoder.c_layer.6.t_output.LayerNorm.bias', 'bert.encoder.c_layer.7.biattention.query1.weight', 'bert.encoder.c_layer.7.biattention.query1.bias', 'bert.encoder.c_layer.7.biattention.key1.weight', 'bert.encoder.c_layer.7.biattention.key1.bias', 'bert.encoder.c_layer.7.biattention.value1.weight', 'bert.encoder.c_layer.7.biattention.value1.bias', 'bert.encoder.c_layer.7.biattention.query2.weight', 'bert.encoder.c_layer.7.biattention.query2.bias', 'bert.encoder.c_layer.7.biattention.key2.weight', 'bert.encoder.c_layer.7.biattention.key2.bias', 'bert.encoder.c_layer.7.biattention.value2.weight', 'bert.encoder.c_layer.7.biattention.value2.bias', 'bert.encoder.c_layer.7.biOutput.dense1.weight', 'bert.encoder.c_layer.7.biOutput.dense1.bias', 'bert.encoder.c_layer.7.biOutput.LayerNorm1.weight', 'bert.encoder.c_layer.7.biOutput.LayerNorm1.bias', 'bert.encoder.c_layer.7.biOutput.q_dense1.weight', 'bert.encoder.c_layer.7.biOutput.q_dense1.bias', 'bert.encoder.c_layer.7.biOutput.dense2.weight', 'bert.encoder.c_layer.7.biOutput.dense2.bias', 'bert.encoder.c_layer.7.biOutput.LayerNorm2.weight', 'bert.encoder.c_layer.7.biOutput.LayerNorm2.bias', 'bert.encoder.c_layer.7.biOutput.q_dense2.weight', 'bert.encoder.c_layer.7.biOutput.q_dense2.bias', 'bert.encoder.c_layer.7.v_intermediate.dense.weight', 'bert.encoder.c_layer.7.v_intermediate.dense.bias', 'bert.encoder.c_layer.7.v_output.dense.weight', 'bert.encoder.c_layer.7.v_output.dense.bias', 'bert.encoder.c_layer.7.v_output.LayerNorm.weight', 'bert.encoder.c_layer.7.v_output.LayerNorm.bias', 'bert.encoder.c_layer.7.t_intermediate.dense.weight', 'bert.encoder.c_layer.7.t_intermediate.dense.bias', 'bert.encoder.c_layer.7.t_output.dense.weight', 'bert.encoder.c_layer.7.t_output.dense.bias', 'bert.encoder.c_layer.7.t_output.LayerNorm.weight', 'bert.encoder.c_layer.7.t_output.LayerNorm.bias']
Num Iters: {'TASK3': 10000}
Batch size: {'TASK3': 1}
Traceback (most recent call last):
  File "eval_retrieval.py", line 275, in <module>
```
I get the tensor size error at the end. The command I am running is:

```
python eval_retrieval.py --bert_model bert-base-uncased --from_pretrained save/RetrievalFlickr30k_bert_base_6layer_6conect-pretrained/pytorch_model_19.bin --config_file config/bert_base_6layer_6conect.json --task 3 --split test --batch_size 1
```
I get the same error if I try the zero-shot evaluation as well:

```
python3 ./eval_retrieval.py --bert_model bert-base-uncased --from_pretrained save/bert_base_6_layer_6_connect/pytorch_model_9.bin --config_file config/bert_base_6layer_6conect.json --task 3 --split test --batch_size 1 --zero_shot
```
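Since the error smells like a config/checkpoint mismatch, I also tried to infer what depth the checkpoint was actually trained with. A hedged sketch (same path assumption as above):

```python
import re
import torch

state_dict = torch.load("save/bert_base_6_layer_6_connect/pytorch_model_9.bin",
                        map_location="cpu")

# Highest layer index present for each encoder stack in the checkpoint;
# a max index of 7 would mean the checkpoint holds 8 layers of that stack.
for stack in ("layer", "v_layer", "c_layer"):
    pattern = re.compile(rf"bert\.encoder\.{stack}\.(\d+)\.")
    idxs = [int(m.group(1)) for k in state_dict for m in [pattern.match(k)] if m]
    if idxs:
        print(f"{stack}: {max(idxs) + 1} layers in checkpoint")
```

If this reports 8 `v_layer`/`c_layer` entries, then `bert_base_6layer_6conect.json` silently drops two of each, which I suspect is related to the size mismatch, though I can't confirm it.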