cooelf / SemBERT

Semantics-aware BERT for Language Understanding (AAAI 2020)
https://arxiv.org/abs/1909.02209
MIT License

RuntimeError: size mismatch, m1: [32 x 78], m2: [778 x 778] #25

Closed drussellmrichie closed 2 years ago

drussellmrichie commented 2 years ago

Hi, @cooelf. Thanks again for providing this really interesting work! I am trying to get the code working so I can ultimately adapt it for some other work, and I'm running into some trouble. Hope you can clarify.

I have cloned the repository to my server, set up a Python 3.6 virtual environment, installed PyTorch (1.0.0), AllenNLP (0.8.1), and spacy==2.0.18 (I also ran pip install --pre allennlp-models), and downloaded the GLUE data. I then ran the following:

CUDA_VISIBLE_DEVICES=0 \
python run_classifier.py \
--data_dir glue_data/MNLI \
--task_name MNLI \
--train_batch_size 32 \
--max_seq_length 128 \
--bert_model bert-base-uncased \
--learning_rate 2e-5 \
--num_train_epochs 2 \
--do_train \
--do_eval \
--do_lower_case \
--max_num_aspect 3 \
--output_dir glue/MNLI_model_dir

(You'll notice that I replaced 'bert-wwm-uncased' with 'bert-base-uncased' because I got an error when I tried to use 'bert-wwm-uncased'. I know #8 asks about this as well but even with Google Translate I was not able to figure out how to solve the issue with 'bert-wwm-uncased'.)

I get the following output.

Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  FutureWarning)
11/03/2021 12:21:35 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
11/03/2021 12:21:35 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/richier/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
Traceback (most recent call last):
  File "run_classifier.py", line 1237, in <module>
    main()
  File "run_classifier.py", line 857, in main
    train_examples = processor.get_train_examples(args.data_dir)
  File "run_classifier.py", line 408, in get_train_examples
    self._read_tsv(os.path.join(data_dir, "train.tsv_tag_label")), "train")
  File "run_classifier.py", line 91, in _read_tsv
    with open(input_file, "r", encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'glue_data/SNLI/train.tsv_tag_label'
(py36_sembert) [richier@reslnapollo02 SemBERT]$ bash train.sh 
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  FutureWarning)
11/03/2021 12:21:54 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
11/03/2021 12:21:54 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/richier/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
{'contradiction': 0, 'entailment': 1, 'neutral': 2}
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-0
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] conceptual ##ly cream ski ##mming has two basic dimensions - product and geography . [SEP] product and geography are what make cream ski ##mming work . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 17158 2135 6949 8301 25057 2038 2048 3937 9646 1011 4031 1998 10505 1012 102 4031 1998 10505 2024 2054 2191 6949 8301 25057 2147 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: neutral (id = 2)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-1
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] you know during the season and i guess at at your level uh you lose them to the next level if if they decide to recall the the parent team the braves decide to call to recall a guy from triple a then a double a guy goes up to replace him and a single a guy goes up to replace him [SEP] you lose the things to the following level if the people recall . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 2017 2113 2076 1996 2161 1998 1045 3984 2012 2012 2115 2504 7910 2017 4558 2068 2000 1996 2279 2504 2065 2065 2027 5630 2000 9131 1996 1996 6687 2136 1996 13980 5630 2000 2655 2000 9131 1037 3124 2013 6420 1037 2059 1037 3313 1037 3124 3632 2039 2000 5672 2032 1998 1037 2309 1037 3124 3632 2039 2000 5672 2032 102 2017 4558 1996 2477 2000 1996 2206 2504 2065 1996 2111 9131 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-2
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] one of our number will carry out your instructions minute ##ly . [SEP] a member of my team will execute your orders with immense precision . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 2028 1997 2256 2193 2097 4287 2041 2115 8128 3371 2135 1012 102 1037 2266 1997 2026 2136 2097 15389 2115 4449 2007 14269 11718 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-3
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] how do you know ? all this is their information again . [SEP] this information belongs to them . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 2129 2079 2017 2113 1029 2035 2023 2003 2037 2592 2153 1012 102 2023 2592 7460 2000 2068 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-4
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] yeah i tell you what though if you go price some of those tennis shoes i can see why now you know they ' re getting up in the hundred dollar range [SEP] the tennis shoes have a range of prices . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 3398 1045 2425 2017 2054 2295 2065 2017 2175 3976 2070 1997 2216 5093 6007 1045 2064 2156 2339 2085 2017 2113 2027 1005 2128 2893 2039 1999 1996 3634 7922 2846 102 1996 5093 6007 2031 1037 2846 1997 7597 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: neutral (id = 2)
tokenizer vocab size:  22
11/03/2021 12:21:54 - INFO - pytorch_pretrained_bert.modeling -   loading archive file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz from cache at /home/richier/.pytorch_pretrained_bert/distributed_-1/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba
11/03/2021 12:21:54 - INFO - pytorch_pretrained_bert.modeling -   extracting archive file /home/richier/.pytorch_pretrained_bert/distributed_-1/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba to temp dir /tmp/tmppi9y9ey2
11/03/2021 12:22:00 - INFO - pytorch_pretrained_bert.modeling -   Model config {
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

11/03/2021 12:22:06 - INFO - pytorch_pretrained_bert.modeling -   Weights of BertForSequenceClassificationTag not initialized from pretrained model: ['cnn.char_cnn.weight', 'cnn.char_cnn.bias', 'tag_model.embed.tag_embeddings.weight', 'tag_model.embed.LayerNorm.weight', 'tag_model.embed.LayerNorm.bias', 'tag_model.fc.weight', 'tag_model.fc.bias', 'dense.weight', 'dense.bias', 'pool.weight', 'pool.bias', 'classifier.weight', 'classifier.bias']
11/03/2021 12:22:06 - INFO - pytorch_pretrained_bert.modeling -   Weights from pretrained model not used in BertForSequenceClassificationTag: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
11/03/2021 12:22:10 - INFO - __main__ -   ***** Running training *****
11/03/2021 12:22:10 - INFO - __main__ -     Num examples = 51
11/03/2021 12:22:10 - INFO - __main__ -     Batch size = 32
11/03/2021 12:22:10 - INFO - __main__ -     Num steps = 2
{'contradiction': 0, 'entailment': 1, 'neutral': 2}
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-0
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] the new rights are nice enough [SEP] everyone really likes the newest benefits [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 1996 2047 2916 2024 3835 2438 102 3071 2428 7777 1996 14751 6666 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: neutral (id = 2)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-1
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] this site includes a list of all award winners and a search ##able database of government executive articles . [SEP] the government executive articles housed on the website are not able to be searched . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 2023 2609 2950 1037 2862 1997 2035 2400 4791 1998 1037 3945 3085 7809 1997 2231 3237 4790 1012 102 1996 2231 3237 4790 7431 2006 1996 4037 2024 2025 2583 2000 2022 9022 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: contradiction (id = 0)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-2
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] uh i do n ' t know i i have mixed emotions about him uh sometimes i like him but at the same times i love to see somebody beat him [SEP] i like him for the most part , but would still enjoy seeing someone beat him . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 7910 1045 2079 1050 1005 1056 2113 1045 1045 2031 3816 6699 2055 2032 7910 2823 1045 2066 2032 2021 2012 1996 2168 2335 1045 2293 2000 2156 8307 3786 2032 102 1045 2066 2032 2005 1996 2087 2112 1010 2021 2052 2145 5959 3773 2619 3786 2032 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-3
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] yeah i i think my favorite restaurant is always been the one closest you know the closest as long as it ' s it meets the minimum criteria you know of good food [SEP] my favorite restaurants are always at least a hundred miles away from my house . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 3398 1045 1045 2228 2026 5440 4825 2003 2467 2042 1996 2028 7541 2017 2113 1996 7541 2004 2146 2004 2009 1005 1055 2009 6010 1996 6263 9181 2017 2113 1997 2204 2833 102 2026 5440 7884 2024 2467 2012 2560 1037 3634 2661 2185 2013 2026 2160 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: contradiction (id = 0)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-4
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] i do n ' t know um do you do a lot of camping [SEP] i know exactly . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 1045 2079 1050 1005 1056 2113 8529 2079 2017 2079 1037 2843 1997 13215 102 1045 2113 3599 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: contradiction (id = 0)
11/03/2021 12:22:10 - INFO - __main__ -   ***** Running evaluation *****
11/03/2021 12:22:10 - INFO - __main__ -     Num examples = 51
11/03/2021 12:22:10 - INFO - __main__ -     Batch size = 8
Iteration:   0%|                                                                                                                                                 | 0/2 [00:00<?, ?it/s]
Epoch:   0%|                                                                                                                                                     | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "run_classifier.py", line 1237, in <module>
    main()
  File "run_classifier.py", line 975, in main
    loss = model(input_ids, segment_ids, input_mask, start_end_idx, input_tag_ids,  label_ids)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/richier/SemBERT/pytorch_pretrained_bert/modeling.py", line 1026, in forward
    pooled_output = self.pool(first_token_tensor)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 67, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/functional.py", line 1352, in linear
    ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t())
RuntimeError: size mismatch, m1: [32 x 78], m2: [778 x 778] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:266

That seems to say that the shapes of two matrices don't match. Any idea what's causing that?
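For readers trying to decode the error: the traceback shows F.linear calling torch.addmm(bias, input, weight.t()), so m1 is the input batch and m2 is the transposed weight of the `pool` layer. A torch.nn.Linear weight has shape (out_features, in_features), and the last dimension of the input must equal in_features. The following is a minimal pure-Python sketch of that rule using the shapes from the error message; the helper name `linear_shape_error` is hypothetical and not part of SemBERT or PyTorch.

```python
def linear_shape_error(input_shape, weight_shape):
    """Return None if a linear layer can accept the input, else a message.

    Hypothetical helper for illustration only. weight_shape follows the
    (out_features, in_features) convention of torch.nn.Linear, which
    computes input @ weight.T + bias.
    """
    in_dim = input_shape[-1]
    out_features, in_features = weight_shape
    if in_dim != in_features:
        return (f"size mismatch: input has {in_dim} features, "
                f"layer expects {in_features}")
    return None


# Shapes taken from the error message: m1 = [32 x 78], m2 = [778 x 778].
print(linear_shape_error((32, 78), (778, 778)))

# An input whose feature dimension matches the layer passes the check.
print(linear_shape_error((32, 778), (778, 778)))
```

Under this reading, the layer was built for 778 input features but received a tensor with a different feature dimension, which points to a disagreement between how the layer is constructed and what is fed into it, rather than to the data files.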

rik-tak commented 2 years ago

@drussellmrichie I ran into a similar problem.

Reverting this commit solved it for me. The commit causes a tensor size mismatch. https://github.com/cooelf/SemBERT/commit/f849452f864b5dd47f94e2911cffc15e9f6a5a2a

drussellmrichie commented 2 years ago

@a1phamath That seems to have worked! Thank you so much!