simon-ging / coot-videotext

COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Apache License 2.0

Transformers older version failing, hack included #47

Closed Maddy12 closed 2 years ago

Maddy12 commented 2 years ago

Describe the bug: Running precompute_text.py fails with a TypeError, because model_outputs is a tuple and cannot be indexed with a string key.

To Reproduce

python precompute_text.py youcook2 --cuda --metadata_name "all_${PERTURBATION}" --data_path ${DATA_PATH}

Screenshots

******************** Loading model bert-base-uncased from transformers
Running model on device cuda:0
Maximum input length 512
Loading meta file of 1 MB
Took 0.1 seconds for 1790.
Compute total_words and max_words: 100%|██████████| 1790/1790 [00:00<00:00, 247806.85it/s]
Total 115316 average 64.42 max 242
******************** Loading and testing dataset.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
key: --bv0V6ZjWI

text: ['[CLS] crush and chop the garlic [SEP]', 'add oil garlic and salt to a bowl [SEP]', 'mix the tomoatoes with the oil mixture [SEP]', 'chop the basil [SEP]', 'spread the oil mixture onto the dough [SEP]', 'place provolone cheese and mozzarella cheese onto the dough [SEP]', 'add the basil to the pizza [SEP]', 'place the tomatoes on the pizza [SEP]', 'sprinkle cheese onto the pizza [SEP]', 'bake the pizza in an oven [SEP]']

text_tokenized: [['[CLS]', 'crush', 'and', 'chop', 'the', 'garlic', '[SEP]'], ['add', 'oil', 'garlic', 'and', 'salt', 'to', 'a', 'bowl', '[SEP]'], ['mix', 'the', 'tom', '##oat', '##oes', 'with', 'the', 'oil', 'mixture', '[SEP]'], ['chop', 'the', 'basil', '[SEP]'], ['spread', 'the', 'oil', 'mixture', 'onto', 'the', 'dough', '[SEP]'], ['place', 'pro', '##vo', '##lone', 'cheese', 'and', 'mo', '##zza', '##rella', 'cheese', 'onto', 'the', 'dough', '[SEP]'], ['add', 'the', 'basil', 'to', 'the', 'pizza', '[SEP]'], ['place', 'the', 'tomatoes', 'on', 'the', 'pizza', '[SEP]'], ['sp', '##rin', '##kle', 'cheese', 'onto', 'the', 'pizza', '[SEP]'], ['ba', '##ke', 'the', 'pizza', 'in', 'an', 'oven', '[SEP]']]

tokens: tensor([  101, 10188,  1998, 24494,  1996, 20548,   102,  5587,  3514, 20548,
         1998,  5474,  2000,  1037,  4605,   102,  4666,  1996,  3419, 16503,
        22504,  2007,  1996,  3514,  8150,   102, 24494,  1996, 14732,   102,
         3659,  1996,  3514,  8150,  3031,  1996, 23126,   102,  2173,  4013,
         6767, 27165,  8808,  1998,  9587, 20715, 21835,  8808,  3031,  1996,
        23126,   102,  5587,  1996, 14732,  2000,  1996, 10733,   102,  2173,
         1996, 12851,  2006,  1996, 10733,   102, 11867,  6657, 19099,  8808,
         3031,  1996, 10733,   102,  8670,  3489,  1996, 10733,  1999,  2019,
        17428,   102])

sentence_lengths: [7, 9, 10, 4, 8, 14, 7, 7, 8, 8]

******************** Running the encoding.
Encoding text with model: bert-base-uncased, layers: [-2, -1], batch size: 1, workers: 0
compute text features:   0%|          | 0/1790 [00:00<?, ?it/s]
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Keyword arguments {'add_special_tokens': False} not recognized.
Traceback (most recent call last):
  File "models/coot-videotext/precompute_text.py", line 460, in <module>
    main()
  File "/home/.conda/envs/multimodal_py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "models/coot-videotext/precompute_text.py", line 210, in main
    hidden_states = model_outputs["hidden_states"]
TypeError: tuple indices must be integers or slices, not str
compute text features:   0%|               

The current output of model_outputs is a tuple:

(Pdb) model_outputs[1].shape
torch.Size([1, 768])
(Pdb) model_outputs[0].shape
torch.Size([1, 82, 768])

System Info: transformers==3.4.0

Additional context: I am fairly sure this is caused by the installed version of transformers. It would help to know which version is required, since requirements.txt does not specify one.

Maddy12 commented 2 years ago

My hack based on the current version of transformers is the following:

from transformers import AutoTokenizer, BertConfig, BertModel

# Request the hidden states explicitly through the model config.
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=args.model_path,
                                          add_special_tokens=args.add_special_tokens)
config = BertConfig.from_pretrained(model_name, output_hidden_states=True)
model = BertModel.from_pretrained(model_name, config=config)

and to no longer pass add_special_tokens to the tokenizer in TextConverterDataset when calling:

 sentence_tokens_str = self.tokenizer.tokenize(sentence) # , add_special_tokens=self.add_special_tokens)
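
For anyone adapting the failing line from the traceback (hidden_states = model_outputs["hidden_states"]), here is a minimal sketch of a lookup that accepts both output styles. The helper name extract_hidden_states is hypothetical and not part of the repository. The tuple position is an assumption: as far as I know, BertModel with output_hidden_states=True and no attention outputs returns (last_hidden_state, pooler_output, hidden_states), which matches the two shapes shown in the pdb session above. The stand-in values are placeholders so the snippet runs without transformers installed.

```python
def extract_hidden_states(model_outputs):
    """Return the per-layer hidden states from tuple-style or dict-style outputs."""
    if isinstance(model_outputs, tuple):
        # Assumption: tuple layout is
        # (last_hidden_state, pooler_output, hidden_states).
        if len(model_outputs) < 3:
            raise ValueError(
                "No hidden states in the outputs; was output_hidden_states=True set?"
            )
        return model_outputs[2]
    # Dict-style outputs support the string lookup used in precompute_text.py.
    return model_outputs["hidden_states"]


# Placeholder stand-ins for real model outputs (not real tensors).
tuple_outputs = ("last_hidden_state", "pooler_output", ("layer_0", "layer_1"))
dict_outputs = {"hidden_states": ("layer_0", "layer_1")}

hidden_from_tuple = extract_hidden_states(tuple_outputs)
hidden_from_dict = extract_hidden_states(dict_outputs)
```

In the script, this would replace only the string lookup; the layer selection that follows ([-2, -1] in the log above) would stay as it is.
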

Maddy12 commented 2 years ago

I now see the transformers version in requirements_frozen.txt. The hack may be useful if you want to update to a more recent version. Thanks!