yixinL7 / BRIO

ACL 2022: BRIO: Bringing Order to Abstractive Summarization

HuggingFace Tokenizer Loading #1

Closed griff4692 closed 2 years ago

griff4692 commented 2 years ago

Hi Yixin - thanks for sharing the repo and putting the pre-trained models on HuggingFace.

Unfortunately, though, I'm having trouble loading the tokenizer for CNN/DM. Any thoughts / suggestions? Thanks


```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/griffin/sum/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 546, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/griffin/sum/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1788, in from_pretrained
    return cls._from_pretrained(
  File "/home/griffin/sum/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1923, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/griffin/sum/lib/python3.8/site-packages/transformers/models/bart/tokenization_bart_fast.py", line 171, in __init__
    super().__init__(
  File "/home/griffin/sum/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 110, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: No such file or directory (os error 2)
```

griff4692 commented 2 years ago

Ah - `AutoTokenizer.from_pretrained(...)` fails, but `BartTokenizer.from_pretrained(...)` works. If possible, for the benefit of others, could you add a note on HuggingFace or change the "Use with Transformers" snippet? Thanks!
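
For anyone landing here later, a minimal sketch of the workaround described above: try `AutoTokenizer` first and fall back to `BartTokenizer` if it raises. The helper names `load_with_fallback` and `load_brio_tokenizer` are hypothetical, and the checkpoint name `Yale-LILY/brio-cnndm-uncased` is an assumption (it does not appear in this thread), so substitute whichever model ID you are actually using.

```python
def load_with_fallback(model_name, loaders):
    """Return the result of the first loader that succeeds.

    `loaders` is an ordered list of callables that each take a model name.
    The traceback above ends in a bare `Exception`, so we catch broadly here.
    """
    errors = []
    for loader in loaders:
        try:
            return loader(model_name)
        except Exception as err:  # e.g. "No such file or directory (os error 2)"
            errors.append(err)
    raise RuntimeError(f"all loaders failed for {model_name!r}: {errors}")


def load_brio_tokenizer(model_name="Yale-LILY/brio-cnndm-uncased"):
    """Hypothetical helper; the default model ID is an assumption."""
    # Imported lazily so the fallback helper above is usable without transformers.
    from transformers import AutoTokenizer, BartTokenizer

    return load_with_fallback(
        model_name,
        [AutoTokenizer.from_pretrained, BartTokenizer.from_pretrained],
    )
```

Calling `tokenizer = load_brio_tokenizer()` should then return whichever tokenizer class loads successfully. If you only need the fix and not the fallback logic, calling `BartTokenizer.from_pretrained(...)` directly, as noted above, is simpler.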