allenai / longformer

Longformer: The Long-Document Transformer
https://arxiv.org/abs/2004.05150
Apache License 2.0

Correct way of loading pretrained model led-base-16384 #205

Open kgarg8 opened 3 years ago

kgarg8 commented 3 years ago

This issue discusses the difference between the HuggingFace LED and the AllenAI LED implementations. What is the correct way to load AllenAI's pretrained model led-base-16384?

Approach 1, using HuggingFace LED (transformers v4.9.1):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/led-base-16384", gradient_checkpointing=True)
```

Approach 2, using AllenAI LED, with the transformers version suggested in requirements.txt (`git+http://github.com/ibeltagy/transformers.git@longformer_encoder_decoder#egg=transformers`):

```python
from transformers import AutoTokenizer
from longformer.longformer_encoder_decoder import LongformerEncoderDecoderForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained("allenai/led-base-16384", gradient_checkpointing=True)
```

Results: Approach 1 seems to work, but I am not sure whether it is correct, because it loads AllenAI's pretrained LED through the HuggingFace LED implementation, which has a different attention window size.
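One way to check whether the attention window actually differs is to inspect the config that the HuggingFace port ships with. A minimal sketch (assumes network access to the Hugging Face Hub; the override value shown in the comment is only illustrative, not a recommendation):

```python
from transformers import AutoConfig

# Inspect the per-layer local attention window sizes stored in the
# hosted config for allenai/led-base-16384.
config = AutoConfig.from_pretrained("allenai/led-base-16384")
print(config.attention_window)

# A different window can be requested as a config override at load time,
# e.g. (512 is just an example value):
# model = AutoModelForSeq2SeqLM.from_pretrained(
#     "allenai/led-base-16384", attention_window=512
# )
```

If the printed window sizes match what the AllenAI checkpoint was trained with, Approach 1 should not be silently changing the attention pattern.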

Approach 2 produces the error:

```
  File "/Users/krishna/opt/anaconda3/envs/CiteKP/lib/python3.8/site-packages/transformers/configuration_utils.py", line 353, in get_config_dict
    raise EnvironmentError
OSError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "LED_download.py", line 11, in <module>
    tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
  File "/Users/krishna/opt/anaconda3/envs/CiteKP/lib/python3.8/site-packages/transformers/tokenization_auto.py", line 209, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/Users/krishna/opt/anaconda3/envs/CiteKP/lib/python3.8/site-packages/transformers/configuration_auto.py", line 272, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/Users/krishna/opt/anaconda3/envs/CiteKP/lib/python3.8/site-packages/transformers/configuration_utils.py", line 362, in get_config_dict
    raise EnvironmentError(msg)
OSError: Can't load config for 'allenai/led-base-16384'. Make sure that:

- 'allenai/led-base-16384' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'allenai/led-base-16384' is the correct path to a directory containing a config.json file
```

I think the error occurs because the current allenai/led-base-16384 checkpoint on the Hub is not compatible with the transformers version (based on v3.1.0) pinned in requirements.txt.
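If the old fork's hub-resolution code is what fails, one possible workaround is to fetch the checkpoint files with a modern client first and then point the fork at the local directory. This is only a sketch: it assumes `huggingface_hub` is installed in an environment that can reach the Hub, and it would not fix any deeper incompatibility between the hosted HF-format checkpoint and the fork's model classes:

```python
from huggingface_hub import snapshot_download

# Fetch the repo files with a modern client. Here only the JSON files are
# pulled to keep the example light; drop allow_patterns to also download
# the model weights.
local_dir = snapshot_download("allenai/led-base-16384", allow_patterns=["*.json"])
print(local_dir)

# Then, in the environment with the forked transformers, load from the
# local path instead of the hub identifier:
# from longformer.longformer_encoder_decoder import (
#     LongformerEncoderDecoderForConditionalGeneration,
# )
# model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained(local_dir)
```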