huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

LongT5 Summarization Example Not Working #18158

Closed by ghost 2 years ago

ghost commented 2 years ago

System Info

Who can help?

@patrickvonplaten, @ydshieh, @sgugger

Reproduction

git clone --branch v4.16.2-release https://github.com/huggingface/transformers

This is the example from transformers/examples/pytorch/summarization; the only change is the value of --model_name_or_path.

python transformers/examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path google/long-t5-tglobal-base \
    --do_train \
    --do_eval \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate

Error:

[INFO|configuration_utils.py:644] 2022-07-16 13:23:14,077 >> loading configuration file https://huggingface.co/google/long-t5-tglobal-base/resolve/main/config.json from cache at /home/good/.cache/huggingface/transformers/1b9067139467923bb0ea7749ceb5694acb0950b479ad1ebe47d9014180af8c31.69c5bfb92a1a084ead5ef0d9c9c9f09bac4f07cfd875433aa8fab59199208a7f
Traceback (most recent call last):
  File "transformers/examples/pytorch/summarization/run_summarization.py", line 698, in <module>
    main()
  File "transformers/examples/pytorch/summarization/run_summarization.py", line 371, in main
    use_auth_token=True if model_args.use_auth_token else None,
  File "/home/good/anaconda3/envs/gpt1/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 632, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/home/good/anaconda3/envs/gpt1/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 347, in __getitem__
    raise KeyError(key)
KeyError: 'longt5'
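
The last two frames of the traceback boil down to a dictionary lookup on the model_type string read from config.json. The sketch below only illustrates that failure mode: the two dictionaries and the function name are hypothetical, heavily trimmed stand-ins, not the library's actual CONFIG_MAPPING or code.

```python
# Hypothetical stand-ins for the model-type registry of an older and a newer
# library version. The real mapping in transformers is much larger.
OLD_REGISTRY = {"t5": "T5Config", "bart": "BartConfig"}
NEW_REGISTRY = {**OLD_REGISTRY, "longt5": "LongT5Config"}


def resolve_config_class(config_dict, registry):
    """Look up the config class name for the model_type declared in config.json."""
    # A plain dict lookup: an unknown model_type surfaces as KeyError.
    return registry[config_dict["model_type"]]


# The checkpoint's config.json declares this model type (see the KeyError above).
longt5_config = {"model_type": "longt5"}

if __name__ == "__main__":
    try:
        resolve_config_class(longt5_config, OLD_REGISTRY)
    except KeyError as err:
        print(f"older registry: KeyError {err}")
    print("newer registry:", resolve_config_class(longt5_config, NEW_REGISTRY))
```

Under that reading, the checkpoint itself is fine; the installed library simply has no entry for the model type it names.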

Expected behavior

The LongT5 model from google/long-t5-tglobal-base should start training like a normal T5 model (e.g. T5-base).

whaleloops commented 2 years ago

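Updating transformers to 4.20.0 or later should solve this issue. The KeyError: 'longt5' means the installed version does not know the longt5 model type yet.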
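
A quick way to check whether an environment is new enough before launching the script is to compare the installed version against that threshold. This is a minimal sketch: the helper names are made up for illustration, the 4.20.0 cutoff is the one given in this comment, and pre-release suffixes such as ".dev0" are simply ignored.

```python
import re

# Assumption taken from this thread: LongT5 support starts at 4.20.0.
MIN_LONGT5_VERSION = (4, 20, 0)


def parse_version(version):
    """Return the leading (major, minor, patch) integers of a version string."""
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad short versions such as "4.20" to three components.
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def supports_longt5(version):
    """True if the given transformers version is at least MIN_LONGT5_VERSION."""
    return parse_version(version) >= MIN_LONGT5_VERSION


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        import transformers
    except ImportError:
        return None
    return transformers.__version__


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    elif supports_longt5(version):
        print(f"transformers {version} should recognise the 'longt5' model type")
    else:
        print(f"transformers {version} is older than 4.20.0; upgrade it first")
```

If the check fails, upgrading is typically done with pip install --upgrade "transformers>=4.20.0". Note that the traceback shows the pip-installed package under site-packages being used, so cloning a different branch of the repository does not change which library version is imported.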

Also, according to the paper, I don't think you need --source_prefix "summarize: ".

ghost commented 2 years ago

@whaleloops It works, thanks.