Note that openai-gpt has a maximum sequence length of 512. See n_positions in the config here: https://huggingface.co/openai-gpt/blob/main/config.json.
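You can also verify this limit programmatically; a quick illustrative check using AutoConfig:
from transformers import AutoConfig
config = AutoConfig.from_pretrained("openai-gpt")
print(config.n_positions)  # 512 -> the model only has 512 position embeddings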
The run_clm.py script, however, sets the block size to 1024 by default. To fix your bug you should run:
python transformers/examples/pytorch/language-modeling/run_clm.py --model_name_or_path openai-gpt --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 --do_train --do_eval --output_dir /tmp/test-clm --per_device_train_batch_size 2 --gradient_accumulation_steps 4 --block_size 512
Actually, it's weird that you get this error since:
from transformers import OpenAIGPTTokenizer
tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
tokenizer.model_max_length # prints 512
=> so the block size should have been set to 512 automatically
There is a small bug in run_clm.py: a line is not properly indented. Fixing it now.
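To illustrate the kind of bug this is, here is a minimal, hypothetical sketch (not the actual run_clm.py source): if the fallback assignment sits outside the inner if that is meant to guard it, the 1024 cap is applied unconditionally and overrides the tokenizer's model_max_length of 512.
# Hypothetical illustration of a mis-indented fallback; not the actual run_clm.py code.
model_max_length = 512        # what OpenAIGPTTokenizer reports for openai-gpt
requested_block_size = None   # user did not pass --block_size

if requested_block_size is None:
    block_size = model_max_length
    if block_size > 1024:
        print("model_max_length is very large, capping at 1024")
    block_size = 1024  # bug: runs unconditionally; should be indented under the if above
else:
    block_size = min(requested_block_size, model_max_length)

print(block_size)  # 1024, even though the model only supports 512 positions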
Environment info
transformers version: 4.7.0.dev0
Who can help
Information
Model I am using (Bert, XLNet ...): openai-gpt
The problem arises when using: the official example scripts (run_clm.py)
The task I am working on is: causal language modeling on wikitext-2-raw-v1 (an official dataset)
To reproduce
Steps to reproduce the behaviour: run the run_clm.py command above without the --block_size 512 flag.
Expected behaviour
There should not be a mismatch in tensor shapes. Apparently, the maximum sequence lengths do not match: the position embeddings expect at most 512 tokens, but the inputs are 1024 tokens long.