huggingface / transfer-learning-conv-ai

🦄 State-of-the-Art Conversational AI with Transfer Learning

I can't replicate the training #92

Closed: albusdemens closed this issue 3 years ago

albusdemens commented 3 years ago

Hi, first of all, hats off for your work and for the nice blog post! I have an Ubuntu VM with 4 V100s on AWS; when I try to replicate your training (command: python train.py), I get the error below. Do you have suggestions on how to fix this?

Here is the error:

INFO:transformers.modeling_utils:loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/openai-gpt-pytorch_model.bin from cache at /home/ubuntu/.cache/torch/transformers/e45ee1afb14c5d77c946e66cb0fa70073a77882097a1a2cefd51fd24b172355e.e7ee3fcd07c695a4c9f31ca735502c090230d988de03202f7af9ebe1c3a4054c
INFO:transformers.tokenization_utils:Adding <bos> to the vocabulary
INFO:transformers.tokenization_utils:Assigning <bos> to the bos_token key of the tokenizer
INFO:transformers.tokenization_utils:Adding <eos> to the vocabulary
INFO:transformers.tokenization_utils:Assigning <eos> to the eos_token key of the tokenizer
INFO:transformers.tokenization_utils:Adding <pad> to the vocabulary
INFO:transformers.tokenization_utils:Assigning <pad> to the pad_token key of the tokenizer
INFO:transformers.tokenization_utils:Adding <speaker1> to the vocabulary
INFO:transformers.tokenization_utils:Adding <speaker2> to the vocabulary
INFO:transformers.tokenization_utils:Assigning ['<speaker1>', '<speaker2>'] to the additional_special_tokens key of the tokenizer
INFO:train.py:Prepare datasets
INFO:/home/ubuntu/transfer-learning-conv-ai/utils.py:Load tokenized dataset from cache at ./dataset_cache_OpenAIGPTTokenizer
INFO:train.py:Build inputs and labels
INFO:train.py:Pad inputs and convert to Tensor
Traceback (most recent call last):
  File "train.py", line 267, in <module>
    train()
  File "train.py", line 171, in train
    train_loader, val_loader, train_sampler, valid_sampler = get_data_loaders(args, tokenizer)
  File "train.py", line 98, in get_data_loaders
    dataset = pad_dataset(dataset, padding=tokenizer.convert_tokens_to_ids(SPECIAL_TOKENS[-1]))
  File "train.py", line 43, in pad_dataset
    max_l = max(len(x) for x in dataset["input_ids"])
ValueError: max() arg is an empty sequence
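
For context: this ValueError is Python's standard behavior when max() receives an empty iterable, so the traceback means dataset["input_ids"] contained no examples by the time pad_dataset ran. Since the log shows the dataset being loaded from cache (./dataset_cache_OpenAIGPTTokenizer), one plausible cause is a stale or incomplete cache file from an earlier, interrupted run. A minimal sketch of the failure mode, using a simplified stand-in for train.py's pad_dataset (the guard is hypothetical and not in the original code):

```python
def pad_dataset(dataset, padding=0):
    """Pad every sequence in the dataset dict to the longest input length.

    Simplified from train.py's pad_dataset for illustration only.
    """
    lengths = [len(x) for x in dataset["input_ids"]]
    if not lengths:
        # Hypothetical guard: the original calls max() directly on a
        # generator, which is exactly what raises
        # "ValueError: max() arg is an empty sequence" when the dataset
        # is empty.
        raise ValueError(
            "dataset['input_ids'] is empty; the tokenized dataset cache "
            "(e.g. ./dataset_cache_OpenAIGPTTokenizer) may be stale or empty"
        )
    max_l = max(lengths)
    return {
        name: [seq + [padding] * (max_l - len(seq)) for seq in seqs]
        for name, seqs in dataset.items()
    }

# Reproducing the failure mode in isolation:
try:
    pad_dataset({"input_ids": []})
except ValueError as err:
    print(err)  # clearer message than "max() arg is an empty sequence"
```

If dataset["input_ids"] really is empty after loading, deleting the cache file and letting utils.py rebuild the tokenized dataset on the next run of train.py would be a reasonable first thing to try.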