huggingface / transfer-learning-conv-ai

🦄 State-of-the-Art Conversational AI with Transfer Learning

TypeError: can only concatenate list (not "tuple") to list #42

Open ghost opened 5 years ago

ghost commented 5 years ago

Hi, I am using Python 3.6 and I run the following command, thanks:

python train.py --model_checkpoint pretrained_transformers/gpt --dataset_path datasets/personachat_self_original.json

INFO:/dev/ccn/generation/transfer-learning-conv-ai/utils.py:Tokenize and encode the dataset
Traceback (most recent call last):
  File "train.py", line 271, in <module>
    train()
  File "train.py", line 175, in train
    train_loader, val_loader, train_sampler, valid_sampler = get_data_loaders(args, tokenizer)
  File "train.py", line 77, in get_data_loaders
    personachat = get_dataset(tokenizer, args.dataset_path, args.dataset_cache)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 52, in get_dataset
    dataset = tokenize(dataset)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in tokenize
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in <genexpr>
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in tokenize
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in <genexpr>
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in tokenize
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in <genexpr>
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in tokenize
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in <genexpr>
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 48, in tokenize
    return tokenizer.convert_tokens_to_ids(tokenizer.tokenize(obj))
  File "/libs/anaconda3/envs/transformer36/lib/python3.6/site-packages/pytorch_transformers/tokenization_utils.py", line 490, in tokenize
    added_tokens = list(self.added_tokens_encoder.keys()) + self.all_special_tokens
  File "/libs/anaconda3/envs/transformer36/lib/python3.6/site-packages/pytorch_transformers/tokenization_utils.py", line 635, in all_special_tokens
    all_toks = all_toks + (attr_value if isinstance(attr_value, (list, tuple)) else [attr_value])
TypeError: can only concatenate list (not "tuple") to list
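For context, the last frame above builds a plain list and then concatenates the special-token attribute values onto it, so a tuple-valued attribute breaks the +. A tiny standalone illustration of that pattern (not the library's code, just the same list-plus-tuple concatenation):

added_tokens = ['<bos>', '<eos>']
extra = ('<speaker1>', '<speaker2>')  # tuple-valued special tokens
try:
    added_tokens + extra
except TypeError as e:
    print(e)  # can only concatenate list (not "tuple") to list
print(added_tokens + list(extra))  # ['<bos>', '<eos>', '<speaker1>', '<speaker2>']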

sshleifer commented 5 years ago

Where did you get pretrained_transformers/gpt?

model_checkpoint is supposed to be a valid argument to OpenAIGPT.from_pretrained
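For reference, a minimal sketch of what a resolvable checkpoint looks like, assuming pytorch_transformers is installed (the exact model class train.py uses may differ):

from pytorch_transformers import OpenAIGPTTokenizer, OpenAIGPTDoubleHeadsModel

# Either the hosted shortcut name or a local directory produced by
# save_pretrained() should work as --model_checkpoint.
checkpoint = "openai-gpt"  # or e.g. a local directory with saved weights
tokenizer = OpenAIGPTTokenizer.from_pretrained(checkpoint)
model = OpenAIGPTDoubleHeadsModel.from_pretrained(checkpoint)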


anandhperumal commented 5 years ago

If you pass any special tokens in tuple format, this issue will arise; pass them as a list instead (see the sketch below). Besides, this issue is fixed on master: Special Token issue
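A hedged sketch of that workaround, assuming the tokens are added through pytorch_transformers' add_special_tokens (the repository's own token setup may be shaped differently):

from pytorch_transformers import OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")

# Tuple form reproduces the TypeError inside all_special_tokens on
# older pytorch_transformers releases:
# tokenizer.add_special_tokens({'additional_special_tokens': ('<speaker1>', '<speaker2>')})

# List form works:
tokenizer.add_special_tokens({
    'bos_token': '<bos>',
    'eos_token': '<eos>',
    'pad_token': '<pad>',
    'additional_special_tokens': ['<speaker1>', '<speaker2>'],
})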