Open ghost opened 5 years ago
Where did you get pretrained_transformers/gpt?
model_checkpoint is supposed to be a valid argument to OpenAIGPT.from_pretrained
On Mon, Oct 21, 2019 at 8:19 PM Sam Shleifer notifications@github.com wrote:
> Where did you get pretrained_transformers/gpt?
> model_checkpoint is supposed to be a valid argument to OpenAIGPT.from_pretrained
If you pass the special tokens as a tuple, this issue will arise; pass them as a list instead. Besides, this issue is fixed on master (see the Special Token issue).
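The tuple-vs-list distinction matters because of how the tokenizer assembles its special-token list. A minimal sketch of the failing concatenation, reusing the exact expression from the all_special_tokens frame in the traceback below (the token names here are only illustrative):

```python
# Minimal reproduction of the failure inside pytorch_transformers'
# all_special_tokens property: a plain list is concatenated with the
# stored special-token value, so a tuple-valued entry raises TypeError.

all_toks = ["<bos>", "<eos>"]

# Special tokens stored as a tuple -> the error from the report:
attr_value = ("<speaker1>", "<speaker2>")
try:
    all_toks + (attr_value if isinstance(attr_value, (list, tuple)) else [attr_value])
except TypeError as err:
    print(err)  # can only concatenate list (not "tuple") to list

# The same tokens stored as a list concatenate fine:
attr_value = ["<speaker1>", "<speaker2>"]
all_toks = all_toks + attr_value
print(all_toks)  # ['<bos>', '<eos>', '<speaker1>', '<speaker2>']
```

Hence the workaround: wherever the training script defines its special tokens, make sure the collection is a list, not a tuple, before handing it to the tokenizer.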
Hi, I am using Python 3.6, and I run:

    python train.py --model_checkpoint pretrained_transformers/gpt --dataset_path datasets/personachat_self_original.json

Thanks.
INFO:/dev/ccn/generation/transfer-learning-conv-ai/utils.py:Tokenize and encode the dataset
Traceback (most recent call last):
  File "train.py", line 271, in <module>
    train()
  File "train.py", line 175, in train
    train_loader, val_loader, train_sampler, valid_sampler = get_data_loaders(args, tokenizer)
  File "train.py", line 77, in get_data_loaders
    personachat = get_dataset(tokenizer, args.dataset_path, args.dataset_cache)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 52, in get_dataset
    dataset = tokenize(dataset)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in tokenize
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in <genexpr>
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in tokenize
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in <genexpr>
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in tokenize
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 50, in <genexpr>
    return dict((n, tokenize(o)) for n, o in obj.items())
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in tokenize
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 51, in <genexpr>
    return list(tokenize(o) for o in obj)
  File "/dev/ccn/generation/transfer-learning-conv-ai/utils.py", line 48, in tokenize
    return tokenizer.convert_tokens_to_ids(tokenizer.tokenize(obj))
  File "/libs/anaconda3/envs/transformer36/lib/python3.6/site-packages/pytorch_transformers/tokenization_utils.py", line 490, in tokenize
    added_tokens = list(self.added_tokens_encoder.keys()) + self.all_special_tokens
  File "/libs/anaconda3/envs/transformer36/lib/python3.6/site-packages/pytorch_transformers/tokenization_utils.py", line 635, in all_special_tokens
    all_toks = all_toks + (attr_value if isinstance(attr_value, (list, tuple)) else [attr_value])
TypeError: can only concatenate list (not "tuple") to list