Sean-Blank / AMRcoref


Tokenizer problem while training using train.py #5

Open TrinaDutta95 opened 2 years ago

TrinaDutta95 commented 2 years ago

I am using the preprocessed data to train and am facing the following issue. I understand that it is an issue with not finding the model. Is there any way this can be solved?

```
GPU available: True
CuDNN: True
Using GPU To Train...
GPU ID: 1
Log file path: ./ckpt/coref.amr.log
Model name './data/bert-base-cased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed './data/bert-base-cased' was a path or url but couldn't find any file associated to this path or url.
load train data
Traceback (most recent call last):
  File "train.py", line 165, in <module>
    train(args)
  File "train.py", line 34, in train
    train_data, dev_data, test_data, vocabs = make_data(args, tokenizer)
  File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 504, in make_data
    train_data = load_json(args.train_data, args, tokenizer)
  File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 156, in load_json
    token_bert_ids = get_bert_ids(tokens, args, tokenizer)
  File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 93, in get_bert_ids
    for char in tokenizer.tokenize(word):
AttributeError: 'NoneType' object has no attribute 'tokenize'
```
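For context, the `AttributeError` is a downstream symptom: loaders of the pytorch-pretrained-bert era return `None` (after printing the "was not found in model name list" warning) instead of raising when a model path cannot be resolved, so the crash only surfaces later at `tokenizer.tokenize(word)`. The sketch below is a minimal, hypothetical reproduction of that failure mode (the `from_pretrained` and `load_tokenizer_checked` helpers are stand-ins, not the repo's actual code), plus a fail-fast guard one could add before training:

```python
import os

# Abbreviated stand-in for the library's built-in model-name list.
KNOWN_MODELS = {"bert-base-cased", "bert-base-uncased"}

class FakeTokenizer:
    """Stand-in for a real BERT tokenizer."""
    def tokenize(self, word):
        return list(word)

def from_pretrained(name_or_path):
    """Mimics the old loader: returns None on failure instead of raising."""
    if name_or_path in KNOWN_MODELS or os.path.isdir(name_or_path):
        return FakeTokenizer()
    return None  # silent failure -> later 'NoneType' has no attribute 'tokenize'

def load_tokenizer_checked(name_or_path):
    """Fail fast with a clear message instead of crashing inside get_bert_ids()."""
    tok = from_pretrained(name_or_path)
    if tok is None:
        raise FileNotFoundError(
            f"Could not load a tokenizer from {name_or_path!r}; "
            "download the bert-base-cased files into that directory first."
        )
    return tok
```

In practice the fix is to make `'./data/bert-base-cased'` actually exist: either download the bert-base-cased vocab/weights into that directory, or point the model-name argument at the plain name `bert-base-cased` so the library fetches it itself.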