Closed — sarahwie closed this issue 4 years ago
Hi @sarahwie, thank you for reporting this. Support for tokenizers==0.8.0 was added in transformers 3.0.0, which we just released.
Version info: transformers==3.0.0 tokenizers==0.8.0
When I try to save a trained BertWordPieceTokenizer with
tokenizer.save("./", "data")
I get this error:
/usr/local/lib/python3.6/dist-packages/tokenizers/implementations/base_tokenizer.py in save(self, path, pretty)
    330             A path to the destination Tokenizer file
    331         """
--> 332         return self._tokenizer.save(path, pretty)
    333
    334     def to_str(self, pretty: bool = False):
TypeError:
Could you help, please?
tokenizer.save("./tokenizer.json")
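The one-line call above reflects the API change in tokenizers 0.8.0: save() now serializes the whole tokenizer to a single JSON file and takes one file path, not a (directory, name) pair — hence the TypeError from the two-argument call. A minimal sketch of the new usage, assuming the old directory-style output moved to save_model() (the tiny temp-file corpus and vocab size are placeholders, not real training data):

```python
import os
import tempfile

from tokenizers import BertWordPieceTokenizer

# Throwaway corpus standing in for real training data.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello world hello tokenizer hello world\n")
    corpus = f.name

tokenizer = BertWordPieceTokenizer()
tokenizer.train(files=[corpus], vocab_size=100)

# New-style (tokenizers >= 0.8.0): one self-contained JSON file.
tokenizer.save("tokenizer.json")

# Old-style model files (vocab.txt for WordPiece) written into a directory;
# assumed to live under save_model() after the rename.
tokenizer.save_model(".")

os.remove(corpus)
```

The JSON file carries the full pipeline (normalizer, pre-tokenizer, model, post-processor), so it is the preferred format going forward; save_model() only writes the model's vocabulary files.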
Hi @n1t0,
I'm hitting the same issue with tokenizer.save when saving a trained tokenizer:
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files='/content/drive/My Drive/Project De Novo/Molecule Transformer/pubchem/shard_00.txt',
    vocab_size=52_000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
Any advice on how to fix this issue?
Version info:
transformers==2.9.1
tokenizers==0.8.0.dev2
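The same signature change likely applies here, since tokenizers==0.8.0.dev2 already has the single-file save(). A hedged sketch of the fix for ByteLevelBPETokenizer, using a throwaway temp-file corpus and a small placeholder vocab size instead of the real pubchem shard:

```python
import os
import tempfile

from tokenizers import ByteLevelBPETokenizer

# Tiny stand-in corpus (byte-level BPE needs vocab_size >= ~256 base bytes).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("CCO CCN c1ccccc1\n" * 10)
    corpus = f.name

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=[corpus],
    vocab_size=300,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

# New API: save() takes a single file path for the full tokenizer JSON.
tokenizer.save("bpe-tokenizer.json")

# The old two-file output (vocab.json + merges.txt), assumed to be
# available via save_model() after the rename.
tokenizer.save_model(".")

os.remove(corpus)
```

The vocab.json/merges.txt pair is what older transformers versions (e.g. 2.9.1) expect when loading a tokenizer from a directory, so keeping save_model() output around may be useful until you upgrade.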