Closed mozharovsky closed 4 years ago
Next, we're going to rework (de)serialization of the pre-trained tokenizers across the library. More details to come.
Summary
This PR updates 🤗/transformers to version 3.4.0 and restructures the tokenization block around fast tokenizers.
Patch Notes
TransformerTokenizerModule -> ByteLevelBPETokenizerModule
TransformerTokenizerFast -> ByteLevelBPETokenizerFast
transformer-tokenizer-fast -> byte-level-bpe-tokenizer
CodeTokenizerFast -> CodeBBPETokenizerFast
CodeTokenizerModule -> CodeBBPETokenizerModule
code-tokenizer-fast -> code-bbpe-tokenizer
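
For downstream code that still uses the old names, the renames listed above can be applied mechanically. The snippet below is only a rough sketch of such a migration: the `RENAMES` table is copied from the patch notes, while the `migrate` helper and its whole-identifier matching rule are assumptions made for illustration and are not part of this PR.

```python
import re

# Old name -> new name, copied from the patch notes.
RENAMES = {
    "TransformerTokenizerModule": "ByteLevelBPETokenizerModule",
    "TransformerTokenizerFast": "ByteLevelBPETokenizerFast",
    "transformer-tokenizer-fast": "byte-level-bpe-tokenizer",
    "CodeTokenizerFast": "CodeBBPETokenizerFast",
    "CodeTokenizerModule": "CodeBBPETokenizerModule",
    "code-tokenizer-fast": "code-bbpe-tokenizer",
}

# Match whole names only: a name must not be preceded or followed by a
# word character or a hyphen, so longer identifiers that merely contain
# an old name are left alone.
_PATTERN = re.compile(
    r"(?<![\w-])("
    + "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
    + r")(?![\w-])"
)


def migrate(source: str) -> str:
    """Hypothetical helper: replace every old name in `source` with its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)
```

Running `migrate` over a file's text rewrites both class names and the hyphenated registry-style names. Because none of the new names contains an old name as a whole token, running it twice gives the same result as running it once.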