Closed: leezu closed this issue 3 years ago
I think both use tokenizers, but the legacy one follows the API of an older version of HF tokenizers.
transformers.PreTrainedTokenizer is the "Base class for all slow tokenizers," whereas transformers.PreTrainedTokenizerFast is the "Base class for all fast tokenizers (wrapping HuggingFace tokenizers library)," so only the latter uses HF tokenizers.
We call the tokenizers package directly in the implementation; LegacyHuggingFaceTokenizer targets an older version of its API instead of the current https://github.com/huggingface/tokenizers.