We currently use two different tokenizers: one for the historical language and one for the modern language. However, it seems to be common (or the only possible way?) to use a single tokenizer if you want to create a Hugging Face model (`transformers.EncoderDecoderModel`) that ships together with a tokenizer.
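As a minimal sketch of the single-tokenizer pattern: the snippet below builds one `BertTokenizer` from a tiny made-up shared vocabulary (the words and the model sizes are purely illustrative, not our actual setup) and uses it for both the encoder input and the decoder input of an untrained `transformers.EncoderDecoderModel`. This is the shape the library expects; it does not show how to attach a second, separate tokenizer.

```python
import os
import tempfile

from transformers import (
    BertConfig,
    BertTokenizer,
    EncoderDecoderConfig,
    EncoderDecoderModel,
)

# Hypothetical tiny shared vocabulary covering both language stages.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "thou", "you", "art", "are"]

with tempfile.TemporaryDirectory() as tmp:
    vocab_file = os.path.join(tmp, "vocab.txt")
    with open(vocab_file, "w") as f:
        f.write("\n".join(vocab))

    # One tokenizer, shared by the encoder (historical) and decoder (modern) side.
    tokenizer = BertTokenizer(vocab_file)

    # Deliberately tiny, randomly initialized configs so the model builds instantly.
    enc_cfg = BertConfig(vocab_size=len(vocab), hidden_size=32,
                         num_hidden_layers=1, num_attention_heads=2,
                         intermediate_size=64)
    dec_cfg = BertConfig(vocab_size=len(vocab), hidden_size=32,
                         num_hidden_layers=1, num_attention_heads=2,
                         intermediate_size=64)

    # from_encoder_decoder_configs marks the decoder config as a decoder
    # with cross-attention automatically.
    config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
    model = EncoderDecoderModel(config=config)

    # The same tokenizer encodes both the historical source and the modern target.
    inputs = tokenizer("thou art", return_tensors="pt")
    outputs = model(input_ids=inputs["input_ids"],
                    decoder_input_ids=inputs["input_ids"])
```

Keeping a single tokenizer also keeps `save_pretrained` / `from_pretrained` round-trips simple, since one directory then holds one model and one tokenizer.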