tomlimi opened 3 months ago
cc @ArthurZucker
FYI @itazap
Model description
MyT5 is a sister model of ByT5 trained on morphologically-derived byte sequences (MYTEs). The model itself shares its implementation with T5ForConditionalGeneration, so only the addition of a custom MyT5Tokenizer is needed to run the model from Hugging Face.
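To illustrate the part the tokenizer changes: ByT5 encodes text as plain UTF-8 bytes offset by 3 (to reserve ids 0–2 for pad/eos/unk), while MYTE replaces those raw byte sequences with morphologically-derived ones. A minimal sketch of the ByT5-style baseline encoding (illustrative only, not the MyT5Tokenizer implementation):

```python
def byt5_style_encode(text: str) -> list[int]:
    """Plain byte-level encoding as in ByT5: each UTF-8 byte becomes
    a token id shifted by 3, since ids 0, 1, 2 are reserved for the
    pad, eos, and unk special tokens."""
    return [b + 3 for b in text.encode("utf-8")]


def byt5_style_decode(ids: list[int]) -> str:
    """Inverse of the encoding above (special tokens dropped)."""
    return bytes(i - 3 for i in ids if i >= 3).decode("utf-8")


# "hi" -> bytes 104, 105 -> token ids 107, 108
print(byt5_style_encode("hi"))
```

MyT5Tokenizer keeps this byte-level interface but remaps byte spans through the MYTE morphological mapping, which is why only the tokenizer (and not the model class) needs to be added.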
Open source status
Provide useful links for the implementation
The tokenizer implementation is available at: https://github.com/tomlimi/MYTE/tree/main/src/myt5
MYTEs and MyT5 training are described in a research paper: https://arxiv.org/pdf/2403.10691
Model card: https://huggingface.co/Tomlim/myt5-large