Hi! If I recall correctly, we just need support for the multilingual tokenizer.
You can look at how the original multilingual tokenizer works and how it is used.
I think once encoding/decoding works for multiple languages it should be enough to just load the respective model.
Thanks!
Indeed, there wasn't much to do. The only change was handling the last token of the multilingual token vocab, which the Base64 decoder didn't like.
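For anyone hitting the same issue: a common way to handle a vocab entry that isn't valid Base64 is to try a strict decode and fall back to the raw bytes. This is just a sketch of that idea, not the actual code from this change; the helper name `decode_vocab_token` is hypothetical.

```python
import base64
import binascii

def decode_vocab_token(tok: str) -> bytes:
    """Decode a Base64-encoded vocab entry, falling back to raw bytes
    for entries that are not valid Base64 (hypothetical helper)."""
    try:
        return base64.b64decode(tok, validate=True)
    except (binascii.Error, ValueError):
        # e.g. a special token stored as plain text rather than Base64
        return tok.encode("utf-8")
```

With `validate=True`, non-alphabet characters raise instead of being silently discarded, so the fallback only triggers for entries that genuinely aren't Base64.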
I also added support for the new large-v3 model.
Nice work! What is needed to support the multilingual model? I might give it a try.