AI4Bharat / IndicTrans2

Translation models for 22 scheduled languages of India
https://ai4bharat.iitm.ac.in/indic-trans2
MIT License

Change in IndicTransTokenizer to IndicTransToolkit #96

Closed sofia100 closed 2 months ago

sofia100 commented 2 months ago

Hi there. As per this update, https://github.com/VarunGumma/IndicTransToolkit?tab=readme-ov-file#minor-update-v102, the tokenizer has changed. Could you please describe the changes needed to import the tokenizer?

VarunGumma commented 2 months ago

The tokenizer is now available on HF (Hugging Face) alongside the models. You can visit the model page for a detailed example of how to import the tokenizer from HF. More information is also available in the README of IndicTransToolkit.
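
For readers landing here, a minimal sketch of what loading the tokenizer from HF might look like. It is not copied from the model page: the checkpoint name `ai4bharat/indictrans2-en-indic-1B`, the helper `load_tokenizer`, and the need for `trust_remote_code=True` are assumptions for illustration, so please check the model page for the authoritative snippet.

```python
# Sketch only: the checkpoint name below and the use of trust_remote_code
# are assumptions; the model page has the authoritative example.
MODEL_NAME = "ai4bharat/indictrans2-en-indic-1B"  # assumed checkpoint name


def load_tokenizer(model_name: str = MODEL_NAME):
    """Load the tokenizer published with the model on the Hugging Face Hub.

    `load_tokenizer` is a hypothetical helper, not part of any library.
    """
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoTokenizer

    # trust_remote_code=True allows transformers to run tokenizer code hosted
    # in the model repository (assumed to be required for this model).
    return AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)


# Usage (downloads files from the Hub on first call):
#   tokenizer = load_tokenizer()
```

With this approach the tokenizer comes from the same repository as the model weights, so the two are versioned together instead of being imported from a separate tokenizer package.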