AI4Bharat / IndicTrans2

Translation models for 22 scheduled languages of India
https://ai4bharat.iitm.ac.in/indic-trans2
MIT License
214 stars, 59 forks

ctranslate related issue #59

Closed pr509 closed 5 months ago

pr509 commented 5 months ago

ct2-fairseq-converter --model_path model.pt --data_dir data-bin/ --output_dir ct2_model

In this command, what should I give in place of --data_dir? Can you please explain?

PranjalChitale commented 5 months ago

--data_dir: path to the directory that contains the respective fairseq dictionaries (the final_bin directory).
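
As a rough illustration of the point above, here is a minimal Python sketch that checks a directory for fairseq dictionaries before assembling the converter command. The helper name `build_ct2_command` is hypothetical, and the default language tags `SRC` and `TGT` (and therefore the file names `dict.SRC.txt` and `dict.TGT.txt`) are assumptions about how the dictionaries in final_bin are named; adjust them to match your own files. The `--source_lang` and `--target_lang` flags are standard ct2-fairseq-converter options, but whether you need them depends on your setup.

```python
from pathlib import Path


def build_ct2_command(model_path, data_dir, output_dir, src="SRC", tgt="TGT"):
    """Return the ct2-fairseq-converter argument list after checking data_dir.

    Hypothetical helper: the default tags SRC/TGT are assumptions, not
    something confirmed in this thread.
    """
    data_dir = Path(data_dir)
    # fairseq preprocessing writes one dictionary per side: dict.<lang>.txt
    missing = [
        name
        for name in (f"dict.{src}.txt", f"dict.{tgt}.txt")
        if not (data_dir / name).is_file()
    ]
    if missing:
        raise FileNotFoundError(f"{data_dir} is missing: {', '.join(missing)}")
    return [
        "ct2-fairseq-converter",
        "--model_path", str(model_path),
        "--data_dir", str(data_dir),
        "--source_lang", src,
        "--target_lang", tgt,
        "--output_dir", str(output_dir),
    ]
```

Calling `build_ct2_command("model.pt", "final_bin", "ct2_model")` returns a list you can pass to `subprocess.run`, or raises `FileNotFoundError` if the directory does not hold the expected dictionaries.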

pr509 commented 5 months ago

Thanks. Just one more thing I want to confirm, sir: can a fine-tuned fairseq model from IndicTrans1 also be converted to CTranslate2? Is that possible?

PranjalChitale commented 5 months ago

Yes, it is possible to port fine-tuned variants of IndicTrans1 to CT2 as well. However, the IndicTrans1 model has now been deprecated.

Therefore, we would encourage you to consider using or fine-tuning the distilled variants of IndicTrans2 if you are looking for a lightweight alternative: the IndicTrans2 distilled models have fewer parameters (~200M) than IndicTrans1 (~434M) and are superior to IndicTrans1 in terms of performance.

pr509 commented 5 months ago

Thanks for your help.