AI4Bharat / IndicTrans2

Translation models for 22 scheduled languages of India
https://ai4bharat.iitm.ac.in/indic-trans2
MIT License
214 stars, 59 forks

Data formatting #52

Closed: datha29 closed this issue 5 months ago

datha29 commented 5 months ago

Hi Team

I am looking at fine-tuning the model for our use case. I would like to know whether metadata is required for fine-tuning. Also, what is the minimum number of sentence pairs expected for fine-tuning? Thirdly, since I am fine-tuning the model across 12 Indian languages, do we need translations of each English text into all of those languages?

prajdabre commented 5 months ago
  1. No metadata is required.
  2. It depends on your use case. More pairs generally give better results.
  3. No.

Side note: All this information is in the paper.
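
To illustrate points 1 and 3, here is a minimal sketch of preparing plain parallel text with no metadata, where each language pair has its own independent set of sentences. The layout (one directory per language pair holding two line-aligned files), the `train.<lang>` file names, the function name `write_parallel_files`, and the sample sentences are all assumptions made for illustration; check the repository README for the exact layout the fine-tuning scripts expect.

```python
from collections import defaultdict
from pathlib import Path


def write_parallel_files(pairs, out_dir, src_lang="eng_Latn"):
    """Write line-aligned source/target files, one directory per language pair.

    `pairs` is an iterable of (tgt_lang, src_text, tgt_text) tuples.
    The English sentences may differ between target languages, so the
    data does not need to be multi-way parallel.
    Returns a dict mapping each target language to its number of pairs.
    """
    grouped = defaultdict(list)
    for tgt_lang, src_text, tgt_text in pairs:
        # Collapse whitespace and newlines so each example stays on one line.
        src = " ".join(src_text.split())
        tgt = " ".join(tgt_text.split())
        if src and tgt:  # skip pairs with an empty side
            grouped[tgt_lang].append((src, tgt))

    counts = {}
    for tgt_lang, rows in grouped.items():
        # Assumed layout: <out_dir>/<src>-<tgt>/train.<src> and train.<tgt>
        pair_dir = Path(out_dir) / f"{src_lang}-{tgt_lang}"
        pair_dir.mkdir(parents=True, exist_ok=True)
        src_lines = "\n".join(s for s, _ in rows) + "\n"
        tgt_lines = "\n".join(t for _, t in rows) + "\n"
        (pair_dir / f"train.{src_lang}").write_text(src_lines, encoding="utf-8")
        (pair_dir / f"train.{tgt_lang}").write_text(tgt_lines, encoding="utf-8")
        counts[tgt_lang] = len(rows)
    return counts
```

For example, calling `write_parallel_files([("hin_Deva", "Hello", "नमस्ते"), ("tam_Taml", "Thank you", "நன்றி")], "finetune_data")` would create one directory for English-Hindi and one for English-Tamil, each containing only the sentences available for that pair.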