AI4Bharat / indicnlp_catalog

A collaborative catalog of NLP resources for Indic languages
https://ai4bharat.github.io/indicnlp_catalog

WMT23 QE datasets #241

Closed anoopkunchukuttan closed 6 months ago

anoopkunchukuttan commented 6 months ago

WMT23 released quality estimation (QE) datasets for five Indian languages in the English-to-Indic direction: en-mr, en-hi, en-gu, en-ta, and en-te. The references are also available, so these datasets can also be used for reference-based metrics.

Details here: WMT23 QE Task Report Dataset

For Marathi, post-edits and word-level error annotations are also available.
