This dataset (referred to as generated_reviews_yn in scb-mt-en-th-2020) consists of English product reviews generated by CTRL and translated into Thai with the Google Translate API. Human annotators marked each translation as accepted or rejected (the correct field) based on its fluency and adequacy. The dataset can therefore be used for English-to-Thai translation quality estimation (binary label), machine translation, and sentiment analysis. For SEACrowd, we use only the data with correct = 1.
Dataloader name: generated_review_enth/generated_review_enth.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?generated_review_enth
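
The correct = 1 selection described above can be sketched as a simple filter. This is a minimal illustration and not the SEACrowd dataloader itself. It assumes that correct = 1 marks an accepted translation. The field names "en" and "th", the helper keep_accepted, and the toy rows are all hypothetical; only the correct label comes from the description.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]

# Toy rows for illustration only; they are NOT taken from the dataset.
# "correct" mirrors the label named in the description; "en" and "th"
# are hypothetical field names for the source and translated text.
ROWS: List[Row] = [
    {"en": "Great product.", "th": "<Thai translation 1>", "correct": 1},
    {"en": "Arrived late.", "th": "<Thai translation 2>", "correct": 0},
    {"en": "Works as described.", "th": "<Thai translation 3>", "correct": 1},
]


def keep_accepted(rows: List[Row]) -> List[Row]:
    """Return only the rows whose translation was accepted (correct == 1)."""
    return [row for row in rows if row["correct"] == 1]


accepted = keep_accepted(ROWS)
print(len(accepted), "of", len(ROWS), "toy rows kept")
```

Filtering this way leaves only pairs suitable for machine translation training. A quality estimation task would need the rejected rows as well, so it would use the unfiltered source data instead.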