SamuelCahyawijaya commented 5 months ago

Dataloader name: duolingo_staple_2020/duolingo_staple_2020.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?duolingo_staple_2020

Dataset	duolingo_staple_2020
Description	This dataset is provided by Duolingo for their Simultaneous Translation and Paraphrase for Language Education (STAPLE) shared task in 2020. It contains English prompts and corresponding sets of plausible translations in five other languages, including Vietnamese. Each prompt is provided with a baseline automatic reference translation from Amazon, as well as some accepted translations with corresponding user response rates used for task scoring.
Subsets	aws_baseline, gold
Languages	eng, vie
Tasks	Machine Translation, Paraphrasing
License	Creative Commons Attribution Non Commercial 4.0 (cc-by-nc-4.0)
Homepage	https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/38OJR6
HF URL	-
Paper URL	https://aclanthology.org/2020.ngt-1.28.pdf

akhdanfadh commented 5 months ago

self-assign

sabilmakbar commented 4 months ago

SEACrowd / seacrowd-datahub