SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for XStoryCloze #69

Closed by SamuelCahyawijaya 8 months ago

SamuelCahyawijaya commented 12 months ago

Dataloader name: xstorycloze/xstorycloze.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?xstorycloze

Dataset: xstorycloze
Description: XStoryCloze consists of the professionally translated version of the English StoryCloze dataset (Spring 2016 version) to 10 non-English languages. This dataset is released by Meta AI.
Subsets: id, my
Languages: ind, mya
Tasks: Commonsense Reasoning
License: Creative Commons Attribution Share Alike 4.0 (cc-by-sa-4.0)
Homepage: https://huggingface.co/datasets/juletxara/xstory_cloze
HF URL: https://huggingface.co/datasets/juletxara/xstory_cloze
Paper URL: https://aclanthology.org/2022.emnlp-main.616
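
For anyone picking this up, here is a rough sketch of how one raw row might be turned into a commonsense-reasoning record for the two subsets listed above. It is only an illustration under assumptions: the raw field names (`story_id`, `input_sentence_1` to `input_sentence_4`, `sentence_quiz1`, `sentence_quiz2`, `answer_right_ending`) are assumed from the Hugging Face dataset card and should be checked against the actual data, while `SUBSET_TO_LANG`, `to_qa_record`, the output layout, and the example row are hypothetical and not the SEACrowd schema.

```python
from typing import Dict, List

# Subset codes and language codes as listed in the issue metadata.
SUBSET_TO_LANG: Dict[str, str] = {"id": "ind", "my": "mya"}


def to_qa_record(row: Dict[str, object], subset: str) -> Dict[str, object]:
    """Map one raw row to a QA-style record (hypothetical layout, assumed field names)."""
    if subset not in SUBSET_TO_LANG:
        raise ValueError(f"unsupported subset: {subset!r}")
    # The four story sentences form the context.
    context = " ".join(str(row[f"input_sentence_{i}"]) for i in range(1, 5))
    # The two candidate endings are the answer choices.
    choices: List[str] = [str(row["sentence_quiz1"]), str(row["sentence_quiz2"])]
    # Assumption: answer_right_ending is 1-based (1 or 2).
    answer = choices[int(row["answer_right_ending"]) - 1]
    return {
        "id": str(row["story_id"]),
        "language": SUBSET_TO_LANG[subset],
        "context": context,
        "choices": choices,
        "answer": [answer],
    }


# Invented English placeholder row, for illustration only (not real dataset content).
example_row = {
    "story_id": "example-0001",
    "input_sentence_1": "Rina stayed up late studying.",
    "input_sentence_2": "She had an exam in the morning.",
    "input_sentence_3": "At midnight she could not keep her eyes open.",
    "input_sentence_4": "She decided to go to bed.",
    "sentence_quiz1": "She ran a marathon that night.",
    "sentence_quiz2": "She felt rested the next morning.",
    "answer_right_ending": 2,
}

record = to_qa_record(example_row, "id")
```

Treating the four sentences as context and the two endings as choices keeps the record close to a multiple-choice QA shape, which seems a natural fit for the Commonsense Reasoning task listed above.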

williamnixon20 commented 12 months ago

self-assign

akhdanfadh commented 12 months ago

self-assign

sabilmakbar commented 11 months ago

Hi @SamuelCahyawijaya, can we remove the Language Modelling task from this one? There's no explicit mention of a LangMod use case for this dataset.

SamuelCahyawijaya commented 11 months ago

@sabilmakbar: sorry for the late reply. Yes, exactly, we can remove the LangMod task from the list. Sorry for the confusion.

sabilmakbar commented 10 months ago

Hi @SamuelCahyawijaya, can we revise this to COMMONSENSE_REASONING?