SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for Okapi m-Hellaswag #478

Open SamuelCahyawijaya opened 4 months ago

SamuelCahyawijaya commented 4 months ago

Dataloader name: okapi_m_hellaswag/okapi_m_hellaswag.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?okapi_m_hellaswag

| Dataset | okapi_m_hellaswag |
| --- | --- |
| Description | m-Hellaswag is a multilingual version of Hellaswag, a commonsense inference challenge dataset. Though its questions are trivial for humans (>95% accuracy), state-of-the-art models struggle ( |
| Subsets | - |
| Languages | ind, vie |
| Tasks | Question Answering |
| License | Creative Commons Attribution Non Commercial 4.0 (cc-by-nc-4.0) |
| Homepage | http://nlp.uoregon.edu/download/okapi-eval/datasets/ |
| HF URL | https://huggingface.co/datasets/jon-tow/okapi_hellaswag |
| Paper URL | https://arxiv.org/abs/2307.16039 |

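
For whoever picks this up, here is a rough sketch of the core mapping step such a dataloader would likely need: turning one Hellaswag-style row into a multiple-choice Question Answering record. Everything in it is an assumption for illustration, not the actual loader: the input field names (`ind`, `ctx`, `endings`, `label`) follow the original English Hellaswag layout and may differ in the Okapi release, the output keys are only loosely modelled on a QA schema, the helper name `to_qa_example` is hypothetical, and the sample row is made up.

```python
from typing import Any, Dict


def to_qa_example(row: Dict[str, Any], idx: int) -> Dict[str, Any]:
    """Map one Hellaswag-style row to a multiple-choice QA record.

    Assumed input fields (from the English Hellaswag layout):
      ind     - source index of the example
      ctx     - context the model must continue
      endings - list of candidate continuations
      label   - index of the correct ending (often stored as a string)
    """
    endings = list(row["endings"])
    label = int(row["label"])  # handles both "1" and 1
    if not 0 <= label < len(endings):
        raise ValueError(f"label {label} out of range for {len(endings)} endings")
    return {
        "id": str(idx),
        "question_id": str(row.get("ind", idx)),
        "question": row["ctx"],
        "type": "multiple_choice",
        "choices": endings,
        "answer": [endings[label]],
    }


# Made-up row, only to show the shape of the data.
sample_row = {
    "ind": 7,
    "ctx": "A man picks up a guitar. He",
    "endings": [
        "throws it into the sea.",
        "starts playing a song.",
        "eats it for breakfast.",
        "paints the wall with it.",
    ],
    "label": "1",
}

example = to_qa_example(sample_row, 0)
print(example["answer"])
```

In a real loader this function would be called once per row inside the example generator, separately for each language subset (ind, vie).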
tellarin commented 4 months ago

self-assign

tellarin commented 3 months ago

Back to working on this. Sorry for the delay.