SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for XCOPA #6

Closed SamuelCahyawijaya closed 9 months ago

SamuelCahyawijaya commented 1 year ago

Dataloader name: xcopa/xcopa.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?xcopa

Dataset: xcopa
Description: XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning. The Cross-lingual Choice of Plausible Alternatives dataset is a benchmark to evaluate the ability of machine learning models to transfer commonsense reasoning across languages. The dataset is the translation and reannotation of the English COPA (Roemmele et al. 2011) and covers 11 languages from 11 families and several areas around the globe. The dataset is challenging as it requires both the command of world knowledge and the ability to generalise to new languages.
Subsets: th, id, vi
Languages: tha, vie, ind
Tasks: Commonsense Reasoning
License: Creative Commons Attribution 4.0 (cc-by-4.0)
Homepage: https://github.com/cambridgeltl/xcopa
HF URL: https://huggingface.co/datasets/xcopa
Paper URL: https://aclanthology.org/2020.emnlp-main.185/

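For reference, the subset codes and language codes listed above pair up as two-letter (ISO 639-1) and three-letter (ISO 639-3) codes for Thai, Indonesian, and Vietnamese. A minimal sketch of that mapping follows; the names `SUBSET_TO_LANG` and `lang_for_subset` are hypothetical and are not taken from the SEACrowd codebase or the eventual dataloader.

```python
# Hypothetical helper, not part of seacrowd-datahub: pairs each XCOPA subset
# code from this issue (ISO 639-1) with its language code (ISO 639-3).
SUBSET_TO_LANG = {
    "th": "tha",  # Thai
    "id": "ind",  # Indonesian
    "vi": "vie",  # Vietnamese
}


def lang_for_subset(subset: str) -> str:
    """Return the three-letter language code for an XCOPA subset code."""
    try:
        return SUBSET_TO_LANG[subset]
    except KeyError:
        # Only the three SEA subsets named in this issue are covered.
        raise ValueError(f"Unsupported XCOPA subset: {subset!r}") from None
```

A dataloader could iterate over `SUBSET_TO_LANG` to build one configuration per language, though how the merged loader actually does this is not stated in this thread.
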
FawwazMayda commented 1 year ago

self-assign

sabilmakbar commented 11 months ago

Hi @FawwazMayda, may I know the current status of this dataloader? Feel free to discuss here if you run into any difficulties, thanks!

github-actions[bot] commented 11 months ago

Hi, may I know if you are still working on this issue? Please let @holylovenia @SamuelCahyawijaya @sabilmakbar know if you need any help.

FawwazMayda commented 11 months ago

Hi, I am currently continuing the implementation. I will give you a heads-up if I run into any issues. Thanks.

holylovenia commented 11 months ago

Okay then, @FawwazMayda. Feel free to let us know if you need any help!

github-actions[bot] commented 10 months ago

Hi, may I know if you are still working on this issue? Please let @holylovenia @SamuelCahyawijaya @sabilmakbar know if you need any help.

FawwazMayda commented 10 months ago

I have created the PR: https://github.com/SEACrowd/seacrowd-datahub/pull/286

Could you please help review it? cc: @sabilmakbar

sabilmakbar commented 10 months ago

Hi @FawwazMayda, thanks for submitting a PR for this dataloader. Please wait for the review, as it is in the queue now; we'll try to get it done ASAP. In the meantime, feel free to work on another dataloader if you have any in your backlog. Thanks!