SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for Seahorse #211

Closed by SamuelCahyawijaya 6 months ago

SamuelCahyawijaya commented 9 months ago

Dataloader name: seahorse/seahorse.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?seahorse

Dataset: seahorse
Description: SEAHORSE is a dataset for multilingual, multifaceted summarization evaluation. It consists of 96K summaries with human ratings along 6 quality dimensions: comprehensibility, repetition, grammar, attribution, main idea(s), and conciseness, covering 6 languages, 9 systems and 4 datasets.
Subsets: -
Languages: vie
Tasks: Summarization
License: Creative Commons Attribution 4.0 (cc-by-4.0)
Homepage: https://storage.googleapis.com/seahorse-public/seahorse_data.zip
HF URL: -
Paper URL: https://aclanthology.org/2023.emnlp-main.584

chenxwh commented 9 months ago

self-assign

github-actions[bot] commented 8 months ago

Hi, may I know if you are still working on this issue? Please let @holylovenia @SamuelCahyawijaya @sabilmakbar know if you need any help.

sabilmakbar commented 7 months ago

Hi @chenxwh, may we have an update on this dataloader issue? It's been 3 weeks since the last reminder from the SEACrowd stale-checker, and we may unassign the issue if there's no progress update in the next 24 hours.

chenxwh commented 7 months ago

https://github.com/SEACrowd/seacrowd-datahub/pull/407