SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for VLSP2020 RelEx #630

Open SamuelCahyawijaya opened 3 months ago

SamuelCahyawijaya commented 3 months ago

Dataloader name: vlsp2020_relex/vlsp2020_relex.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?vlsp2020_relex

Dataset vlsp2020_relex
Description The dataset targets relation classification: each entity pair in Vietnamese news text is assigned one of four predefined, non-overlapping semantic relation categories. It contains 1,056 documents and 5,900 semantic relation instances, collected from Vietnamese news across several domains. The dataset was human-annotated and used for the VLSP2020 shared task.
Subsets -
Languages vie
Tasks Relation Extraction
License Unknown (unknown)
Homepage https://docs.google.com/document/d/1082jvKOA6Rx_tkqDhy6DZORUXqeaiSWz/edit
HF URL -
Paper URL -
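
For whoever picks this up, here is a minimal sketch of how one relation instance might be held in memory before it is mapped to a loader schema. Everything in it is an assumption for illustration: the `Entity` and `Relation` classes, the `check_non_overlapping` helper, the placeholder labels `LABEL_A` and `LABEL_B`, and the example sentence are all hypothetical and are not taken from the VLSP2020 data or from the SEACrowd codebase. The only property it tries to mirror from the description is that the categories are non-overlapping, i.e. an entity pair carries at most one label.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity mention with character offsets into the text."""
    id: str
    text: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


@dataclass(frozen=True)
class Relation:
    """Hypothetical directed relation between two entity ids."""
    head_id: str
    tail_id: str
    label: str


def make_entity(entity_id: str, doc: str, mention: str) -> Entity:
    """Locate the first occurrence of `mention` in `doc` and record its offsets."""
    start = doc.index(mention)
    return Entity(entity_id, mention, start, start + len(mention))


def check_non_overlapping(relations: List[Relation]) -> Dict[Tuple[str, str], str]:
    """Map each (head, tail) pair to its single label.

    Raises ValueError if the same pair appears with two different labels,
    which would violate the non-overlapping assumption.
    """
    seen: Dict[Tuple[str, str], str] = {}
    for rel in relations:
        key = (rel.head_id, rel.tail_id)
        if key in seen and seen[key] != rel.label:
            raise ValueError(f"pair {key} has conflicting labels")
        seen[key] = rel.label
    return seen


# Invented example sentence, not drawn from the dataset.
doc = "Hà Nội là thủ đô của Việt Nam."
e1 = make_entity("e1", doc, "Hà Nội")
e2 = make_entity("e2", doc, "Việt Nam")

# "LABEL_A" is a placeholder, not one of the real VLSP2020 categories.
pair_labels = check_non_overlapping([Relation("e1", "e2", "LABEL_A")])
```

A check like this could run once per document while parsing, so that annotation conflicts surface early instead of silently overwriting each other.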