SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for VLSP2020 UDP #624

Open SamuelCahyawijaya opened 5 months ago

SamuelCahyawijaya commented 5 months ago

Dataloader name: vlsp2020_udp/vlsp2020_udp.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?vlsp2020_udp

Dataset vlsp2020_udp
Description A gold-standard Universal Dependencies annotated dataset for Vietnamese, intended for evaluating dependency parsing systems. The data includes the training data of the VLSP2019 shared task, amounting to more than 8000 annotated training sentences and more than 1000 test sentences. The dataset is provided in both raw and pre-processed formats.
Subsets -
Languages vie
Tasks Dependency Parsing
License Unknown (unknown)
Homepage https://drive.google.com/drive/folders/1S6v2SFBr8_FI8HxKGOTfmL9zV0YFrpSR?usp=sharing
HF URL -
Paper URL -
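
As a starting point for the loader, here is a minimal sketch of how dependency-annotated sentences could be read. It assumes the pre-processed files use the standard 10-column CoNLL-U layout, which this issue does not confirm; check the files in the Drive folder first. The names `CONLLU_FIELDS`, `parse_conllu`, and `SAMPLE` are hypothetical, the sample sentence is made up for illustration and not taken from the dataset, and none of this uses the SEACrowd loader API.

```python
from typing import Dict, Iterator, List

# Standard CoNLL-U column names (assumed format, not confirmed for VLSP2020 UDP).
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def parse_conllu(text: str) -> Iterator[Dict[str, List[str]]]:
    """Yield one dict per sentence, mapping each column name to a list of values."""
    rows: List[List[str]] = []

    def flush() -> Dict[str, List[str]]:
        return {name: [r[i] for r in rows] for i, name in enumerate(CONLLU_FIELDS)}

    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if rows:
                yield flush()
                rows = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            continue  # skip malformed rows in this sketch
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        rows.append(cols)

    # Emit the last sentence if the text does not end with a blank line.
    if rows:
        yield flush()


# Made-up two-token sentence, for illustration only.
SAMPLE = "\n".join([
    "# text = Tôi học",
    "\t".join(["1", "Tôi", "tôi", "PRON", "_", "_", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "học", "học", "VERB", "_", "_", "0", "root", "_", "_"]),
    "",
])

if __name__ == "__main__":
    for sentence in parse_conllu(SAMPLE):
        print(sentence["form"], sentence["head"], sentence["deprel"])
```

Each yielded dict keeps the columns as parallel lists, which should map fairly directly onto a per-sentence example in a dataloader; the real loader would still need to follow the repository's own schema and configuration conventions.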