SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for VLSP2016-NER #345

Closed. SamuelCahyawijaya closed this issue 8 months ago

SamuelCahyawijaya commented 8 months ago

Dataloader name: vlsp2016_ner/vlsp2016_ner.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?vlsp2016_ner

Dataset: vlsp2016_ner
Description: This dataset was collected from electronic newspapers published on the web and is provided by the VLSP organization. It consists of approximately 15k sentences, each of which contains named-entity (NE) information in the IOB annotation format.
Subsets: -
Languages: vie
Tasks: Named Entity Recognition
License: Creative Commons Attribution Non Commercial 4.0 (cc-by-nc-4.0)
Homepage: https://huggingface.co/datasets/datnth1709/VLSP2016-NER-data
HF URL: https://huggingface.co/datasets/datnth1709/VLSP2016-NER-data
Paper URL: https://drive.google.com/file/d/18FuXxRM0slTeReQUCOj8IiToB5eqVQCT/view
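
For anyone unfamiliar with the IOB format mentioned in the description, here is a minimal sketch of how IOB tags map to entity spans. It is only an illustration: the function name `iob_to_spans`, the example sentence, and the `PER`/`LOC` labels are assumptions, not taken from the dataset, and the actual column names and label set should be checked against the HF data before writing the loader.

```python
from typing import List, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration; not part of seacrowd-datahub.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity; close any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close whatever is open and treat this token as outside.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Made-up example sentence and labels (assumed, not from VLSP2016-NER).
tokens = ["Ông", "Nguyễn", "Văn", "A", "sống", "ở", "Hà", "Nội", "."]
tags = ["O", "B-PER", "I-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O"]
print(iob_to_spans(tokens, tags))
```

In the loader itself, the tags would more likely be passed through unchanged as a sequence-labeling feature; the span grouping above is just a way to sanity-check that the annotations were read correctly.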

luckysusanto commented 8 months ago

self-assign