SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for MedEV #623

Status: Closed (SamuelCahyawijaya closed this issue 4 months ago)

SamuelCahyawijaya commented 5 months ago

Dataloader name: medev/medev.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?medev

Dataset: medev
Description: A high-quality Vietnamese-English parallel dataset constructed specifically for the medical domain, comprising approximately 360K sentence pairs
Subsets: -
Languages: vie
Tasks: Machine Translation
License: Unknown (unknown)
Homepage: https://huggingface.co/datasets/nhuvo/MedEV
HF URL: https://huggingface.co/datasets/nhuvo/MedEV
Paper URL: -

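
For whoever picks this up, here is a rough sketch of how aligned English and Vietnamese sentences could be turned into machine translation examples. Everything in it is an assumption for illustration: the function name `pair_examples` is hypothetical, the field names (`id`, `text_1`, `text_2`, `text_1_name`, `text_2_name`) are assumed to follow a text-to-text style schema, and the input is assumed to be two already-aligned lists of sentences. The actual file layout of MedEV is not described in this issue, so reading the raw files is left out.

```python
from typing import Dict, Iterator, List, Tuple


def pair_examples(
    en_sentences: List[str],
    vi_sentences: List[str],
) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (key, example) pairs from two aligned sentence lists.

    Hypothetical helper: the field names below are an assumed
    text-to-text layout, not something confirmed in this issue.
    """
    # A parallel corpus only makes sense if both sides line up one-to-one.
    if len(en_sentences) != len(vi_sentences):
        raise ValueError("English and Vietnamese sides must have the same length")

    for idx, (en, vi) in enumerate(zip(en_sentences, vi_sentences)):
        yield idx, {
            "id": str(idx),
            "text_1": en.strip(),   # English side, trailing newline removed
            "text_2": vi.strip(),   # Vietnamese side, trailing newline removed
            "text_1_name": "eng",   # assumed language tag for the English side
            "text_2_name": "vie",   # matches the "Languages" field above
        }
```

Because the function is a generator, the length check only runs once iteration starts, for example when the result is wrapped in `list(...)`.
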
patrickamadeus commented 5 months ago

self-assign