SamuelCahyawijaya commented 3 months ago

Dataset	medisco
Description	MEDISCO is a Medical Indonesian Speech Corpus. The medical text corpus is collected from five Indonesian online medical consultation websites. From the text corpus, we created a speech corpus that consists of 360 sentences read by 13 speakers. In total, our speech corpus contains 731 medical terms and consists of 4,680 utterances with a total duration of 10 hours.
Subsets	Train, Test
Languages	ind
Tasks	Automatic Speech Recognition
License	GNU General Public License v3.0 (gpl-3.0)
Homepage	https://huggingface.co/datasets/mrqorib/MEDISCO
HF URL	https://huggingface.co/datasets/mrqorib/MEDISCO
Paper URL	https://ieeexplore.ieee.org/abstract/document/8629259
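
As a quick sanity check on the figures in the description, the sketch below confirms that 360 sentences read by 13 speakers gives the reported 4,680 utterances, and estimates the average utterance length. It assumes every speaker reads every sentence exactly once and treats the 10-hour total as a rounded figure. The variable names are illustrative only.

```python
# Figures quoted in the dataset description.
NUM_SENTENCES = 360
NUM_SPEAKERS = 13
REPORTED_UTTERANCES = 4680
TOTAL_HOURS = 10  # as stated in the description; likely a rounded value

# Assumption: every speaker reads every sentence exactly once.
expected_utterances = NUM_SENTENCES * NUM_SPEAKERS

# Average utterance length implied by the stated total duration.
avg_seconds = TOTAL_HOURS * 3600 / REPORTED_UTTERANCES

print(expected_utterances == REPORTED_UTTERANCES)  # True
print(round(avg_seconds, 2))  # 7.69
```

The counts are consistent, and the implied average is roughly 7.7 seconds per utterance.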

akhdanfadh commented 3 months ago

self-assign

mrqorib commented 3 months ago

self-assign

@akhdanfadh Sorry, would you mind giving this one to me? This is my dataset 😆

akhdanfadh commented 3 months ago

Sure! @mrqorib

mrqorib commented 3 months ago

@akhdanfadh Thanks! 😊
