IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0

Create dataset loader for INDspeech_TELDIALOG_SVCSR #277

SamuelCahyawijaya closed 2 years ago

SamuelCahyawijaya commented 2 years ago

NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?indspeech_teldialog_svcsr

Dataset: indspeech_teldialog_svcsr
Description: INDspeech_TELDIALOG_SVCSR is the first Indonesian speech dataset for small vocabulary continuous speech recognition (SVCSR). The data was developed by TELKOMRisTI (R&D Division, PT Telekomunikasi Indonesia) in collaboration with Advanced Telecommunication Research Institute International (ATR), Japan, and Bandung Institute of Technology (ITB) under the Asia-Pacific Telecommunity (APT) project in 2004 [Sakti et al., 2004]. Although it was originally developed for a telecommunication system for people with hearing and speech impairments, it can also be used for other applications, e.g., automatic call centers. Furthermore, because all speakers utter the same sentences, it can be used for voice conversion tasks.
License: CC-BY-NC-SA 4.0

jensan-1 commented 2 years ago

self-assign