Create dataset loader for SU-ID TTS

NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?su_id_tts

Dataset	su_id_tts
Description	This dataset contains high-quality transcribed audio data for Sundanese. It consists of WAV files and a TSV file. The file line_index.tsv lists each audio filename together with the transcription of that audio. Each filename is prefixed with a speaker identification number. The dataset has been manually quality checked, but errors may remain. It was collected by Google in collaboration with Universitas Pendidikan Indonesia.
License	CC-BY-SA 4.0
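
As a starting point for the loader, the layout described above (a TSV index pairing each filename with its transcription, and a speaker ID at the start of each filename) could be parsed roughly as follows. This is only a sketch under stated assumptions, not the actual nusa-crowd loader: the function name `read_line_index`, the record keys, the `.wav` suffix, and the rule that the speaker ID is everything before the last underscore are all hypothetical and should be checked against the real `line_index.tsv`.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def read_line_index(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one record per row of a line_index.tsv-style file.

    Assumed layout (not verified against the real data):
    <filename><TAB><transcription>
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        filename, text = row[0].strip(), row[1].strip()
        # Assumption: the speaker ID is everything before the last underscore.
        speaker_id, _, _ = filename.rpartition("_")
        yield {
            "id": filename,
            "speaker_id": speaker_id,
            "path": f"{filename}.wav",  # assumed: index stores names without extension
            "text": text,
        }


# Made-up sample rows, for illustration only (not taken from the dataset).
sample = "spk01_0001\tfirst sentence\nspk01_0002\tsecond sentence\n"
records = list(read_line_index(io.StringIO(sample)))
```

Keeping the parser independent of any dataset framework makes it easy to check against a few real rows before wiring it into a loader class.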

IndoNLP / nusa-crowd

Create dataset loader for SU-ID TTS #281

self-assign