r9y9 / nnmnkwii_gallery

A collection of examples demonstrating how we can build speech synthesis systems using nnmnkwii.
https://github.com/r9y9/nnmnkwii

How to prepare dataset? #7

Closed mrgloom closed 5 years ago

mrgloom commented 5 years ago

How do I prepare a new dataset, say from https://keithito.com/LJ-Speech-Dataset/, which has .wav and .txt pairs?

Looking at the datasets downloaded via:

./scripts/download_data.sh slt_arctic_demo_data
./scripts/download_data.sh slt_arctic_full_data

data
├── CMU_ARCTIC_COPYING
├── NIT-ATR503_COPYING
├── questions-radio_dnn_416.hed
├── questions_jp.hed
├── slt_arctic_demo_data
└── slt_arctic_full_data

data/slt_arctic_demo_data/
├── label_phone_align
├── label_state_align
├── questions-radio_dnn_416.hed -> /Users/my_user/external_projects/text-to-speech/nnmnkwii_gallery/scripts/../data/questions-radio_dnn_416.hed
└── wav

They contain .lab and .wav files only.
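
For reference, one way to check this is to count file extensions per subdirectory. The following is only a rough sketch using the Python standard library; the function name `count_extensions` is made up for illustration, and the path is the demo directory listed above.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Return {top-level subdirectory: Counter of file suffixes} for files under root."""
    root = Path(root)
    summary = {}
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root)
            # Group by the first path component below root (e.g. "wav");
            # files sitting directly in root are grouped under ".".
            top = rel.parts[0] if len(rel.parts) > 1 else "."
            summary.setdefault(top, Counter())[path.suffix] += 1
    return summary


if __name__ == "__main__":
    demo_dir = Path("data/slt_arctic_demo_data")  # assumed location, as in the listing
    if demo_dir.is_dir():
        for name, counts in sorted(count_extensions(demo_dir).items()):
            print(name, dict(counts))
```

This only inventories the files. It does not produce the aligned labels asked about below.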

The original cmu_arctic dataset has only one lab folder: http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/lab/ So how do I generate label_phone_align and label_state_align?

The cmu_arctic dataset distribution includes a make_labs tool. Is it used to create the *.lab files? http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/bin/make_labs

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.