Added script createljspeech.py for easy dataset creation

MycroftAI / mimic2

Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.

Apache License 2.0

Added script createljspeech.py for easy dataset creation #45

Closed: thorstenMueller closed this 4 years ago

thorstenMueller commented 4 years ago

Sorry if this code is not optimized, but it's my first Python script, so I tried my best. It extracts information from the mimic-recording-studio SQLite database table and creates an LJSpeech-1.1 structure which can be used for further processing.
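For readers who have not seen the script, the idea can be sketched roughly as follows. This is a minimal illustration, not the actual createljspeech.py: the table name `audiomodel` and the columns `audio_id` and `prompt` are assumptions about the mimic-recording-studio database, and the function name `export_ljspeech_metadata` is hypothetical. The only part taken as given is the LJSpeech layout itself: a `wavs/` folder plus a pipe-delimited `metadata.csv` with an ID, the transcription, and the normalized transcription. Copying or converting the audio files is left out.

```python
import sqlite3
from pathlib import Path


def export_ljspeech_metadata(db_path, out_dir, table="audiomodel"):
    """Write an LJSpeech-style metadata.csv from a SQLite table.

    Assumption: `table` has `audio_id` and `prompt` columns. These names
    are hypothetical and may not match the real database schema.
    """
    out_dir = Path(out_dir)
    # LJSpeech keeps the audio clips in a wavs/ subfolder.
    (out_dir / "wavs").mkdir(parents=True, exist_ok=True)

    con = sqlite3.connect(str(db_path))
    try:
        rows = con.execute(
            f"SELECT audio_id, prompt FROM {table} ORDER BY audio_id"
        ).fetchall()
    finally:
        con.close()

    metadata = out_dir / "metadata.csv"
    with open(metadata, "w", encoding="utf-8") as f:
        for audio_id, prompt in rows:
            # Collapse newlines and repeated spaces so each clip is one line.
            text = " ".join(str(prompt).split())
            # No separate normalization step here, so the text is written twice.
            f.write(f"{audio_id}|{text}|{text}\n")
    return metadata
```

A real export would also need to place each recording in `wavs/` under a file name matching the ID in the first column.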

thorstenMueller commented 4 years ago

Hey Kris. I've updated the pull request on the basis of your feedback.

Now there's an mrs dataset option for native mimic-recording-studio data processing. I also moved and updated createljspeech.py so that it works without changing environment-specific parameters within the Python script.

    python3 preprocess.py --dataset mrs --mrs_dir=<path_to>/mimic-recording-studio/

    python3 ./datasets/createljspeech.py --mrs_dir=<path_to>/mimic-recording-studio/
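As a rough illustration of how a script can take the recording-studio path from the command line instead of a hard-coded value, here is a minimal argparse sketch. Only the `--mrs_dir` flag comes from the commands above; the `parse_args` helper, the help text, and the `~` expansion are assumptions for illustration and may differ from what the merged script does.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the --mrs_dir option (hypothetical helper, for illustration only)."""
    parser = argparse.ArgumentParser(
        description="Create an LJSpeech-style dataset from mimic-recording-studio data."
    )
    parser.add_argument(
        "--mrs_dir",
        required=True,
        help="Path to the mimic-recording-studio directory",
    )
    args = parser.parse_args(argv)
    # Expand a leading ~ so the path works regardless of the user's home directory.
    args.mrs_dir = os.path.expanduser(args.mrs_dir)
    return args
```

Both `--mrs_dir=<path>` and `--mrs_dir <path>` are accepted by argparse, so either spelling used in the commands above would parse the same way.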

thorstenMueller commented 4 years ago

I rolled back the commits to the Dockerfile and .gitignore file, and added support for multiple speakers in mimic-recording-studio using the same SQLite database.

krisgesling commented 4 years ago

Thanks for persevering and continuing to nudge me, Thorsten :)

This is looking great. Merging now.