robmsmt / KerasDeepSpeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
GNU Affero General Public License v3.0

Keras DeepSpeech

Repository for experimenting with different CTC-based model designs for ASR. It supports live recording and testing of speech, and its own-voice dataset creation scripts let you quickly build customised datasets.

OVERVIEW

SETUP

  1. Recommended: use a virtualenv with Python 2.7 (3.x is untested and will not work with Core ML)
  2. git clone https://github.com/robmsmt/KerasDeepSpeech
  3. pip install -r requirements.txt
  4. Get the data using the import/download scripts in the data folder; LibriSpeech is a good example.
  5. Download the language model (a large file) by running ./lm/get_lm.sh

RUN

  1. To train, run python run-train.py. To specify training/validation files, use python run-train.py --train_files <csvfile> --valid_files <csvfile> (see run-train for the complete argument list)
  2. To test, run python run-test.py --test_files <datacsvfile>
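
When running several experiments in a row, it can help to assemble these commands in a script. The following is a minimal sketch, not part of the repository: the helper names (build_train_command, build_test_command, run) and the CSV paths are hypothetical, and only the script names and the --train_files, --valid_files and --test_files flags come from the steps above.

```python
import subprocess


def build_train_command(train_csv=None, valid_csv=None, python="python"):
    """Return the argv list for run-train.py, adding flags only when given."""
    cmd = [python, "run-train.py"]
    if train_csv:
        cmd += ["--train_files", train_csv]
    if valid_csv:
        cmd += ["--valid_files", valid_csv]
    return cmd


def build_test_command(test_csv, python="python"):
    """Return the argv list for run-test.py."""
    return [python, "run-test.py", "--test_files", test_csv]


def run(cmd):
    """Run a command from the repository root; raises if it exits non-zero."""
    subprocess.check_call(cmd)


if __name__ == "__main__":
    # Hypothetical CSV paths; replace with files produced by the data scripts.
    train_cmd = build_train_command("data/train.csv", "data/valid.csv")
    print(" ".join(train_cmd))
    # Uncomment to actually launch training:
    # run(train_cmd)
```

Building the command as a list rather than a single string avoids shell quoting problems when a CSV path contains spaces.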

CREDIT

  1. Mozilla DeepSpeech
  2. Baidu DS1 & DS2 papers

Licence

The content of this project itself is licensed under the GNU General Public License. Copyright © 2018

Contributing

Have a question? Like the tool? Don't like it? Open an issue and let's talk about it! Pull requests are appreciated!