Improve Speech-To-Text (ML Approach)

Objective

The DeepSpeech speech-to-text system needs to handle uncommon and non-English words more reliably. The machine learning approach is to retrain the DeepSpeech model on new audio data and analyze the results.

Key Result

Using the run_stt function of stream_deepspeech.py, return the correct transcription of the audio input as a string. https://github.com/calpoly-csai/swanton/blob/b8e55023e9c12af9dabd8050166fd8cdb8860e91/stream_deepspeech.py#L16

Details

Correctly transcribe all QA pairs from the question-answer pairs Google Sheet. To train a new DeepSpeech model, you can follow these instructions.
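
One possible way to analyze the results is to score each transcription against its expected text with word error rate (WER). The sketch below is only an illustration: the function name word_error_rate and the sample sentences are hypothetical and are not part of the swanton repo or the Google Sheet. It assumes transcripts are compared case-insensitively and split on whitespace.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; not part of stream_deepspeech.py.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        # Convention chosen here: an empty reference scores 0.0 only if
        # the hypothesis is empty too.
        return 0.0 if not hyp else 1.0

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example pair; real pairs would come from the QA sheet.
    expected = "where is swanton"
    transcribed = "where is swan ton"
    print(f"WER: {word_error_rate(expected, transcribed):.2f}")
```

A WER of 0.0 means the transcription matches the expected text exactly, so the key result corresponds to a WER of 0.0 on every QA pair. Tracking the score before and after retraining gives a rough measure of whether the new audio data helped.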

You will need the following DeepSpeech model and DeepSpeech scorer to use run_stt.

If you need assistance, please ask @chidiewenike.

Additional Resources

DeepSpeech GitHub repo: https://github.com/mozilla/DeepSpeech
Training the model: https://medium.com/visionwizard/train-your-own-speech-recognition-model-in-5-simple-steps-512d5ac348a5
Learning Git: https://git-scm.com/book/en/v2

