calpoly-csai / swanton

Swanton Pacific Ranch chatbot with a knowledge graph
MIT License

Improve Speech-To-Text (ML Approach) #16

Open chidiewenike opened 3 years ago

chidiewenike commented 3 years ago

Objective

The DeepSpeech speech-to-text system needs to handle uncommon and non-English words better. The machine learning approach is to retrain the DeepSpeech model on new audio data and analyze the results.

Key Result

Using the run_stt function in stream_deepspeech.py, return a correctly transcribed string for the audio input. https://github.com/calpoly-csai/swanton/blob/b8e55023e9c12af9dabd8050166fd8cdb8860e91/stream_deepspeech.py#L16

Details

Correctly transcribe all QA pairs from the question-answer pairs Google Sheet. To train a new DeepSpeech model, you can follow these instructions.
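One way to judge whether a retrained model transcribes a QA pair correctly is to score each transcript against the expected question text with word error rate (WER). The sketch below is only an illustration: `word_error_rate` and `normalize` are hypothetical helpers, not part of stream_deepspeech.py, and the transcript string stands in for whatever run_stt returns.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation (except apostrophes), and split into words."""
    return re.sub(r"[^a-z0-9' ]", " ", text.lower()).split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        # No reference words: perfect only if the hypothesis is empty too.
        return 0.0 if not hyp else 1.0

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the expected question and a made-up transcript.
expected = "Where is Swanton Pacific Ranch?"
transcript = "where is swan ton pacific ranch"
print(word_error_rate(expected, transcript))
```

A WER of 0.0 means the transcript matches the expected text word for word after normalization, so a QA pair could be counted as "correctly transcribed" only at 0.0, or at some small threshold if the team decides to tolerate minor errors.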

You will need the following DeepSpeech model and DeepSpeech scorer to use run_stt.

If you need assistance, please ask @chidiewenike.

Additional Resources

snekiam commented 3 years ago

Here's a (growing) list of words we're having issues with: https://docs.google.com/spreadsheets/d/1rcomLifXhAaMo0zFzv36f1OoWHmShwe3gQKmV-uE71Q/edit?usp=sharing
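To track progress against a list like this after retraining, one option is to count how often each problem word appears in an expected question but is missing from the transcript. This is a rough sketch under assumptions: `missed_words` is a hypothetical helper, and the word and sentence pairs shown are placeholders rather than entries copied from the spreadsheet.

```python
def missed_words(problem_words, pairs):
    """Count, per problem word, how many expected sentences contained it
    while the matching transcript did not.

    pairs is an iterable of (expected_text, transcript) tuples.
    """
    targets = {word.lower() for word in problem_words}
    missed = {}
    for expected, transcript in pairs:
        expected_tokens = set(expected.lower().split())
        heard_tokens = set(transcript.lower().split())
        # Problem words that were expected but never heard in this pair.
        for word in (targets & expected_tokens) - heard_tokens:
            missed[word] = missed.get(word, 0) + 1
    return missed


# Placeholder data: transcripts here are invented for illustration.
pairs = [
    ("where is swanton", "where is swan ton"),
    ("what is swanton", "what is swanton"),
]
print(missed_words(["Swanton"], pairs))
```

Comparing these counts before and after retraining gives a per-word view of whether the new audio data helped, which a single aggregate score would hide.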