IBM / Train-Custom-Speech-Model

Create a custom Watson Speech to Text model using specialized domain data
https://developer.ibm.com/patterns/customize-and-continuously-train-your-own-watson-speech-service/
Apache License 2.0

Custom Speech Recognition Model #88

Closed by santosh9sanjeev 4 years ago

santosh9sanjeev commented 4 years ago

How can I build a speech recognition model in Python using the ezDI dataset, without IBM STT? My aim is a model that recognizes medical terms more accurately while still using a language model for normal day-to-day conversation. How can I build it from scratch in Python?

tonanhngo commented 4 years ago

Hello Santosh,

The challenge with building a speech model from scratch is that it is very expensive, both in computing resources and in the large speech dataset you need to obtain. This is why commercial models are still more practical: they are ready to use. Typically you would first train a model on normal conversation and then use transfer learning to further train it to recognize medical speech. The ezDI dataset as provided is very small and is intended as a sample to show how transfer learning works on top of a trained model. It is not big enough to train a model from scratch by itself.

In terms of models, some pre-trained models are publicly available. Nvidia provides trained models for which they have reported very good performance. Here are some links that may be useful:

https://developer.nvidia.com/nvidia-nemo
https://developer.nvidia.com/blog/announcing-nemo-fast-development-of-speech-and-language-models/
https://ngc.nvidia.com/catalog/models?orderBy=modifiedDESC&query=nemo&quickFilter=models&filters=
https://nvidia.github.io/NeMo/asr/models.html

These do require specific Python packages, but you should be able to try ezDI on these models. If you do, we would be interested in hearing about your experience.

Ton
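
If you do try ezDI on a pre-trained model, one common way to summarize the result is word error rate (WER): the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is only a minimal sketch in plain Python. The function name `word_error_rate` and the sample sentences are made up for illustration (they are not taken from ezDI), and the normalization here is just lowercasing and whitespace splitting, which a real evaluation would probably want to refine.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: lowercasing and whitespace splitting is enough normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a medical phrase and a plausible misrecognition.
reference = "the patient has acute myocardial infarction"
hypothesis = "the patient has a cute myocardial infraction"
print(word_error_rate(reference, hypothesis))
```

Comparing this number before and after further training on the medical data would give a rough sense of how much the transfer learning step helps.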

santosh9sanjeev commented 4 years ago

Thank you very much, sir!