Uberi / speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.
https://pypi.python.org/pypi/SpeechRecognition/
BSD 3-Clause "New" or "Revised" License

speech recognition training #173

Closed petolachka closed 7 years ago

petolachka commented 7 years ago

Hello professionals, I am happy to use your wonderful library. However, I have a small issue: some words are not recognized correctly. For example, when I say "home", the library sometimes understands it as "dog". Is there any way to train the system for particular words? Thanks and good luck!

Uberi commented 7 years ago

Hi @petolachka,

There are multiple options for your case:

kubark42 commented 7 years ago

@Uberi This is a great library, and I was able to demonstrate speech recognition's potential very easily. However, we need to treat offline usage as our primary use case, so only Sphinx will work.

Straight out of the box, Sphinx doesn't seem to work as well as we could hope. The Sphinx instructions linked to above are perhaps overkill, in the sense that they educate the reader about all possibilities when it's probably better to go with just one suboptimal possibility in exchange for ease of implementation.

Do you have a minimal example you could post which shows how to do some basic tweaking via keywords?
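
For reference, a minimal sketch of keyword tweaking, assuming the `keyword_entries` argument of `recognize_sphinx` (a list of `(keyword, sensitivity)` tuples, with sensitivity between 0 and 1). The helper names `build_keyword_entries` and `recognize_keywords`, the default sensitivity of 0.8, and the lowercasing are illustrative assumptions, not part of the library:

```python
def build_keyword_entries(words, sensitivity=0.8):
    """Turn a word list into (keyword, sensitivity) tuples.

    Hypothetical helper. Sensitivity runs from 0 (fewer false positives)
    to 1 (fewer false negatives). Lowercasing assumes the Sphinx
    dictionary in use stores its words in lowercase.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    return [(w.strip().lower(), sensitivity) for w in words if w.strip()]


def recognize_keywords(wav_path, keyword_entries):
    """Run offline Sphinx recognition restricted to the given keywords.

    Requires the SpeechRecognition and PocketSphinx packages; the import
    is local so the helper above works without them.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file
    try:
        return recognizer.recognize_sphinx(audio, keyword_entries=keyword_entries)
    except sr.UnknownValueError:
        return None  # Sphinx heard none of the keywords
```

Something like `recognize_keywords("sample.wav", build_keyword_entries(["home", "dog"]))` would then only ever return the listed words, where `sample.wav` is a placeholder file name. Restricting the vocabulary this way trades coverage for accuracy on the words that matter.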

kubark42 commented 7 years ago

P.S. I think there's a cool way to train Sphinx, which is:

  1. Use standard services when there's an online connection
  2. Save the resulting words, as a means of compiling the dictionary of user interactions
  3. Feed that word list into http://www.speech.cs.cmu.edu/tools/lmtool-new.html, which would generate a new knowledge base.
  4. Use that knowledge base with Sphinx when offline.

It's not a slam-dunk way to have Sphinx understand everything, but it could work well enough to bootstrap things.

Would SpeechRecognition support this approach?
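
Steps 1, 2 and 4 above could be sketched roughly as follows. This is only an illustration under stated assumptions: `transcribe`, `append_to_corpus`, and the recognizer callables are hypothetical names, and step 3 (uploading the corpus to lmtool) stays a manual step here.

```python
from pathlib import Path


def append_to_corpus(transcript, corpus_path):
    """Step 2: store each distinct sentence once, one per line."""
    line = " ".join(transcript.lower().split())  # normalize case and spacing
    if not line:
        return
    path = Path(corpus_path)
    seen = set(path.read_text(encoding="utf-8").splitlines()) if path.exists() else set()
    if line not in seen:
        with path.open("a", encoding="utf-8") as f:
            f.write(line + "\n")


def transcribe(audio, online_recognizer, offline_recognizer, corpus_path,
               online_errors=(Exception,)):
    """Steps 1 and 4: prefer the online service, fall back to offline.

    Both recognizers are caller-supplied callables taking the audio and
    returning text. With SpeechRecognition, online_errors would likely be
    narrowed to sr.RequestError (an assumption about how it is wired up).
    """
    try:
        text = online_recognizer(audio)
    except online_errors:
        return offline_recognizer(audio)  # no connection: use Sphinx
    append_to_corpus(text, corpus_path)   # grow the corpus while online
    return text
```

The resulting corpus file is the sentence list that lmtool expects as input. As far as I know, the language model and dictionary files it generates can then be passed to `recognize_sphinx` through its `language` argument as a tuple of (acoustic parameters directory, language model file, phoneme dictionary file), but treat that wiring as something to verify against the library reference.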