Melissa-AI / Melissa-Core

A lovely virtual assistant for OS X, Windows and Linux systems.
MIT License
491 stars, 201 forks

PocketSphinx is broken #39

Closed. tanay1337 closed this issue 8 years ago

tanay1337 commented 8 years ago

Okay, somewhere along the line, we have broken the PocketSphinx integration. Now setting stt to sphinx in profile.json gives me the following traceback:

Traceback (most recent call last):
  File "main.py", line 20, in <module>
    main()
  File "main.py", line 18, in main
    stt(profile_data)
  File "/Users/tanay/Desktop/Melissa-Core/GreyMatter/SenseCells/stt.py", line 75, in stt
    brain(profile_data, sphinx_stt())
  File "/Users/tanay/Desktop/Melissa-Core/GreyMatter/SenseCells/stt.py", line 40, in sphinx_stt
    config.set_string('-hmm', os.path.join(modeldir, hmm))
  File "/usr/local/lib/python2.7/site-packages/sphinxbase/sphinxbase.py", line 137, in set_string
    return _sphinxbase.Config_set_string(self, key, val)
TypeError: in method 'Config_set_string', argument 3 of type 'char const *'

I can't seem to figure out what's wrong. Help needed, @neilnelson!

neilnelson commented 8 years ago

I did a search on Google for TypeError: type 'char const *' and this page may help.

It looks like the val argument of set_string in sphinxbase.py requires 'char const *'. You can check the type of what is being sent:

temp = os.path.join(modeldir, hmm)
print type(temp)

Then we could somehow convert to the required type.

I should get pocketsphinx working here so I can provide a better response.

tanay1337 commented 8 years ago

Thanks, I'll be waiting for your response, @neilnelson :)

neilnelson commented 8 years ago

The type of the value from

os.path.join(modeldir, hmm)

was unicode. The following line

config.set_string('-hmm', os.path.join(modeldir, hmm))

is expecting char const *, which accepts the Python 2 str type but not unicode. The lines below show the kind of change that was made to each dictionary value to obtain the str type.

            modeldir = profile_data['pocketsphinx']['modeldir']
            modeldir = modeldir.encode("ascii")

It may be useful to recursively traverse the dictionary, converting the unicode values to ASCII when profile.json is loaded in main.py. The function below only prints each key and value, but the same recursion could do the conversion.

def myprint(d):
  for k, v in d.iteritems():
    if isinstance(v, dict):
      myprint(v)
    else:
      print "{0} : {1}".format(k, v)
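
A variation on the traversal above that converts instead of printing might look like the following. This is only a sketch: encode_values is a hypothetical helper name, not something that exists in Melissa, and it assumes every string in profile.json is plain ASCII (a non-ASCII value would raise UnicodeEncodeError). Dictionary keys are left as loaded. On Python 2 the converted values are str; on Python 3 the same call produces bytes.

```python
import json

try:
    text_type = unicode  # Python 2: JSON strings are loaded as unicode
except NameError:
    text_type = str      # Python 3: str is already the text type


def encode_values(node):
    """Return a copy of node with every text value encoded to ASCII."""
    if isinstance(node, dict):
        return {key: encode_values(value) for key, value in node.items()}
    if isinstance(node, list):
        return [encode_values(item) for item in node]
    if isinstance(node, text_type):
        return node.encode("ascii")
    return node  # numbers, booleans and None pass through unchanged


# Illustrative profile fragment; the values here are made up for the example.
profile_data = encode_values(json.loads(
    '{"stt": "sphinx", "pocketsphinx": {"modeldir": "/tmp/model/", "hmm": "en-us/en-us"}}'
))
```

Doing this once at load time would avoid repeating .encode("ascii") at every place a profile value is handed to PocketSphinx.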

Python 3 has an easier method at load time: the encoding can be given when the file is opened.

with open('unicode.txt', encoding='utf-8') as f:
    for line in f:
        print(repr(line))

To get my PocketSphinx to run, I uninstalled and removed all PocketSphinx software and then ran the Melissa install. It appears that the Melissa install did not install the PocketSphinx model files at the locations given in profile_populator.py:

    modeldir = '/usr/local/share/pocketsphinx/model/'
    hmm = 'en-us/en-us'
    lm = 'lm/2854.lm'
    dic = 'lm/2854.dic'

I downloaded and unzipped pocketsphinx-0.8 and changed the following profile.json lines.

        "hmm": "hmm/en_US/hub4wsj_sc_8k", 
        "lm": "lm/en_US/hub4.5000.DMP", 
        "modeldir": "/mnt/Documents/ai/pocketsphinx-0.8/model/", 
        "dic": "lm/en_US/hub4.5000.dic"

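Since a missing model file only shows up once the decoder is configured, a small check of the configured paths before starting PocketSphinx could make this kind of problem easier to spot. The sketch below assumes the pocketsphinx section of profile.json has the four keys shown above; missing_model_paths is a hypothetical helper, not part of Melissa.

```python
import os


def missing_model_paths(sphinx_cfg):
    """Return the configured model paths that do not exist on disk."""
    modeldir = sphinx_cfg["modeldir"]
    missing = []
    if not os.path.isdir(modeldir):
        missing.append(modeldir)
    # hmm is a directory, lm and dic are files; os.path.exists covers both.
    for key in ("hmm", "lm", "dic"):
        path = os.path.join(modeldir, sphinx_cfg[key])
        if not os.path.exists(path):
            missing.append(path)
    return missing
```

An empty list means every path was found; anything else names the locations the user still needs to fill.
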
The PocketSphinx path ran without error but did not recognize words much at all. My experience with Sphinx, HTK, and Kaldi (in increasing order of quality) indicates that they have a fairly high error rate. The error rate can be reduced by:

Those two methods combined make a quite accurate speech recognizer. I am quite impressed with how well Google's recognizer is doing. If we had to have our own recognizer, I would recommend Kaldi. The other recognizers Jasper is using might be something to check out, but older recognizers are not likely to be useful.

Pull request coming up.

neilnelson commented 8 years ago

More thoughts on speech recognizers. HTK has a way to adapt its model to a user's voice and vocabulary. Kaldi does not have adaptation like HTK but has good general recognition at about 85%. HTK and Sphinx are no longer being developed. Kaldi is an ongoing project. Kaldi and HTK are large packages that would require some set-up and provision of downloads by us.

Given that we would like a small, limited-capability recognizer along the PocketSphinx line, the applicable method is what is sometimes referred to as command-and-control recognition, which uses a small number of single words or very short word sequences that directly key actions. I was not thinking along this line previously, but it is a viable alternative. We would essentially have a dual method: one path being the direct command-and-control seen in Jasper, and the other going toward regular speech interpretation that brings in ConceptNet and Natural Language Understanding. We would then select the Melissa processing path according to the recognizer selected by the user.
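
To make the command-and-control idea concrete, here is a minimal sketch of keywords directly keying actions. The command list, the action functions and the name build_dispatcher are all hypothetical and only illustrate the shape of the approach; they are not taken from Melissa or Jasper.

```python
def build_dispatcher(commands):
    """Return a function that maps recognized text to the first matching action."""
    def dispatch(recognized_text):
        words = recognized_text.lower().split()
        for keyword, action in commands:
            if keyword in words:
                return action(recognized_text)
        return None  # nothing matched; a fuller interpretation path could take over
    return dispatch


# Hypothetical commands; real actions would call into the assistant's modules.
dispatch = build_dispatcher([
    ("time", lambda text: "telling the time"),
    ("weather", lambda text: "fetching the weather"),
])
```

With a vocabulary this small, the recognizer only has to tell a handful of words apart, which is where a limited recognizer has the best chance of doing well.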

neilnelson commented 8 years ago

In case my last commit comment is not seen on the pull request, here it is:

Changed PocketSphinx recognition sequence

While working on the Jasper-Melissa merge, I noticed how Jasper was using PocketSphinx and tried it out for Melissa. The code is simpler and seems to have better accuracy. My sense is that when the vocabulary is restricted to the command words, it may do quite well.

Moved the decoder initialization outside and before sphinx_stt so that it is executed only once rather than inside the recognition loop. The first initialization when Melissa starts up takes a few seconds, an inconvenient delay that might be avoided if the initial user prompt in main.py came after this initialization. At the moment I do not have a suggestion as to how that would be done. Perhaps the merge will provide a solution.
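
The general pattern of building the decoder once and reusing it can be sketched without PocketSphinx itself. In the sketch below, make_fake_decoder is a hypothetical stand-in for the slow decoder setup, and RecognizerLoop is an illustrative name rather than the structure used in the pull request.

```python
class RecognizerLoop(object):
    """Build the decoder once, then reuse it for every utterance."""

    def __init__(self, make_decoder):
        # make_decoder stands in for the expensive PocketSphinx setup.
        self.decoder = make_decoder()

    def recognize(self, audio):
        return self.decoder(audio)


build_count = []


def make_fake_decoder():
    """Hypothetical factory that records how often it is called."""
    build_count.append(1)
    return lambda audio: audio.upper()


loop = RecognizerLoop(make_fake_decoder)  # the slow step happens here, once
results = [loop.recognize(chunk) for chunk in ("open", "close", "stop")]
```

Constructing the object before the first user prompt is one possible way to hide the start-up delay, though whether that fits main.py is untested here.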

tanay1337 commented 8 years ago

Hi @neilnelson! Thank you so much for the fix. The installation now works like a charm for me. I test it using a vocabulary file I created, which consists of statements taken from USAGE.md and was built with the help of the instructions in this article.

And yes, currently we do not provide the language and dictionary models; the locations have been provided so that users can place their files in those locations. Last but not least, we are not working on a Jasper-Melissa merge. We are merely learning from the VA systems that have been built before to improve the code and capabilities of Melissa :D

neilnelson commented 8 years ago

@tanay1337, I am glad you are happy with the correction. I added your remarks on how to complete the PocketSphinx installation to the FAQ page.

I appreciate your remark on the Jasper-Melissa merge, in that the kind of radical change I am thinking about is certainly outside the usual project flow, about which you are appropriately concerned. I should remark that Melissa has rather easily made great headway in an area I have been looking at for years. I have a direction in mind and will work to explain what I am doing as I go along; you may find some of that work useful.

tanay1337 commented 8 years ago

Sure, @neilnelson! Resolved by https://github.com/Melissa-AI/Melissa-Core/pull/42.

tanay1337 commented 8 years ago

Reopened on grounds of https://github.com/Melissa-AI/Melissa-Core/issues/43.

tanay1337 commented 8 years ago

We'll just close this, since this is an unrelated issue.