ajbogh / blather

Python application for speech recognition using pocketsphinx and gstreamer. A GUI is available for both Qt and Gtk.

update from python2 to python3 #18

Open andy5995 opened 7 years ago

andy5995 commented 7 years ago

I saw this request by @piegamesde in #17

I'd be interested in starting to update the py files to be compatible with python3. If I get stuck, can I do a PR and get some help along the way?

ajbogh commented 7 years ago

That would be great! I tried a couple of times and got stuck, then had to put it aside. If you get stuck I will do my best to help.

andy5995 commented 7 years ago

Output as of https://github.com/andy5995/blather/commit/1399dd01bef080da8cbd4fc9b0530dc0caec0090

https://gist.github.com/andy5995/83dbfd2fcf31898cd42c2b7129991ab1

Errors out at L235 (gist above) with the msg

GLib.Error: gst_parse_error: no element "vader" (1)

Some info about that error at

I'm on Debian stretch (9). I installed gstreamer1.0-pocketsphinx (0.10 wasn't available) but received the same result.

I don't know how to proceed from that yet. Any ideas?

ajbogh commented 7 years ago

Feel free to check out the changes I've made in the update_gstreamer branch. I also included a README_Ubuntu.md file so that you can install the right applications and plugins. This file will eventually be merged into the README if the instructions work well.

I should point out that "vader" is no longer a plugin. The gstreamer developers have merged the voice activity detection into the decoder.

There used to be vader element before to do voice activity detection, now voice activity is detected inside decoder for best accuracy.

https://cmusphinx.github.io/wiki/gstreamer/

The changes I made above are a simple hack to get the code working with Python 2.7 on Ubuntu. Your Python 3 code is more in line with what we want, I believe.

andy5995 commented 7 years ago

Great! I'll keep going with the Python 3 code then.

Should I use this method then?

from gi.repository import GObject as gobject

As opposed to changes like I made in my PR?

-class Recognizer(gobject.GObject):
+class Recognizer(GObject.GObject):

ajbogh commented 7 years ago

If we're going with pure Python 3 code then I would prefer not to alias names. In my opinion the code is easier to understand without aliases, although a one-line alias is quicker to implement than changing every instance of gobject to GObject. Really though, it's your choice.
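The "change every instance" option can be done mechanically. A minimal sketch, assuming a hypothetical helper named rename_gobject (not part of blather) that rewrites only the import line and attribute access; some old PyGTK names have no one-to-one equivalent under gi.repository, so the output would still need review by hand:

```python
import re

def rename_gobject(source):
    """Rewrite legacy `gobject` references to the unaliased `GObject` name."""
    # Swap the old static-binding import for the introspection import.
    source = re.sub(r"^import gobject$", "from gi.repository import GObject",
                    source, flags=re.MULTILINE)
    # Rewrite attribute access such as gobject.GObject or gobject.TYPE_STRING.
    return re.sub(r"\bgobject\.", "GObject.", source)

old = "import gobject\nclass Recognizer(gobject.GObject):\n    pass\n"
print(rename_gobject(old))
```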

andy5995 commented 7 years ago

Really though, it's your choice.

I'll keep going with python3. That seems to make the most sense.

I have no idea how to debug the segfault unfortunately.

I did run python3 Blather.py through gdb, but I'm not great with debuggers. Here's the output of info stack: https://gist.github.com/andy5995/d774c0e3191675ba580b2bf1208547d7

And that's as of https://github.com/ajbogh/blather/pull/21/commits/1adf9e4a3f4ee49689559c2db2e3c59a31f8029a

andy5995 commented 7 years ago

I updated the gist. I was using

gdb `which python`

instead of python3. That changed the output of info stack significantly.
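Alongside gdb, Python 3's standard-library faulthandler module can print the Python-level traceback when the interpreter crashes, which may help show which Python call was active at the time of the native fault. A minimal sketch, assuming it is placed at the very top of Blather.py (that placement is a suggestion, not something taken from the PR):

```python
import faulthandler

# On a fatal signal such as SIGSEGV, dump the Python traceback
# of all threads to stderr before the process dies.
faulthandler.enable()
```

The same behaviour is available without editing the file by running python3 -X faulthandler Blather.py.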

ajbogh commented 7 years ago

I've found that if you comment out the asr.set_property('configured', True) in Recognizer.py then it'll get further before a core dump. It almost starts the app. Maybe some of the settings for pocketsphinx are causing some issues.

ajbogh commented 7 years ago

Well I found out that pocketsphinx on my system is causing the segmentation fault. I'm not sure how to fix it though.

This works: cmd = audio_src+' ! audioconvert ! audioresample'

This doesn't work: cmd = audio_src+' ! audioconvert ! audioresample ! pocketsphinx name=asr'
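Comparing a working and a failing pipeline string like this generalizes to adding one element at a time. A minimal sketch, assuming a hypothetical helper pipeline_prefixes (not part of blather) and autoaudiosrc as a stand-in for the real audio_src value; each string it yields could be handed to Gst.parse_launch in turn to see which element first triggers the crash:

```python
def pipeline_prefixes(audio_src, elements):
    """Yield pipeline descriptions that grow by one element per step."""
    parts = [audio_src]
    for element in elements:
        parts.append(element)
        yield " ! ".join(parts)

# 'autoaudiosrc' is an assumed stand-in for blather's audio_src.
stages = list(pipeline_prefixes(
    "autoaudiosrc",
    ["audioconvert", "audioresample", "pocketsphinx name=asr"],
))
for description in stages:
    print(description)
```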

andy5995 commented 7 years ago

The segfault seems to be fixed

I commented out this line

#       asr.set_property('configured', True)

Current output:

https://gist.github.com/andy5995/0ec1dc24e02a3131b686bb02feb7df2b

andy5995 commented 7 years ago

Got a little further (https://github.com/andy5995/blather/commit/013cff6acd34b1472e125a860dcc9f6e5d99a5fe).

Blather with no args still gives a segfault, but with -i g/q it does not segfault; its remaining problems are a result of the outdated Gtk and Qt code. I was focusing on -g.

andy5995 commented 7 years ago

I'm not sure when or if I'll be able to finish this. I have less experience with Gtk and PyGTK than I do with Python, which makes porting it quite a bit more challenging. I don't mind a challenge, but it may be wise to give someone else the opportunity to finish this.

If you're OK with that, I thought perhaps it would be a good idea to issue a release of the current master branch, merge in my changes, then create separate tickets for "porting to Qt" and "porting to Gtk". Might be easier to review in the long run, actually.

Before I forget again, I have a question: It's my understanding that people can only be assigned to tickets on GitHub if they are project collaborators or members of an organization that a project is under. How did you assign this ticket to me?

ajbogh commented 7 years ago

Thanks for all your help. I think we both got to the same problem and it lies in pocketsphinx IMO. Some option is causing a segfault but I don't know what's wrong.

I'll create a new branch for both of our changes with open issues and descriptions for help wanted.

I've also been considering a fork of this project using NodeJS or electron (or both). There may be available packages that support this functionality.

Regarding assigning issues, I think that since you made a comment on the issue first you became an available contributor. I haven't ever tried to assign an issue to a person who had not already commented or contributed to a project.

andy5995 commented 7 years ago

Thanks for all your help. I think we both got to the same problem and it lies in pocketsphinx IMO. Some option is causing a segfault but I don't know what's wrong.

Hmmm... maybe there is something I'm not understanding. (Although I see I wasn't too clear in my last progress update.) Isn't it significant that we got past the first segfault?

When I removed this line # asr.set_property('configured', True)

Blather got a lot further than before.

[screenshot: screenshot_2017-11-02_17-39-18]

Then it fails while running GtkUI.py, with this output:

('click right', 'xdotool click 3')
click middle:xdotool click 2 

('click middle', 'xdotool click 2')
Traceback (most recent call last):
  File "./Blather.py", line 287, in <module>
    blather = Blather(options)
  File "./Blather.py", line 71, in __init__
    self.ui = UI(args,opts.continuous)
  File "/home/andy/src/blather/GtkUI.py", line 45, in __init__
    accel.connect_group(Gtk.keysyms.q, Gtk.gdk.CONTROL_MASK, Gtk.ACCEL_VISIBLE, self.accel_quit )
AttributeError: 'AccelGroup' object has no attribute 'connect_group'
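The names in this traceback are PyGTK 2 spellings. A hedged sketch of what appear to be the PyGObject (GTK 3) equivalents, written as a lookup table plus a hypothetical port_line helper (not part of blather); the ported call also assumes Gdk is imported from gi.repository, and it has not been run against blather itself:

```python
# PyGTK 2 spelling -> PyGObject (GTK 3) spelling, for the names in the traceback.
PYGTK_TO_GI = {
    "Gtk.keysyms.q": "Gdk.KEY_q",
    "Gtk.gdk.CONTROL_MASK": "Gdk.ModifierType.CONTROL_MASK",
    "Gtk.ACCEL_VISIBLE": "Gtk.AccelFlags.VISIBLE",
    "accel.connect_group(": "accel.connect(",
}

def port_line(line):
    """Apply each old -> new substitution to one line of source."""
    for old, new in PYGTK_TO_GI.items():
        line = line.replace(old, new)
    return line

failing = ("accel.connect_group(Gtk.keysyms.q, Gtk.gdk.CONTROL_MASK, "
           "Gtk.ACCEL_VISIBLE, self.accel_quit)")
print(port_line(failing))
```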

What I tried today was removing my ~/.config/blather/sentences.corpus file and running `Blather.py` with no args.

It generated a new sentences.corpus file before it segfaulted.

But when I run Blather with -i -g, I get the output above (connect_group...)

Have you tested the patch yet since I removed # asr.set_property('configured', True) ?

ajbogh commented 7 years ago

I'm researching the speech_recognition library instead of using pocketsphinx directly through gst. It seems to work perfectly once installed, when run using python -m speech_recognition. It can still be set up to use pocketsphinx as well.

https://github.com/Uberi/speech_recognition/blob/master/reference/pocketsphinx.rst

ajbogh commented 7 years ago

Well the good news is I got SpeechRecognition working. I still have to refactor the code in Recognizer to work with it, but pocketsphinx is working now and it's recognizing words. I can only spend a couple hours at a time on this, so it might be a few more days before I have a working version again. Below is the basic code (very rough), hope this helps.

Also, don't forget to install PocketSphinx using pip. That seems to help. Here are my rough installation instructions. I'll be removing unnecessary things later.

Installation:

sudo apt-get install --reinstall python-gi python-psutil python-gst-1.0 python-gst-1.0-dbg gstreamer1.0-pocketsphinx python-pyaudio swig python-pip python-dev build-essential

pip install PocketSphinx

Code (only new or changed code is included, everything else is the same):

import speech_recognition as sr

# gobject is imported in the unchanged part of Recognizer.py (not shown here)

r = sr.Recognizer()
m = sr.Microphone()

class Recognizer(gobject.GObject):
    __gsignals__ = {
        'finished' : (gobject.SIGNAL_RUN_LAST, gobject.TYPE_NONE, (gobject.TYPE_STRING,))
    }

    def __init__(self, language_file, dictionary_file, src = None):
        gobject.GObject.__init__(self)

        # calibrate the energy threshold against the ambient noise level
        print("A moment of silence, please...")
        with m as source:
            r.adjust_for_ambient_noise(source)
        print("Set minimum energy threshold to {}".format(r.energy_threshold))

    def callback(self, recognizer, audio):
        # recognize speech using Sphinx
        try:
            print("Sphinx thinks you said " + recognizer.recognize_sphinx(audio))
        except sr.UnknownValueError:
            print("Sphinx could not understand audio")
        except sr.RequestError as e:
            print("Sphinx error; {0}".format(e))

    def listen(self):
        # self.pipeline.set_state(gst.State.PLAYING)
        # global stop_listening
        # stop_listening = r.listen_in_background(m, self.callback)
        # blocking loop: record one phrase at a time, then recognize it
        while True:
            print("Say something!")
            with m as source:
                audio = r.listen(source)
            print("Got it! Now to recognize it...")
            self.callback(r, audio)

ajbogh commented 7 years ago

Here's the latest code. Please feel free to use the example in Recognizer.py for your development.

https://github.com/ajbogh/blather/pull/22/files

Oh, apparently the dictionary file and language files are unnecessary now. Without them pocketsphinx may be more error prone, but I haven't figured out how to configure it to use them.
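One possible route for using those files again: the SpeechRecognition reference describes the language argument of recognize_sphinx as either a language tag or a 3-tuple of paths (acoustic parameters directory, language model file, phoneme dictionary file). A minimal sketch under that assumption, with a hypothetical helper name recognize_with_model; whether blather's generated language and dictionary files are accepted as-is is untested, and an acoustic model directory would still have to be supplied:

```python
def recognize_with_model(recognizer, audio, acoustic_dir, language_file, dictionary_file):
    """Call recognize_sphinx with a custom model instead of the default one."""
    # 3-tuple form of `language`:
    # (acoustic parameters directory, language model file, phoneme dictionary file)
    model = (acoustic_dir, language_file, dictionary_file)
    return recognizer.recognize_sphinx(audio, language=model)
```

Inside callback this would replace the plain recognizer.recognize_sphinx(audio) call, with the paths taken from the language_file and dictionary_file arguments that __init__ already receives.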