zaf / asterisk-speech-recog

Speech recognition script for Asterisk that uses google's speech engine.
GNU General Public License v2.0

I think google's speech recognition API changed #9

Closed issackelly closed 10 years ago

issackelly commented 10 years ago

My calls aren't currently processing. This might be the culprit:

http://www.reddit.com/r/raspberry_pi/comments/251yur/google_speech_api_has_changed/

theqkash commented 10 years ago

I can confirm that. Every request returns -1.

zaf commented 10 years ago

It seems they are now blocking free use of the API. The request returns: 'Your client has issued a malformed or illegal request. Missing parameter: key'. Using 'speech-api/v2/recognize?' and adding a key parameter with a key obtained from the Google API console, recognition works. I suppose we were expecting this to happen sooner or later. I will have a closer look and see what we can do, but I'm afraid this is now turning into a paid service. I will soon push an updated version of the script with the changes required to support the second version of the API.
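
As a rough sketch of the request change described above (not the script's actual code), the v2 URL could be assembled as follows. The host path and the `key` parameter come from this thread. The `lang`, `pfilter` and `maxresults` names are carried over from the v1 request and are only assumed to still apply to v2, the `https://` scheme is an assumption, and `build_recognize_url` is a hypothetical helper name. Written for Python 3.

```python
from urllib.parse import urlencode

# Endpoint path as quoted in this thread; everything else here is illustrative.
HOST = "www.google.com/speech-api/v2/recognize"


def build_recognize_url(key, language="en-US", pro_filter=0, results=1):
    """Return a v2 recognize URL with the API key in the query string.

    Hypothetical helper: parameter names other than `key` are carried over
    from the v1 request and are assumed (not confirmed) to work with v2.
    """
    if not key:
        # Per this thread, keyless requests fail with "Missing parameter: key".
        raise ValueError("an API key is required for v2")
    query = urlencode({
        "key": key,
        "lang": language,
        "pfilter": pro_filter,
        "maxresults": results,
    })
    # The https scheme is an assumption; the thread only gives host and path.
    return "https://" + HOST + "?" + query
```

Failing early on an empty key keeps the error local instead of waiting for the server to reject the request.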

zaf commented 10 years ago

To get a key for the speech API see my last comment here:

https://github.com/zaf/asterisk-googletts/issues/10#issuecomment-38970334

issackelly commented 10 years ago

Thanks. I'm trying to work up a patch right now, but my version is already a sort of mangled version of this.

issackelly commented 10 years ago

I've gotten a browser key, and made the following changes:

< my $host       = "www.google.com/speech-api/v2/recognize";
---
> my $host       = "www.google.com/speech-api/v1/recognize";
207c207
< $url .= "?key=A(my key here)lIH5-vNeiHo&xjerr=1&client=chromium&lang=$language&pfilter=$pro_filter&lm=$grammar&maxresults=$results";
---
> $url .= "?xjerr=1&client=chromium&lang=$language&pfilter=$pro_filter&lm=$grammar&maxresults=$results";

I'm still getting nothing useful

zaf commented 10 years ago

Keep in mind that the format of the results has also changed quite a bit since v1 of the API. It now looks something like this:

{"result":[]}
{"result":[{"alternative":[{"transcript":"this is a test","confidence":0.97335243}],"final":true}],"result_index":0}

so apart from changing the request you also have to parse the result correctly.

issackelly commented 10 years ago

Yeah, I just found that after digging through this: https://github.com/gillesdemey/google-speech-v2. It's not even totally valid JSON, but each line appears to be valid json.
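
A minimal sketch of parsing such a response, assuming (as observed above) that the body holds one JSON object per line and that the first object may carry an empty result list. `parse_v2_response` is a hypothetical name, and this is an illustration in Python rather than the Perl the script uses.

```python
import json


def parse_v2_response(body):
    """Return (transcript, confidence) from a v2 response body, or None.

    Assumes one JSON object per line, as observed in this thread.
    """
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            data = json.loads(line)
        except ValueError:
            continue  # skip anything that is not valid JSON on its own
        if not isinstance(data, dict):
            continue
        # The first line is often {"result":[]}; keep looking past it.
        for result in data.get("result", []):
            alternatives = result.get("alternative", [])
            if alternatives:
                best = alternatives[0]
                # confidence may be absent, in which case None is returned for it
                return best.get("transcript"), best.get("confidence")
    return None
```

With the sample response quoted earlier in this thread, this returns the transcript "this is a test" together with its confidence value; a body that contains only `{"result":[]}` yields `None`.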

issackelly commented 10 years ago

my @lines = split /\n/, $uaresponse->content;
foreach my $line (@lines) {
    $utterance .= $line;
}

This only gives {"result":[]} as the result from $uaresponse->content, but curl gives me both lines (with a short delay between them). Unfortunately, my Perl knowledge is basically nonexistent.

mikeybs commented 10 years ago

I just started playing with this a few weeks ago and was really enjoying the functionality, great job on a great script!

I am very interested in getting this working again... can anyone provide more complete instructions on how to fix?

zaf commented 10 years ago

The details are already mentioned above. The new API version requires a key and returns JSON data in a format that differs from the previous version. There will be an updated version of the asterisk/cli scripts soon.

zaf commented 10 years ago

There is a new version out with support for version 2 of the Speech API. Both the agi and cli scripts were updated. Please test and report. Note that since the values the API returns have changed, so have the values that the agi script returns. The most noticeable change is that the script no longer returns a status value. Please check the README files for the updated info and examples and change your dialplan code accordingly.

prevoip commented 10 years ago

I updated the speech-recog.agi file and installed perl-libjson, which was what I needed for version 2, but the result is still "-1". My debug output follows:

res_agi.c: -- Launched AGI Script /var/lib/asterisk/agi-bin/speech-recog.agi
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_request: speech-recog.agi
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_channel: SIP/hg-000001c4
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_language: pt_BR
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_type: SIP
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_uniqueid: 1401287363.626
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_version: 1.8.17.0
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_callerid: 1234
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_calleridname: 1234
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_callingpres: 0
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_callingani2: 0
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_callington: 0
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_callingtns: 0
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_dnid: 11
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_rdnis: unknown
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_context: internal
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_extension: 11
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_priority: 2
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_enhanced: 0.0
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_accountcode:
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_threadid: 139662740350720
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> agi_arg_1: pt-BR
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >>
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Rx << SET VARIABLE "utterance" "-1"
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> 200 result=1
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Rx << SET VARIABLE "confidence" "-1"
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> 200 result=1
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Rx << CHANNEL STATUS
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> 200 result=6
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Rx << GET FULL VARIABLE ${CHANNEL(audionativeformat)}
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> 200 result=1 (ulaw)
[2014-05-28 11:29:23] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Rx << RECORD FILE /tmp/stt_rIQskI sln "#" "-1" BEEP "s=2"
[2014-05-28 11:29:23] VERBOSE[5922] file.c: -- <SIP/hg-000001c4> Playing 'beep.gsm' (language 'pt_BR')
[2014-05-28 11:29:27] VERBOSE[5922] res_agi.c: <SIP/hg-000001c4>AGI Tx >> 200 result=0 (timeout) endpos=19840
[2014-05-28 11:29:28] VERBOSE[5922] res_agi.c: -- <SIP/hg-000001c4>AGI Script speech-recog.agi completed, returning 0
[2014-05-28 11:29:28] VERBOSE[5922] pbx.c: -- Executing [1234@internal:3] Verbose("SIP/hg-000001c4", "1,The text you just said is: -1") in new stack
[2014-05-28 11:29:28] VERBOSE[5922] app_verbose.c: The text you just said is: -1
[2014-05-28 11:29:28] VERBOSE[5922] pbx.c: -- Executing [1234@internal:4] Verbose("SIP/hg1500-000001c4", "1,The probability to be right is: -1") in new stack
[2014-05-28 11:29:28] VERBOSE[5922] app_verbose.c: The probability to be right is: -1
[2014-05-28 11:29:28] VERBOSE[5922] pbx.c: -- Executing [1234@internal:5] Hangup("SIP/hg-000001c4", "") in new stack
[2014-05-28 11:29:28] VERBOSE[5922] pbx.c: == Spawn extension (internal,1234, 5) exited non-zero on 'SIP/hg-000001c4'

lgaetz commented 10 years ago

I am seeing the same thing. I upgraded the agi to yesterday's commit and edited it with my API key. Once I ran yum install perl-JSON it appeared to be working, but confidence and utterance always equal -1:

The text you just said is: -1
The probability to be right is: -1

In the Google API portal, I can see my requests incrementing.

I am testing from a Digital Ocean droplet which appears to be giving me an IP address from Indonesia, could that be the issue?

lgaetz commented 10 years ago

I managed to figure out the problem. I was improperly setting up the authorized referrers for the Google API key. Once I deleted all authorized referers, speech recognition works:

The text you just said is: hello this is a test wondering how it works
The probability to be right is: 0.84872508

zaf commented 10 years ago

prevoip: Do you have a key for the speech API? Is Speech API enabled in your Google API console? Do you have any HTTP referers black/white-listed as lgaetz mentioned above? Can you enable debugging on the agi script by setting 'my $debug = 1; ' in the user defined parameters section on the top of the speech-recog.agi file and check the detailed debug output to figure out where it fails?

prevoip commented 10 years ago

Zaf, yes, I have the key and the Speech API is enabled. I have not set any HTTP referers. I enabled debugging (debug = 1), but I get the same -1 error, with nothing different from what I already posted above.

zaf commented 10 years ago

Make sure you capture the debug output of the script. It is displayed only on the console asterisk was started on, so either get the output from tty9 (that's where asterisk starts if you use an init script) or start asterisk manually from the command line (asterisk -cvvv) and capture the output there.

prevoip commented 10 years ago

Debug is activated ( my $debug = 1;) and it is displaying this:

AGI Script speech-recog.agi completed, returning 0
The text you just said is: -1
The probability to be right is: -1

To see what is showing on the debug, I am using the command line: tail -f /var/log/asterisk/full

dvarella commented 10 years ago

Hi Zaf and everybody,

I'm here to contribute some information about the new Speech Recognition Script for Asterisk (using the Google Speech Recognition API v2) and to record it for everybody.

A few days ago my Speech Recognition Script for Asterisk stopped working. Following the instructions in the README of the new script package got everything working again, but there are some tricks that need attention:

  1. You need a key from Google's API Console (https://console.developers.google.com) to make this work. It needs to be a "Speech API" key. Follow these instructions: https://developers.google.com/console/help/new/#generatingdevkeys
  2. The first time you open Google's API Console you will not see the "Speech API" option. To get it, you need to join the chromium google group (https://groups.google.com/a/chromium.org/forum/#!forum/chromium-dev).
  3. When you go back to Google's API Console, you can now see the "Speech API" option; turn it on.
  4. Then go to "APIs & auth", "Credentials" in the left menu, click "Create New Key" below the "Public API access" title and choose the "Server key" type.
  5. In the "Accept requests from these server IP addresses" field, enter the IP address or network where your server is located, AND also the IP "127.0.0.1" (without this IP, it did not work for me).
  6. Copy and paste the generated API key into the new speech-recog.agi script, in the key user-defined parameter.
  7. Adjust the dialplan to the variables reported by the script ("utterance" and "confidence") and you are good to go!

PS.: There is a usage limit for this Google API: 1.0 requests/second/user and 50 requests per day.

Regards.
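
For anyone scripting against the usage limit mentioned above (1 request per second, 50 per day), a client-side guard along these lines might help avoid burning through the quota. This is only a sketch: `QuotaGuard` and its parameters are hypothetical names, and the day boundary is assumed to be midnight UTC, which may not match when Google actually resets the counter.

```python
import time


class QuotaGuard:
    """Client-side guard for a per-second and per-day request limit."""

    def __init__(self, min_interval=1.0, daily_limit=50, clock=time.time):
        self.min_interval = min_interval  # seconds between requests
        self.daily_limit = daily_limit    # requests allowed per day
        self.clock = clock                # injectable for testing
        self.last_request = None
        self.day = None
        self.used_today = 0

    def try_acquire(self):
        """Return True and record a request if one is allowed right now."""
        now = self.clock()
        today = int(now // 86400)  # day number since the epoch (UTC, assumed)
        if today != self.day:
            # A new day started: reset the daily counter.
            self.day = today
            self.used_today = 0
        if self.used_today >= self.daily_limit:
            return False
        if self.last_request is not None and now - self.last_request < self.min_interval:
            return False
        self.last_request = now
        self.used_today += 1
        return True
```

A caller would check `guard.try_acquire()` before each recognition request and skip or delay the request when it returns False.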

prevoip commented 10 years ago

lgaetz, can you share how you created the new ID and generated the key? I still could not get mine to work. My Google Developers console shows: 0 of 50 requests used today.

lgaetz commented 10 years ago

I followed the directions on this page:

  1. Sign up for the chromium dev forum, note that email freq. can be never
  2. at https://cloud.google.com/console click 'create project', wait a long time 'til it shows up in the project list
  3. If your new project doesn't open automatically, select it from the dashboard. From left menu select APIs & auth, APIs and enable 'Speech API'
  4. From left menu select Credentials then "Create new client ID". Create installed application of type other. Create Client ID (thinking this step is not necessary)
  5. From the same page select "create new key", browser key with no referrers.
  6. The page will now display an API key that you can use

I am working from Canada so there may or may not be geo blocks elsewhere.

prevoip commented 10 years ago

Many thanks to zaf and lgaetz. There was another update to speech-recog.agi on 31/05 and now mine works perfectly. Thank you very much to everyone who helped me!

raciel88 commented 9 years ago

dvarella, a lot of thanks, you solved a lot of problems that I had, :-) Greetings from México.!

dvarella commented 9 years ago

You're welcome !

Cheers !

raciel88 commented 9 years ago

I need help.! When I execute this program with Python 2.7:

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:  # use the default microphone as the audio source
    audio = r.listen(source)  # listen for the first phrase and extract it into audio data

try:
    print("You said " + r.recognize(audio))  # recognize speech using Google Speech Recognition
except LookupError:  # speech is unintelligible
    print("Could not understand audio")

I get the following errors:

  File "web1.py", line 4, in <module>
    audio = r.listen(source)  # listen for the first phrase and extract it into audio data
  File "/usr/lib/python2.7/dist-packages/speech_recognition/__init__.py", line 268, in listen
    buffer = source.stream.read(source.CHUNK)
  File "/usr/lib/python2.7/dist-packages/pyaudio.py", line 605, in read
    return pa.read_stream(self._stream, num_frames)
KeyboardInterrupt

Or does somebody have a program that works well with the Web Speech API?

Please help.!