StevenHickson / PiAUISuite

Raspberry PI AUI Suite

Raspberry PI 2 always returns no translation. #55

Open craql opened 8 years ago

craql commented 8 years ago

Hey Steve, it seems like you've got a great plugin, but I can't seem to get it working. I'm using a Pi 2. Is it not compatible with that model? The USB headset I'm using records and plays back audio just fine, but whenever I run voicecommand, it returns "No translation" and gives no audio response.

Thanks for your help.

alx5962 commented 8 years ago

I had a similar problem, but on a Banana Pi. The default sample rate used in speech-recog.sh is 16000, and that doesn't work for me. I tried several others, and 32000 and 48000 work fine! The sample format matters too (mine is S16_LE). Try something like:

arecord -D "plughw:2,0" -f S16_LE -d 2 -r 32000 /dev/shm/out.wav

then:

aplay /dev/shm/out.wav

If everything works fine, just update speech-recog.sh (located in /usr/bin) to match. If not, try a different sample rate or sample format.
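The trial-and-error above can be scripted. This is only a sketch: `print_record_tests` is a hypothetical helper, not part of PiAUISuite, and the device and rates are just the values from this comment. It prints the test commands instead of running them, so each one can be reviewed and pasted into a terminal.

```shell
# Print one arecord/aplay test pair per candidate sample rate.
# Nothing is recorded here; the commands are only printed.
# print_record_tests is a hypothetical helper, not part of PiAUISuite.
print_record_tests() {
    device="$1"   # ALSA device, e.g. plughw:2,0
    format="$2"   # sample format, e.g. S16_LE
    shift 2
    for rate in "$@"; do
        printf 'arecord -D "%s" -f %s -d 2 -r %s /dev/shm/out.wav && aplay /dev/shm/out.wav\n' \
            "$device" "$format" "$rate"
    done
}
```

For example, `print_record_tests plughw:2,0 S16_LE 16000 32000 48000` prints three lines, one per rate.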

patrickreidglennon commented 8 years ago

I'm wondering whether the Google speech API is working at the moment, or whether it now requires a key or something. I'm also getting no translations, even though the noise.wav files are clear and perfect.

ripleyXLR8 commented 8 years ago

It seems that Google doesn't accept stereo recordings.

A solution is to use the following in speechrecognition.sh:

arecord -D $hardware -f S16_LE -d $duration -r 16000 | flac - -f --best --sample-rate 16000 -o /dev/shm/out.flac 1>/dev/shm/voice.log 2>/dev/shm/voice.log;
curl -X POST --data-binary @/dev/shm/out.flac --user-agent 'Mozilla/5.0' --header 'Content-Type: audio/x-flac; rate=16000;' "https://www.google.com/speech-api/v2/recognize?output=json&lang=$lang&key=AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw&client=Mozilla/5.0" |
    sed -e 's/[{}]/''/g' | awk -F":" '{print $4}' | awk -F"," '{print $1}' | tr -d '\n'

and to recompile voicecommand.cpp with the following modification to the GetVolume function:

inline float GetVolume(string recordHW, string com_duration, bool nullout) {
    FILE *cmd;
    float vol = 0.0f;
    // Record a short S16_LE sample at 16 kHz into /dev/shm/noise.wav.
    string run = "arecord -D ";
    run += recordHW;
    run += " -f S16_LE -d ";
    run += com_duration;
    run += " -r 16000 /dev/shm/noise.wav";
    if(nullout)
        run += " 1>>/dev/shm/voice.log 2>>/dev/shm/voice.log";
    system(run.c_str());
    // Read the "Max level" value that sox reports for the sample.
    cmd = popen("sox /dev/shm/noise.wav -n stats -s 16 2>&1 | awk '/^Max\\ level/ {print $3}'","r");
    fscanf(cmd,"%f",&vol);
    pclose(cmd); // a stream opened with popen() is closed with pclose(), not fclose()
    return vol;
}
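The volume check inside GetVolume can be tried from the shell without recompiling anything. A minimal sketch, assuming the sox `stats` output contains a line that starts with "Max level", as the awk pattern in the function expects; `max_level_from_stats` is a hypothetical name.

```shell
# Extract the "Max level" value from sox stats output read on stdin.
# This mirrors the awk filter used in GetVolume.
# max_level_from_stats is a hypothetical helper, not part of PiAUISuite.
max_level_from_stats() {
    awk '/^Max level/ { print $3 }'
}

# Intended use, after recording /dev/shm/noise.wav:
#   sox /dev/shm/noise.wav -n stats -s 16 2>&1 | max_level_from_stats
```

If this prints nothing for a recording that plays back fine, the problem is more likely in the sox step than in the microphone.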

galuhboy123 commented 8 years ago

Where can I find voicecommand.cpp? Is it part of the PiAUISuite setup, and do we have to recompile that file? Thanks for the help, @ripleyXLR8.

krist-jin commented 8 years ago

The answer of @ripleyXLR8 works for me!

How to fix:

  1. Go to the VoiceCommand folder.
  2. Edit speechrecognition.sh as @ripleyXLR8 described above (basically, change "-f cd -t wav" to "-f S16_LE").
  3. Edit voicecommand.cpp as @ripleyXLR8 described above (again, change "-f cd -t wav" to "-f S16_LE").
  4. sudo apt-get install g++-4.8
  5. make
  6. Go to the install folder and run sudo ./InstallAUISuite.sh
  7. Change the keyword in the configuration to something else, because "Pi" is easily recognized as "pie". Then you are good to go!

Why: I think the problem is that the Google speech API somehow doesn't support multi-channel recordings and returns an empty result for them. "arecord -f cd" is equivalent to "arecord -f S16_LE -c2 -r44100", which means it records two channels. If you set it to a single channel, it works.

You can run an experiment to confirm this. First, record with

arecord -D plughw:1,0 -f cd -t wav -r 16000 test.wav

and call the speech API with

curl -X POST --data-binary @'test.wav' --header 'Content-Type: audio/l16; rate=16000;' 'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw'

Then record with

arecord -D plughw:1,0 -f S16_LE -r 16000 test.wav

and call the Google API again. You will find that the second way works but the first returns nothing.

However, I have no idea why google does not support multi-channel...
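To see how many channels a recording actually has before sending it, the WAV header can be read directly. This is a sketch that assumes the canonical 44-byte PCM header, in which the 16-bit little-endian channel count sits at byte offset 22; `wav_channels` is a hypothetical helper name.

```shell
# Print the channel count stored in a WAV header.
# Assumes the canonical 44-byte PCM header (channel count at offset 22).
# wav_channels is a hypothetical helper, not part of PiAUISuite.
wav_channels() {
    # Read the two bytes as unsigned decimals: low byte first, then high byte.
    set -- $(od -An -tu1 -j22 -N2 "$1")
    # Fail if the file was too short to contain both bytes.
    [ $# -eq 2 ] || return 1
    echo $(( $1 + $2 * 256 ))
}
```

Under that assumption, `wav_channels test.wav` should print 2 for a file recorded with `-f cd`, and 1 for one recorded with `-f S16_LE` and no `-c` option, since arecord defaults to one channel.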

Colin1964 commented 8 years ago

@krist-jin

I tried your step-by-step version of the @ripleyXLR8 fix, but TTS is still not working for me. Did this fix only work before Google stopped its TTS service?

krist-jin commented 8 years ago

@Colin1964 Sorry for the confusion, but this fix is for the Google speech recognition API, not the text-to-speech API. I just posted another fix for TTS in #56; hope it helps.

Colin1964 commented 8 years ago

@krist-jin My bad. I was having both TTS and STT problems and was looking at this issue and #56, but it's all sorted now (see the update on #56). Well, almost: she can translate what I say and will speak back to me, but I still can't get her to respond to the keyword.

pinftv commented 8 years ago

@krist-jin After I recompile voicecommand and install it again, running voicecommand gives the error message "Illegal instruction". I need some help.

krist-jin commented 8 years ago

@pinftv Can you share what you ran and the full error message?

pinftv commented 8 years ago

@krist-jin I followed your instructions, but running voicecommand gave me "Illegal instruction". I have solved it, though: instead of changing to S16_LE, I added -c 1 after -r 16000, and now it works normally :)
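That alternative edit can be applied as a text filter rather than by hand. A minimal sketch, assuming the arecord calls in the script contain the literal text `-r 16000`; `force_mono` is a hypothetical name, and the result should be reviewed before it replaces the installed script.

```shell
# Append "-c 1" after every "-r 16000" in the text read from stdin,
# without doubling the flag when it is already there.
# force_mono is a hypothetical helper, not part of PiAUISuite.
force_mono() {
    sed -E 's/-r 16000( -c 1)?/-r 16000 -c 1/g'
}

# Intended use (writes a new file and leaves the original untouched):
#   force_mono < speechrecognition.sh > speechrecognition.sh.new
```

The pattern does not touch flac's `--sample-rate 16000`, because it only matches `-r` followed by a space.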

HangLoooose commented 8 years ago

I followed all the instructions given by @krist-jin. Now speech-recog.sh can translate my voice into the correct text, and tts says whatever you want it to say. But as soon as I start voicecommand and say my keyword, the following lines appear:

Found audio
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25258    0    14  100 25244      9  16760  0:00:01  0:00:01 --:--:-- 16762
No translation

Can anyone help me to fix this issue?
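The curl progress line above shows only 14 bytes received. That would fit an empty reply such as `{"result":[]}` plus a newline, though the actual body is not shown, so this is an assumption. The sketch below wraps the parsing stage of the pipeline quoted earlier in this thread in a function, so a saved response can be fed through it. `extract_transcript` is a hypothetical name, and the sample responses in the comments are made up to illustrate the shape.

```shell
# Parsing stage of the recognition pipeline quoted earlier in this thread:
# strip braces, take the 4th ":"-separated field, keep the text before the
# first comma, and drop newlines.
# extract_transcript is a hypothetical helper, not part of PiAUISuite.
extract_transcript() {
    sed -e 's/[{}]//g' | awk -F":" '{print $4}' | awk -F"," '{print $1}' | tr -d '\n'
}

# Hypothetical responses:
#   {"result":[]}   yields nothing, which voicecommand reports as "No translation"
#   {"result":[{"alternative":[{"transcript":"hello world","confidence":0.9}],"final":true}],"result_index":0}
#                   yields "hello world" (with the quotes)
```

Saving the raw curl output to a file and passing it through this function shows whether the service returned an empty result or the parsing step lost the text.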

siddharthksuri commented 7 years ago

This thread is a lifesaver. I incorporated the fixes mentioned by @ripleyXLR8 and clarified by @krist-jin, and voice recognition (i.e. speech to text) now works. I'll try to get TTS working over the next week, following #56.

I'm using a Pi 3 running Raspbian Jessie and a cheap $2 mic from Amazon.

derrapf commented 7 years ago

Hi, maybe somebody can help me as well. I'm using a Pi 3 with Raspbian. When I run voicecommand -s, at some point it asks me whether I heard a sound. Unfortunately, I don't. Then I tried speech-recog.sh and only got the following output:

Aufnahme: WAVE 'stdin' : Unsigned 8 bit, Rate: 16000 Hz, mono
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 21618    0    14  100 21604      9  15385  0:00:01  0:00:01 --:--:-- 15398

I found that the flac conversion does not work for some reason. When I run

arecord -D "plughw:0,0" -t wav -d 3 -r 16000 test.wav

I get

arecord: main:722: Fehler beim Öffnen des Gerätes: Datei oder Verzeichnis nicht gefunden

which means "Error opening device: File or directory not found". I then tried

arecord -t wav -d 3 -r 16000 test.wav

and got a nice recording that I can play with aplay. There is some hum, but I can live with it.

Then I tried

arecord -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o /dev/shm/out.flac

but when I try to play the result with aplay /dev/shm/out.flac, I hear only terrible white noise.

Now I have no idea how to fix that.

Any help appreciated Rlaf
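White noise from `aplay /dev/shm/out.flac` does not by itself show that the conversion failed: aplay plays WAV and raw-style files and does not decode FLAC, so the compressed bytes come out as noise. A minimal sketch of a helper that prints a suitable playback command for each file type; `playback_cmd` is a hypothetical name, and it only prints the command rather than playing anything.

```shell
# Print a playback command suited to the file type.
# A FLAC file is decoded to WAV on stdout first, then piped into aplay.
# playback_cmd is a hypothetical helper, not part of PiAUISuite.
playback_cmd() {
    case "$1" in
        *.flac) printf 'flac -d -c "%s" | aplay\n' "$1" ;;
        *)      printf 'aplay "%s"\n' "$1" ;;
    esac
}
```

For example, `playback_cmd /dev/shm/out.flac` prints `flac -d -c "/dev/shm/out.flac" | aplay`.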

disheet commented 7 years ago

For me it gives this error. Please help.

pi@raspberrypi:~/PiAUISuite/VoiceCommand $ sudo voicecommand -c
Opening config file...
running in continuous mode
keyword duration is 2 and duration is 3
Found audio
arecord: main:556: unrecognized file format S16_LE
Warning: Couldn't read data from file "/dev/shm/out.flac", this makes an empty
Warning: POST.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    14    0    14    0     0     11      0 --:--:--  0:00:01 --:--:--    11
rm: cannot remove ‘/dev/shm/out.flac’: No such file or directory
No translation
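The "unrecognized file format S16_LE" message suggests that S16_LE ended up after arecord's `-t` option, which selects the file type (such as wav or raw), rather than after `-f`, which selects the sample format. That is a guess from the message alone, since the edited script is not shown. A small sketch for locating such lines; `find_format_mixups` is a hypothetical name.

```shell
# Report lines where a sample format was passed to arecord's -t option.
# -t takes a file type; the sample format belongs after -f.
# find_format_mixups is a hypothetical helper, not part of PiAUISuite.
find_format_mixups() {
    grep -n -E -e '-t +S16_LE' "$@"
}

# Intended use:
#   find_format_mixups speechrecognition.sh voicecommand.cpp
```

Any line it reports is a candidate for changing `-t S16_LE` to `-f S16_LE`.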

disheet commented 7 years ago

Do you want to permanently change the default duration of the speech recognition (3 seconds)? (y/n)
n
Do you want to permanently change the default command duration of the speech recognition (2 seconds)? (y/n)
y
Type the number of seconds you want it to run: ex 2
2
Do you want to set up and check the text to speech options? (y/n)
y
First I'm going to say something and see if you hear it
/usr/bin/tts: line 13: /dev/shm/speak.mp3: Permission denied
/dev/shm/tmp.mp3: Permission denied
/usr/bin/tts: line 40: /dev/shm/speak.mp3: Permission denied
/usr/bin/tts: line 42: /dev/shm/voice.log: Permission denied

Running voicecommand -s gives the errors shown above.