alexylem / jarvis

Jarvis.sh is a simple configurable multi-lang assistant.
http://openjarvis.com
MIT License

PLAY failed when too much background noise with snowboy recorder #487

Open alexylem opened 7 years ago

alexylem commented 7 years ago

Description

When there is too much background noise, the snowboy recorder stops after 10 seconds and a blocking error message exits Jarvis.

This happens because the timeout "kills" the snowboy recorder while it is still waiting for a silence that will never come, so there is no wav to play. The "sox" recorder handles this today by stopping by itself after 5 seconds of voice (the 10-second timeout is for silence). The Jarvis program (not part of the recorder) then checks that the wav duration is below 4 seconds (the duration of a normal command) and, if it is not, says that there is too much background noise.
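As a rough illustration of the duration check described above, here is a minimal sketch. It is not Jarvis's actual code (Jarvis core is a shell program); the function names and the `max_command_seconds` parameter are hypothetical, and only Python's standard `wave` module is used.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def too_noisy(path, max_command_seconds=4.0):
    """Hypothetical check: a recording longer than a normal command
    suggests the recorder never heard the closing silence."""
    return wav_duration_seconds(path) > max_command_seconds
```

Note that this check can only run if a wav file exists, which is exactly what fails when the recorder is killed before writing one.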

@guillaumef adding you as you are the expert. I think the solution is to add a maximum number of voice ticks corresponding to 5 seconds. If you don't have the time, I can implement it.

Result

See end of below recording: https://asciinema.org/a/2vb3z8lmt5oa9pyyld5uonb1p

utils/timeout.sh 10 python  recorders/snowboy/main.py 5 /tmp/jarvis-record.wav 
____||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||DEBUG: stop_listening hook
play FAIL formats: can't open input file `/tmp/jarvis-record.wav': No such file or directory
ERROR: play command failed
HELP: Verify your speaker in Settings > Audio > Speaker

DEBUG: program_exit hook
bash-3.2$ exit

Note: here the file is in /tmp because I am testing on my Mac, which has no /dev/shm folder.

alexylem commented 7 years ago

@guillaumef I added you as a collaborator on the project. You can now do everything! (Please do not delete the master branch...) Please push/PR your changes to the new beta branch. I will take care of the merge to master.

guillaumef commented 7 years ago

Hi @alexylem, I understand the problem, but I think there are two tracks to follow here. Let me try to explain.

Since the recorders are now modular, the error management (wav ok, too much noise, error in audio layer, etc.) coming back from any recorder could be unified with classic error codes. The timeout for noise could be a constraint on the recorder layer, leaving each recorder free to handle it as it needs: a raw kill, a clean shutdown, or anything else. So I have the feeling that this 4 sec/5 sec trick, which is specific to sox, should migrate into the sox recorder layer. The check is done for the sox recorder because it is the only way to validate the wav output. For the other recorder layers, it would also save Jarvis 'core' from validating the wav coming from them. Jarvis 'core' should trust its own recorder layers to do the job.
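A minimal sketch of what unified recorder return codes might look like on the calling side. The code names and numeric values are assumptions made for illustration; the thread does not define them, and `run_recorder` is a hypothetical helper.

```python
import enum
import subprocess


class RecorderExit(enum.IntEnum):
    # Hypothetical values: the thread does not specify the actual codes.
    OK = 0           # wav written and valid
    TOO_NOISY = 1    # voice never ended
    NO_VOICE = 2     # silence timeout reached
    AUDIO_ERROR = 3  # error in the audio layer


def describe(returncode):
    """Translate a recorder exit code into a symbolic name."""
    try:
        return RecorderExit(returncode).name
    except ValueError:
        return "UNKNOWN"


def run_recorder(argv):
    """Run any recorder command and report its outcome by name."""
    return describe(subprocess.call(argv))
```

With such a contract, the core would only branch on the result and would not need to inspect the wav itself.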

If this first track is ok for you, the snowboy recorder could implement its own mechanism to match the 5-second constraint. The ticks coming back from snowboy are not a reliable 'clock', and I think their duration changes slightly from one platform to another. The 30-millisecond sleep is triggered when the snowboy layer returns "no-data", but how much time is spent in the snowboy layer and how long it takes to get a raw block depends on the hardware, I think (platform, mic, ...). For managing ticks and the data pattern we are looking for (Silence/Voice/...), ticks are good enough and let us map the pattern strictly to the snowboy processing, but I don't think they are reliable for managing a real timeout. A real timeout is quite simple to do with a classic time check in the snowboy loop in wavget.py. If the voice has not started (pre-silence not reached, voice==0) and the timeout is reached, it can exit properly (close device, ...) with the right error code. No kill. Peace and Love ;-) And no need to check any wav with sox in Jarvis 'core' anymore.
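The time check proposed above might be sketched as follows. This is not the actual wavget.py loop: `read_state` is a hypothetical stand-in for one pass through the snowboy layer, the exit code values are assumptions, and the 5-second cap on voice is added here to cover the noisy-room case from the issue description. It also simplifies by treating the first silence after voice as the end of the sentence.

```python
import time

# Hypothetical exit codes, not defined in the thread.
EXIT_OK = 0
EXIT_TOO_NOISY = 1
EXIT_NO_VOICE = 2


def record(read_state, silence_timeout_s=10.0, max_voice_s=5.0,
           clock=time.monotonic, sleep=time.sleep):
    """Loop until a sentence ends or a wall-clock limit is reached.

    read_state() returns 'voice', 'silence', or None when no data is ready.
    """
    start = clock()
    voice_start = None
    while True:
        state = read_state()
        now = clock()
        if state is None:
            sleep(0.03)  # the 30 ms pause on "no data"
        elif state == "voice":
            if voice_start is None:
                voice_start = now
        elif state == "silence" and voice_start is not None:
            return EXIT_OK  # silence after voice: sentence complete
        if voice_start is None and now - start >= silence_timeout_s:
            return EXIT_NO_VOICE  # voice never started
        if voice_start is not None and now - voice_start >= max_voice_s:
            return EXIT_TOO_NOISY  # voice never stopped
```

Because the limits are measured with a monotonic clock rather than by counting ticks, they do not depend on how long each pass through the audio layer takes. The caller can close the device cleanly after any of the three return values.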

alexylem commented 7 years ago

Couldn't agree more. I initially implemented the timeout because sox had no way to stop by itself if there was no voice within 10 secs. So let's keep the timeout for sox and define global return codes for each case, as you suggest. I will share that soon, once it is implemented in the sox recorder.

DomoAI commented 7 years ago

Hi,

I installed Jarvis last night and I have exactly the same problem. Is the solution to modify the snowboy loop in wavget.py? I will take a look tonight.

guillaumef commented 7 years ago

The problem I had with sox was erratic triggering of the sentence without waiting for the end. With snowboy wavget, if the room is too noisy or has permanent noise, the solution should be to simply lower the 'gain' of the microphone. There are two gains: the snowboy gain, whose minimum is '1' and which can be increased (it is configured via the jarvis interface), and the gain of the microphone in alsamixer. The goal is to detect a whole sentence, which is silence + voice + silence. The last silence triggers the wav return. You can check whether the gain level fits with these commands:

$ ./jarvis.sh -q
$ python -d recorders/snowboy/main.py 1 track
...
__________||||||||____

The '|' characters are voice triggers.
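To make the silence + voice + silence pattern concrete, here is a small sketch that checks a printed trace such as the one above, where '_' marks silence and '|' marks voice. The function name is hypothetical and this is not how wavget.py detects the pattern; it also ignores pauses between words, which a real detector would have to tolerate.

```python
import re

# One run of silence, one run of voice, one run of silence.
SENTENCE = re.compile(r"^_+\|+_+$")


def is_complete_sentence(trace):
    """True if the trace is silence, then voice, then silence."""
    return bool(SENTENCE.match(trace))
```

A trace that ends while still in voice, as in the noisy recording from the issue description, does not match.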

To manage gain on microphone with alsa:

$ alsamixer -V capture

You need a properly configured asoundrc, here is mine (with the ctl.!default):

$ cat ~/.asoundrc 
pcm.!default {
  type asym
   playback.pcm {
     type plug
     slave.pcm { type hw card PCH }
   }
   capture.pcm {
     type plug
     slave.pcm { type hw card AK5371 }
   }
}
ctl.!default {
    type hw card AK5371
}

The AK5371 is specific to my hardware. Using the card name is more reliable than using the id. You can get the name here:

# cat /proc/asound/cards
 0 [HDMI           ]: HDA-Intel - HDA Intel HDMI
                      HDA Intel HDMI at 0xaa134000 irq 48
 1 [PCH            ]: HDA-Intel - HDA Intel PCH
                      HDA Intel PCH at 0xaa130000 irq 46
 3 [AK5371         ]: USB-Audio - AK5371
                      AKM AK5371 at usb-0000:00:14.0-4.1.2, full speed

DomoAI commented 7 years ago

Thank you, I will take a look tonight. For me it was as if Jarvis was in a constant loop. On the first start it worked; then I said "Jarvis, quel heure est il" ("Jarvis, what time is it") and it gave me the time, but it also played back the last recording, "Jarvis, quel heure est il", and so on. Then I exited the program, and when I restarted it I got the can't open input file `/tmp/jarvis-record.wav' error. I will try again tonight or tomorrow and let you know exactly which error I get.

alexylem commented 7 years ago

@DomoAI the "snowboy" recorder is still in alpha version. Please use "sox" in the meantime.

alexylem commented 7 years ago

@guillaumef this makes me think I could make hardware selection much simpler by using the name instead of the id. Why is the "ctl.!default" needed? I will log a separate ticket for this.

guillaumef commented 7 years ago

The ctl.!default lets alsamixer switch to the right card directly. You can also reach it with:

$ alsamixer -c AK5371 -V capture

Perhaps it could be embedded in the jarvis.sh config? Once you have the card name (you already have the card selector with id, as I remember), you can launch alsamixer to configure the gain of the microphone at the alsa level (which I think is better than overriding the gain at the snowboy level).

On my HTPC, I have a Teac DAC connected over USB 2.0 and an A/V amplifier connected over HDMI. These two devices are powered on only when I am in stereo and video mode, respectively. This means devices pop in and out of the alsa cards list and can take the alsa ID of the microphone (AK5371), which is why IDs are not very reliable. The card name is "bullet proof".
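As an illustration of selecting by name, the sketch below resolves card names to their current index from the text of /proc/asound/cards, in the format shown earlier in this thread. The function name and regular expression are assumptions for illustration, not part of Jarvis.

```python
import re

# Matches lines such as " 3 [AK5371         ]: USB-Audio - AK5371".
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[(\S+)\s*\]:")


def parse_cards(text):
    """Map each ALSA card name to its current index."""
    cards = {}
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            cards[match.group(2)] = int(match.group(1))
    return cards
```

On Linux, it could be fed with `open("/proc/asound/cards").read()`. The index found for a given name may differ between boots, which is the reason for looking it up by name each time.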

alexylem commented 7 years ago

@guillaumef I was using alsamixer to set the gain before. But some cards (including my PSEye) simply cannot be managed by alsamixer:

$> alsamixer -c CameraB409241
cannot load mixer controls: Invalid argument

I get the same error if I change the card from within alsamixer. See https://github.com/alexylem/jarvis/issues/26#issuecomment-231948328. For this reason I decided not to keep alsamixer, as I want Jarvis to be usable in all situations and to limit exceptions as much as possible. The ticket for speaker/mic selection by name is logged here: https://github.com/alexylem/jarvis/issues/496

DomoAI commented 7 years ago

Hello, here is the debug output. I see my locale is not set correctly. Also, it repeats everything I say. I changed the recorder to "sox".

------------ Config ------------
jv_branch            master 
jv_version           17.03.19 
jv_arch              x86_64 
jv_os_name           ubuntu 
jv_os_version        16.04 
language             fr_FR 
play_hw              hw:2,1 
rec_hw               hw:1,0 
speaker               
microphone           Microsoft Corp.  
recorder             sox 
trigger_stt          snowboy 
command_stt          bing 
tts_engine           svox_pico 
--------------------------------

DEBUG: program_startup hook
Alfred: Hello
DEBUG: start_speaking hook
DEBUG: stop_speaking hook
User defined commands:
*AIDE*          *BONJOUR*|*SALUT*   *COMMENT*APPELLE*
*MERCI*         *AU REVOIR*|*BYE*   ANNULE*|TERMINE*
ENCORE*         *TEST*          *VERSION*
*REPETE (*) ET (*)  *CA VA*         >*OUI*
>*NON*|*PAS*
Commands from plugin jarvis-bruitages:
*FAIS (*)       QUE*BRUITAGE*       
Commands from plugin jarvis-math:
*CALCUL* (*)
Commands from plugin jarvis-time:
*QUELLE HEURE*      *QUEL JOUR*
Commands from plugin jarvis-weather-wunderground-fr:
*METEO*DEMAIN*      *METEO*
Alfred: Waiting to hear 'Alfred'
master: (listening...)
DEBUG: models=alfred
INFO:snowboy:Ticks: [2, 20, 5, -1]
INFO:snowboy:Keyword 1 detected at time: 2017-03-25 10:57:46
INFO:snowboy:Ticks status: 2 3 1 1
DEBUG: modelid=0
Alfred
DEBUG: entering_cmd hook
master: (listening...)
DEBUG: start_listening hook
utils/timeout.sh 10 rec -V1 -q -r 16000 -c 1 -b 16 -e signed-integer --endian little /dev/shm/jarvis-record.wav gain 24 silence 1 0.1 5 1 0.5 5 trim 0 5
DEBUG: speech duration was 02 (10 = 1 sec)
DEBUG: stop_listening hook
DEBUG: curl https://speech.platform.bing.com/recognize/query?version=3.0&requestid=eadfc8b9-d11c-4644-9625-b0e3611a6331&appid=D4D52672-91D7-4C74-8AD8-42B1D98141A5&format=json&locale=fr-FR&device.os=linux&scenarios=ulm&instanceid=E043E4FE-51EF-4B74-8133-B728C4FEA8AA&result.profanitymarkup=0
DEBUG: json={"version":"3.0","header":{"status":"error","properties":{"requestid":"0f13f28b-3b84-47ce-90e0-e2865b92c32f","NOSPEECH":"1"}}}
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_PAPER = "nl_NL.UTF-8",
    LC_ADDRESS = "nl_NL.UTF-8",
    LC_MONETARY = "nl_NL.UTF-8",
    LC_NUMERIC = "nl_NL.UTF-8",
    LC_TELEPHONE = "nl_NL.UTF-8",
    LC_IDENTIFICATION = "nl_NL.UTF-8",
    LC_MEASUREMENT = "nl_NL.UTF-8",
    LC_CTYPE = "UTF-8",
    LC_TIME = "nl_NL.UTF-8",
    LC_NAME = "nl_NL.UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
?(listening...)
DEBUG: start_listening hook
utils/timeout.sh 10 rec -V1 -q -r 16000 -c 1 -b 16 -e signed-integer --endian little /dev/shm/jarvis-record.wav gain 24 silence 1 0.1 5 1 0.5 5 trim 0 5
DEBUG: stop_listening hook
play FAIL formats: can't open input file `/dev/shm/jarvis-record.wav': WAVE: RIFF header not found
ERROR: play command failed
HELP: Verify your speaker in Settings > Audio > Speaker

DEBUG: program_exit hook

alexylem commented 7 years ago

@DomoAI please do not pollute this ticket, which is for the snowboy recorder. Please open a new ticket.

DomoAI commented 7 years ago

I know, I was using snowboy but I still get the same error. Please take a look at the log.

alexylem commented 7 years ago

I looked at your logs, you were using:

recorder             sox 

Although the bottom line is the same, your problem has nothing to do with this ticket. Please log another ticket.

DomoAI commented 7 years ago

I have the same problem with snowboy. Sorry if this is the wrong place. I will post the log with snowboy.

alexylem commented 7 years ago

Yes, it is the wrong place (this ticket is for a very specific issue). Please log a new ticket and post the logs with sox & snowboy in it. Thanks. I will delete all your posts here to clean it up.