nyumaya / nyumaya_audio_recognition

Classify audio with neural nets on embedded systems like the Raspberry Pi
https://nyumaya.com
Apache License 2.0

streaming_example.py: arecord: main:788: audio open error: No such file or directory #2

Closed torntrousers closed 5 years ago

torntrousers commented 5 years ago

Dusted off an old Pi Zero and I'm trying to run the streaming_example.py example, but it fails with:

python streaming_example.py --libpath ../lib/rpi/armv6/libnyumaya.so
Audio Recognition Version: 0.0.2
arecord: main:788: audio open error: No such file or directory

I'm using an I2S mic and I've been through the setup described at https://learn.adafruit.com/adafruit-i2s-mems-microphone-breakout/raspberry-pi-wiring-and-test#raspberry-pi-i2s-configuration and that successfully records sound from the mic when doing arecord -D plughw:1 -c1 -r 48000 -f S32_LE -t wav -V mono -v file.wav

Any ideas? Is there some extra config somewhere I need to get streaming_example.py to work?

yodakohl commented 5 years ago

The example currently only captures audio from the default device. You can find the corresponding code in record.py at line 119: self.input_device = 'default'. You can either change your ALSA configuration to make plughw:1 the default device or change that line to self.input_device = 'plughw:1'. Also note that the example expects an output device for playing a "ding" when the keyword is detected. This is done in streaming_example.py at line 34: os.system("aplay ./ding.wav"). You can comment this line out if you get an audio output error.
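As a rough illustration of the device change described above, the sketch below builds an arecord command line with a selectable capture device. The helper names build_arecord_cmd and open_capture and their defaults are hypothetical and are not taken from record.py, which may be structured differently. Only the arecord flags themselves (-D, -c, -r, -f, -t) are standard.

```python
import subprocess


def build_arecord_cmd(device="default", rate=16000, channels=1,
                      sample_format="S16_LE"):
    """Return an arecord command that streams raw PCM to stdout.

    Hypothetical helper for illustration only.
    """
    return [
        "arecord",
        "-D", device,         # ALSA capture device, e.g. 'default' or 'plughw:1'
        "-c", str(channels),  # number of channels
        "-r", str(rate),      # sample rate in Hz
        "-f", sample_format,  # sample format
        "-t", "raw",          # raw PCM without a WAV header
    ]


def open_capture(device="default"):
    """Start arecord; PCM bytes can then be read from the returned proc.stdout."""
    return subprocess.Popen(build_arecord_cmd(device), stdout=subprocess.PIPE)
```

Calling open_capture("plughw:1") would then capture from the second sound card instead of the ALSA default.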

torntrousers commented 5 years ago

Thanks for the quick reply. OK, that fixes that problem and python streaming_example.py doesn't get any errors now. Sadly I can't get it to detect any speech though. By default it's using models/Hotword/sheila_big.tflite, right? So I should say 'sheila'? Are there ways to debug what's not working?

yodakohl commented 5 years ago

I just realized that sheila_big is still the default. It's at the limit of what the Pi Zero can do, so it would be better to change that to sheila_small.tflite.

yodakohl commented 5 years ago

Also the default sensitivity was incorrectly set to 0.05 instead of 0.5. I pushed the changes to the repository.

torntrousers commented 5 years ago

Thanks. But still nothing.

yodakohl commented 5 years ago

Are you sure you're actually capturing audio? Can you record a file with arecord -D plughw:1 -c1 -r 16000 -f S16_LE -t wav -V mono -v file.wav and listen to it? What volume reading are you getting?
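One way to get a volume reading from the recorded file without listening to it is to compute its peak and RMS level. The following is a minimal sketch using only the Python standard library. The function name wav_levels is made up for this example, and it assumes a 16-bit (S16_LE) WAV file such as the one produced by the arecord command above.

```python
import array
import math
import sys
import wave


def wav_levels(path):
    """Return (peak, rms) of a 16-bit PCM WAV file as fractions of full scale."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples (S16_LE)")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0
    return peak, rms
```

A peak and RMS of exactly 0.0 means the file contains only silence, which points at the capture setup rather than at the recognizer.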

torntrousers commented 5 years ago

Right, no, that doesn't work and just records silence. It only works if I change it to use -f S32_LE.

torntrousers commented 5 years ago

Would that be the mic (it's https://www.adafruit.com/product/3421) or some driver setting or something?

yodakohl commented 5 years ago

The mic seems fine, so this will be a driver issue. If your Raspbian version is up to date you don't have to compile the driver anymore. It's sufficient to edit /boot/config.txt, add the dtoverlay, and disable the default audio by commenting out the dtparam line:

dtoverlay=googlevoicehat-soundcard

#dtparam=audio=on

Move any /etc/asound.conf or /home/pi/.asoundrc to a backup (for example /etc/asound.conf.bak) and then reboot.

The output of arecord -l should show

List of CAPTURE Hardware Devices
card 0: sndrpigooglevoi [snd_rpi_googlevoicehat_soundcar], device 0: Google voiceHAT SoundCard HiFi voicehat-hifi-0 []
  Subdevices: 1/1
  Subdevice #0: subdevice #0

This will automatically be the default input device. The input will be very quiet, but everything should work out of the box. If the recognition rate is not very good, it can help to set the gain in the streaming example to something like 8.

yodakohl commented 5 years ago

You probably also have to remove the 'my_loader' line from /etc/modules (for example with vi /etc/modules) and reboot. Otherwise the driver you compiled earlier might interfere.

torntrousers commented 5 years ago

You're a star, it works and recognises 'marvin' now!

torntrousers commented 5 years ago

Hi again. This is working, but I need to be close to the mic or speak very loudly. If I record with arecord -D plughw:0 -c1 -r 16000 -f S16_LE -t wav -V mono -v file.wav and play that back, it's really quiet. I've tried adding hotword_detector.SetGain(9) but it doesn't seem to make any difference. Do you know if there is some way to increase the mic sensitivity?

yodakohl commented 5 years ago

Ok, I see the problem. The gain parameter currently isn't used internally. I'm fixing this.

yodakohl commented 5 years ago

Changes are pushed and you will need to compile the lib again. I did a quick test and it seems OK, but I will do some more testing tomorrow. I will also try to write some calibration code that finds a suitable gain.

torntrousers commented 5 years ago

Thanks for this. Isn't the low mic level the root of the problem, though, and shouldn't I try to fix that? Googling, there seem to be a lot of people complaining about low levels with the SPH0645 mic on Pis. I found this https://docs.nyumaya.com/manuals/zenbu-manual#verify-that-the-sound-card-is-recognized but haven't got that to work yet. Would a different mic make a difference? What mic do you use? I also found this https://iot.stackexchange.com/questions/2193/good-microphone-for-whole-room-without-internet and wonder about trying the PS Eye as they're quite cheap. Do you know if they would work?

yodakohl commented 5 years ago

I'm using my own hat with ICS43434 mics. The signal from the mic is digital, so increasing the volume in ALSA has exactly the same effect as applying a gain in software. I'm not a fan of the SPH0645 because it gave me a DC offset which led to clipping at higher gains.

I have a PS Eye and can try how well it works.

How loud do you have to talk? My mics are recording at a similar volume to the SPH0645 and I can trigger the hotword by almost whispering across the room. Maybe you can record a short sample so I can understand better what's going on here?
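The point above about a DC offset causing clipping at higher gains can be shown with a few lines of Python. This is only a toy sketch with made-up sample values, not code from the library. It applies a software gain to 16-bit samples and counts how many of them hit the limits of the 16-bit range.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def apply_gain(samples, gain):
    """Multiply samples by gain and clip the result to the signed 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, int(round(s * gain)))) for s in samples]


def clipped_count(samples):
    """Count samples that sit exactly on the 16-bit limits."""
    return sum(1 for s in samples if s in (INT16_MIN, INT16_MAX))


# A quiet signal swinging +/-1000 around zero, and the same signal
# riding on an (illustrative) DC offset of 3000.
centered = [1000, -1000, 1000, -1000]
with_offset = [s + 3000 for s in centered]

print(clipped_count(apply_gain(centered, 9)))     # 0: peaks at +/-9000 fit easily
print(clipped_count(apply_gain(with_offset, 9)))  # 2: 4000 * 9 = 36000 exceeds 32767
```

The gain multiplies the offset along with the signal, so the headroom runs out long before the actual speech would be loud enough to clip.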

torntrousers commented 5 years ago

I've been looking for ICS43434s, but they seem hard to come by, so I guess I should buy your hat.

This is a recording of me a few metres away from the mic (renamed to .txt to attach here, does that work?): file.wav.txt

yodakohl commented 5 years ago

I amplified the signal with Audacity and the DC offset is pretty big. I will do some quick tests to determine if this is the root cause of the problem. There might still be a software fix possible by removing the DC offset in software.

[Attached image: knowlesmic]

yodakohl commented 5 years ago

Removing the DC offset gave a big improvement in accuracy. I'm trying to find an efficient way to do this in software.

yodakohl commented 5 years ago

OK, changes are pushed and the lib for the Pi0 is updated as well. You can enable the option to remove the DC offset by adding detector.RemoveDC(True) to the streaming_example. It's not going to be a perfect solution but it should work reasonably well.
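How RemoveDC works internally isn't described in this thread. For readers who want to experiment on their own recordings, one common and cheap way to remove a DC offset is a one-pole DC-blocking (high-pass) filter. The sketch below is that generic technique, not the library's implementation. The function name remove_dc and the coefficient r=0.995 are assumptions chosen for illustration.

```python
def remove_dc(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    r close to 1.0 removes only very low frequencies; smaller values
    settle faster but cut more of the low end.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

The filter needs only one multiplication and two additions per sample, which suits a Pi Zero. The output starts with a short transient that decays as the filter settles, and for streaming use prev_x and prev_y would have to be kept between audio chunks.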

torntrousers commented 5 years ago

Amazing, thank you! I can't try it till later as I'm working at a different location today.

torntrousers commented 5 years ago

Works really well now. With the gain turned up a bit and RemoveDC enabled, speaking quite quietly across the room can switch the light on and off. Really impressed!