Open JasperHorn opened 7 years ago
If you have some code to contribute, a PR would be appreciated. Unfortunately, it seems that not many people are experiencing this issue and probably no one is working on it. Thank you.
I managed to get it working for my device (at least for the Alexa service, not the trigger word, but I'm not sure if that is related to the conversion at all). I just haven't gotten around to making it configurable instead of hacking it in with a constant. When I do, I'll make a PR.
Great! When you do, please base it on the latest dev (the stuff has moved into the alexapi.capture module, etc.).
@JasperHorn Any progress on this? We might actually need resampling for #180.
Not yet. I'll see if I can get to it this week. If I don't (or you want it sooner) I can provide a fragment of code that should make it easy to get this done (it's really just one line of code).
@JasperHorn Please provide the code, so I can play around with it :-)
Here's my full diff: samplerate.diff.txt. The part with the keyword detection can be ignored, since I tried a number of different things there but never got it working.
It's surprisingly simple actually:
fragment, ratecv_state = audioop.ratecv(data, 1, 1, output_rate, input_rate, ratecv_state)
ratecv_state just tracks the internal state of the conversion and should be initialized to None. fragment will contain the resampled audio data.
audioop is from the standard library, so you don't need to install anything, just import it.
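For anyone trying this, here is a minimal sketch of the same idea applied to a stream of captured chunks. It is not the code from the diff: the constants, `resample_stream` and `demo` are made up for illustration, and 16-bit mono input at 44.1 kHz is an assumption. Note that the documented argument order of `audioop.ratecv` is fragment, width, nchannels, inrate, outrate, state, so it is worth checking which variable holds which rate. Also, `audioop` was deprecated in Python 3.11 and removed in 3.13, so on newer interpreters the import will fail.

```python
import math
import struct

try:
    import audioop  # standard library up to Python 3.12; removed in 3.13
except ImportError:
    audioop = None  # a replacement would be needed on newer interpreters

INPUT_RATE = 44100   # rate the microphone delivers (assumed)
OUTPUT_RATE = 16000  # rate the Alexa service expects
SAMPLE_WIDTH = 2     # bytes per sample, i.e. 16-bit PCM (assumed)
CHANNELS = 1         # mono (assumed)


def resample_stream(chunks, in_rate=INPUT_RATE, out_rate=OUTPUT_RATE):
    """Yield resampled chunks, carrying the ratecv state from one chunk to the next."""
    state = None  # None tells ratecv to start a fresh conversion
    for chunk in chunks:
        fragment, state = audioop.ratecv(
            chunk, SAMPLE_WIDTH, CHANNELS, in_rate, out_rate, state
        )
        yield fragment


def demo():
    """Resample one second of a 440 Hz tone and return the output sample count."""
    samples = [int(10000 * math.sin(2 * math.pi * 440 * n / INPUT_RATE))
               for n in range(INPUT_RATE)]
    raw = struct.pack("%dh" % len(samples), *samples)
    step = 4410 * SAMPLE_WIDTH  # feed 100 ms at a time, like a capture loop
    chunks = [raw[i:i + step] for i in range(0, len(raw), step)]
    out = b"".join(resample_stream(chunks))
    return len(out) // SAMPLE_WIDTH
```

Calling `demo()` should return roughly 16000, since one second of 44.1 kHz input becomes one second of 16 kHz output. Keeping `state` between calls is what avoids discontinuities at chunk boundaries.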
@JasperHorn Nice, thanks!
I got a little more experience with the audio stuff. What device name do you use? ALSA devices with names starting with plughw: get conversion in the ALSA plug module automatically. See arecord -L for that - our list doesn't contain that unfortunately :-(
I am using plughw:1. I'm not sure if I follow what you're saying about that, though.
Hmmm. Please send your recording.wav made with current dev (w/o your resampling code) and with this device name in the config.
recording_16k.wav.txt recording_44.1k.wav.txt
Just remove the .txt; I added it to get around GitHub's upload restrictions. The number in the filename is what the argument to inp.setrate() was equal to.
@JasperHorn Thanks. Just played them and they are both 16 kHz and fine.
When you are on dev and trigger AlexaPi with a keyboard, button, or something, your speech isn't recognized by Amazon? Do you have anything at https://alexa.amazon.com ? (recognized entries, request history)
The patch file in this thread is out of date, but I was able to get AlexaPi working with my USB headset adapter that only supports 44.1 and 48 kHz by forcing ALSA to use the plug resampler globally:
/etc/asound.conf:
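# Route the default PCM through the plug plugin so ALSA resamples as needed
pcm.!default {
    type plug
    slave {
        pcm "hw:1,0"
    }
}
# The control device points at the card directly ("plug" is a PCM plugin type)
ctl.!default {
    type hw
    card 1
}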
While hotword detection doesn't work too great out of the box with this setup (maybe pocketsphinx is bothered by resampling artifacts or something), it drastically improves if you use Snowboy instead.
My microphone, which only supports sampling rates from 44.1 kHz to 48 kHz, cannot be used with AlexaPi.
The Alexa web service only supports 16kHz and webrtcvad supports 8kHz, 16kHz and 32kHz, so in order to support sampling rates other than 16kHz, a conversion would have to be done.
Currently, it is just assumed that the microphone supports 16kHz sampling, and the result (if it doesn't) is that the sampling rate is mislabeled, no voice is recognized and the whole thing just silently fails.
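To illustrate why mislabeling breaks recognition, here is a small arithmetic sketch (the function name is hypothetical). Audio captured at one rate but labeled with another plays back stretched and pitch-shifted by the ratio of the two rates.

```python
def mislabeled_duration(true_seconds, capture_rate, labeled_rate):
    """Duration a consumer perceives when capture_rate audio is labeled as labeled_rate."""
    samples = true_seconds * capture_rate  # samples actually recorded
    return samples / labeled_rate          # seconds they span at the claimed rate


# One second captured at 44.1 kHz but labeled as 16 kHz
stretch = mislabeled_duration(1.0, 44100, 16000)
print(round(stretch, 3))  # 2.756
```

So one second of speech would be interpreted as about 2.76 seconds of audio, with every frequency lowered by the same factor, which is consistent with no voice being recognized.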