alphacep / vosk-android-demo

Offline speech recognition for Android with Vosk library.
Apache License 2.0
717 stars 189 forks

Sampling rate #118

Open sidra718 opened 3 years ago

sidra718 commented 3 years ago

Hello,

I am using AudioManager in Android Studio to capture sound from a Bluetooth headset instead of my tablet's internal microphone. However, the following restrictions apply to input streams:

When I change the sampling rate from 16 kHz to 8 kHz when defining the model and speechService, the app crashes. Do you have a solution that might help?

OscarVanL commented 3 years ago

This will be because the model is expecting (and trained on) 16kHz speech.

You could try upsampling the input from 8kHz to 16kHz in the app (I'd have no idea how), find a model trained on 8kHz (again, I don't know where from), or train one yourself.
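
For anyone landing here later, the upsampling option can be sketched roughly as follows. This is only a minimal illustration, assuming 16-bit mono PCM samples already read into a short[] buffer. The class and method names (Upsampler, upsample2x) are made up for this example and are not part of Vosk or Android. Linear interpolation is a crude stand-in for a proper resampler with a low-pass filter, so recognition quality may suffer.

```java
// Hypothetical helper, not part of Vosk or the Android SDK.
final class Upsampler {
    private Upsampler() {}

    /**
     * Doubles the sample rate of 16-bit mono PCM (e.g. 8 kHz -> 16 kHz)
     * by inserting the average of each pair of neighbouring samples.
     */
    static short[] upsample2x(short[] input) {
        short[] output = new short[input.length * 2];
        for (int i = 0; i < input.length; i++) {
            int current = input[i];
            // The last sample has no successor, so its value is simply held.
            int next = (i + 1 < input.length) ? input[i + 1] : current;
            output[2 * i] = (short) current;
            output[2 * i + 1] = (short) ((current + next) / 2);
        }
        return output;
    }
}
```

The resulting buffer would then be passed to a recognizer created for 16 kHz in place of the raw 8 kHz data. That most likely means running your own recording loop rather than relying on speechService, which is an assumption on my part and untested.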

sidra718 commented 3 years ago

Yes, I figured that out. I am also not sure how to upsample the input or find a model trained on 8kHz. I don't have a problem with the model, however. I just want to send my voice message through a Bluetooth headset instead of the internal microphone of my tablet.

OscarVanL commented 3 years ago

Sure, but unless you can receive input from your Bluetooth headset at 16kHz you're going to need to do either of those things. And I'm not sure either, I've not had to do it.

mmende commented 1 year ago

You can add --allow-upsample=true in conf/mfcc.conf inside the model directory. This way 8kHz should also be accepted by the recognizer.

To do that, it might be possible to simply unzip the AAR archive and adjust conf/mfcc.conf accordingly. You might also need to regenerate the MD5 checksum (md5 assets/model-en-us/conf/mfcc.conf) and update the value inside sync/model-android/conf/mfcc.conf.md5.
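
If the md5 command-line tool is not available, the checksum can also be computed with a few lines of Java. This is just a sketch using the standard MessageDigest API. Md5Util and md5Hex are names invented for this example, and the path passed to it would be wherever your edited mfcc.conf lives.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for recomputing a file checksum.
final class Md5Util {
    private Md5Util() {}

    /** Returns the MD5 digest of the given bytes as lowercase hex. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Convenience overload that hashes the contents of a file. */
    static String md5Hex(Path file) throws IOException {
        return md5Hex(Files.readAllBytes(file));
    }
}
```

The exact format expected inside the .md5 file is not stated in this thread, so it is worth comparing against the existing file's contents before overwriting it.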