Open nanaghartey opened 7 months ago
First, great repo! I checked out the Android ASR demo and it seems it can only transcribe from the mic source. Can it transcribe wav files too (for example, a recorded interview or meeting)? Most ASR frameworks support this feature (e.g. the OpenAI Whisper API, Vosk, etc.). It would be very useful if sherpa-onnx for mobile could accept wav files and output a transcript of the provided audio as text, with timestamps at the segment level, the word level, or both. This would enable precise wav-file transcripts.
We support that using the C++, Python, and other APIs, but not on Android and iOS.
Is there any reason to use Android to do that task? Where does the wave file come from?
Can you implement it on Android/iOS, on-device? There are many use cases. Wave files come from various sources (third-party apps, built-in recorders, etc.). In my case, I'm doing video dubbing with sherpa-onnx, but because it does not accept wave files I get stuck at the transcription step.
It works with vosk-android, but the accuracy of the Vosk models is poor.
I upload an mp4 and convert it to wav, then transcribe the wav file, retrieving all segments with timestamps; I then translate the text and superimpose it back onto the video (see the sketch after this comment).
In other apps, wav-file transcriptions are used for audio analysis and similar tasks.
Adding such a feature would be very beneficial, especially for small developers who can't afford server costs.
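To illustrate the timestamp-retrieval step of that pipeline, here is a minimal sketch that turns one Vosk-style JSON result (word timings enabled; see the Vosk discussion below) into a subtitle line. toSubtitleLine is a hypothetical helper name, and org.json is the JSON library bundled with Android:

import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

// Hypothetical helper: turn one timestamped result (Vosk-style JSON) into a
// subtitle line. With word timings enabled, the "result" array holds one
// object per word, with "start"/"end" times in seconds and the spoken "word".
static String toSubtitleLine(String resultJson) throws JSONException {
    JSONObject obj = new JSONObject(resultJson);
    if (!obj.has("result")) return "";  // empty or partial result
    JSONArray words = obj.getJSONArray("result");
    double start = words.getJSONObject(0).getDouble("start");
    double end = words.getJSONObject(words.length() - 1).getDouble("end");
    return String.format("[%.2f - %.2f] %s", start, end, obj.getString("text"));
}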
It works with vosk-android
Could you post the URL for vosk-android that can accept wave files?
Here you go: https://github.com/alphacep/vosk-android-demo
Set rec.setWords(true) on the recognizer to get word-level start and end times:

import org.vosk.Model;
import org.vosk.Recognizer;

Recognizer rec = new Recognizer(model, 16000.0f);
rec.setWords(true);
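For completeness, a minimal sketch of decoding a whole wav file this way, using the Vosk Java API from the demo linked above. The paths ("model", "audio.wav") are placeholders, the input is assumed to be 16 kHz mono 16-bit PCM, and the fixed 44-byte skip assumes a canonical wav header (real files may carry extra chunks):

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.vosk.Model;
import org.vosk.Recognizer;

public class WavTranscribe {
    public static void main(String[] args) throws IOException {
        try (Model model = new Model("model");                   // placeholder model dir
             InputStream in = new FileInputStream("audio.wav");  // 16 kHz mono 16-bit PCM
             Recognizer rec = new Recognizer(model, 16000.0f)) {
            rec.setWords(true);  // include per-word start/end times in the JSON results
            in.skip(44);         // skip a canonical 44-byte wav header; raw PCM follows
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) >= 0) {
                if (rec.acceptWaveForm(buf, n)) {
                    System.out.println(rec.getResult());  // a finalized segment, as JSON
                }
            }
            System.out.println(rec.getFinalResult());     // flush any trailing words
        }
    }
}

Each getResult() string can then be fed to something like the toSubtitleLine helper sketched earlier to build timestamped subtitles.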
@csukuangfj any updates on this?