Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
Whisper Language ID: Output all languages and their probabilities #1475
Open
marchellodev opened 2 hours ago
Currently, Whisper-based language identification outputs only the single language with the highest probability:
(from @sherpa-rs)
It would be great if sherpa could output the full result: all languages and their associated probabilities.
Faster-whisper, for example, can do this: https://github.com/SYSTRAN/faster-whisper/blob/c2a1da1bd94e002c38487c91c2f6b50a048000cf/faster_whisper/transcribe.py#L1764
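To illustrate the requested output shape, here is a minimal sketch in plain Python. It assumes the model exposes one raw score (logit) per language token; the function name `language_probabilities` and the example logits are hypothetical and are not part of the sherpa-onnx or sherpa-rs API.

```python
import math


def language_probabilities(logits):
    """Turn per-language logits into (language, probability) pairs.

    `logits` is a hypothetical dict mapping a language code to its raw score.
    The result is sorted from most to least probable.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits.values())
    exps = {lang: math.exp(score - peak) for lang, score in logits.items()}
    total = sum(exps.values())
    return sorted(
        ((lang, value / total) for lang, value in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Made-up logits, for illustration only.
example_logits = {"en": 4.2, "de": 2.1, "fr": 1.3, "uk": 0.4}
ranked = language_probabilities(example_logits)

for lang, prob in ranked:
    print(f"{lang}: {prob:.3f}")
```

The current behaviour corresponds to returning only `ranked[0]`; the request is to return the whole list.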