Supported functions
| Speech recognition | Speech synthesis |
|---|---|
| ✔️ | ✔️ |

| Speaker identification | Speaker diarization | Speaker verification |
|---|---|---|
| ✔️ | ✔️ | ✔️ |

| Spoken language identification | Audio tagging | Voice activity detection |
|---|---|---|
| ✔️ | ✔️ | ✔️ |

| Keyword spotting | Add punctuation |
|---|---|
| ✔️ | ✔️ |
Supported platforms
| Architecture | Android | iOS | Windows | macOS | Linux |
|---|---|---|---|---|---|
| x64 | ✔️ | | ✔️ | ✔️ | ✔️ |
| x86 | ✔️ | | ✔️ | | |
| arm64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| arm32 | ✔️ | | | | ✔️ |
| riscv64 | | | | | ✔️ |
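The platform matrix above can be read as a simple lookup table. The following is a minimal sketch, not part of sherpa-onnx, that encodes the table and checks whether an OS/architecture pair is listed. The names `SUPPORTED`, `ARCH_ALIASES`, `OS_ALIASES`, `normalize_arch`, `is_supported`, and `current_platform` are hypothetical, and the alias mappings from Python's `platform` module strings to the table's row and column names are assumptions that may not cover every system.

```python
import platform

# Support matrix copied from the "Supported platforms" table.
SUPPORTED = {
    "x64": {"Android", "Windows", "macOS", "Linux"},
    "x86": {"Android", "Windows"},
    "arm64": {"Android", "iOS", "Windows", "macOS", "Linux"},
    "arm32": {"Android", "Linux"},
    "riscv64": {"Linux"},
}

# Assumption: common platform.machine() strings mapped to the table's row names.
ARCH_ALIASES = {
    "x86_64": "x64", "amd64": "x64",
    "i386": "x86", "i686": "x86",
    "aarch64": "arm64",
    "armv7l": "arm32", "armv6l": "arm32",
}

# Assumption: platform.system() strings mapped to the table's column names.
OS_ALIASES = {"Darwin": "macOS"}


def normalize_arch(machine: str) -> str:
    """Map a raw machine string to a row name of the table, if an alias is known."""
    key = machine.lower()
    return ARCH_ALIASES.get(key, key)


def is_supported(os_name: str, arch: str) -> bool:
    """Return True if the (OS, architecture) pair has a check mark in the table."""
    return os_name in SUPPORTED.get(normalize_arch(arch), set())


def current_platform() -> tuple:
    """Best-effort guess of the (OS, architecture) pair for the running interpreter."""
    system = platform.system()
    return OS_ALIASES.get(system, system), normalize_arch(platform.machine())
```

For example, `is_supported(*current_platform())` reports whether the machine running the script appears in the table. A pair that is not listed only means the table has no check mark for it.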
Supported programming languages
| 1. C++ | 2. C | 3. Python | 4. JavaScript |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |

| 5. Java | 6. C# | 7. Kotlin | 8. Swift |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |

| 9. Go | 10. Dart | 11. Rust | 12. Pascal |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |
For Rust support, please see sherpa-rs.
sherpa-onnx also supports WebAssembly.
Introduction
This repository supports running the following functions locally
- Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
- Text-to-speech (i.e., TTS)
- Speaker diarization
- Speaker identification
- Speaker verification
- Spoken language identification
- Audio tagging
- VAD (e.g., silero-vad)
- Keyword spotting
on the platforms and operating systems listed under "Supported platforms" above, with the following APIs:
- C++, C, Python, Go, C#
- Java, Kotlin, JavaScript
- Swift, Rust
- Dart, Object Pascal
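The speech-to-text item above distinguishes streaming from non-streaming recognition. The difference lies mainly in how audio reaches the recognizer: a non-streaming recognizer receives a whole utterance at once, while a streaming recognizer consumes small chunks as they arrive. The sketch below illustrates only that input pattern in plain Python. It does not call the sherpa-onnx API, `chunk_samples` is a hypothetical helper, and the 16 kHz sample rate and 100 ms chunk length are assumptions chosen for illustration.

```python
from typing import Iterator, List


def chunk_samples(samples: List[float], sample_rate: int, chunk_ms: int) -> Iterator[List[float]]:
    """Yield consecutive chunks of `chunk_ms` milliseconds; the last chunk may be shorter."""
    chunk_size = sample_rate * chunk_ms // 1000
    if chunk_size <= 0:
        raise ValueError("chunk_ms is too small for the given sample rate")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


# One second of silence at an assumed sample rate of 16 kHz.
SAMPLE_RATE = 16000
audio = [0.0] * SAMPLE_RATE

# Non-streaming: the recognizer would be given the full utterance in one call.
non_streaming_input = [audio]

# Streaming: the recognizer would be fed 100 ms chunks one after another.
streaming_input = list(chunk_samples(audio, SAMPLE_RATE, chunk_ms=100))
```

In a real application each chunk would be passed to a streaming recognizer as soon as it is captured, so partial results can be shown before the speaker finishes.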
Links for Huggingface Spaces
You can visit the following Huggingface spaces to try sherpa-onnx without installing anything. All you need is a browser.
We also have spaces built using WebAssembly. They are listed below:
Links for pre-built Android APKs
Links for pre-built Flutter APPs
Real-time speech recognition
| Description | URL | Users in China |
|---|---|---|
| Streaming speech recognition | Address | Click here |
Text-to-speech
Note: You need to build from source for iOS.
Links for pre-built Lazarus APPs
Generating subtitles
| Description | URL | Users in China |
|---|---|---|
| Generate subtitles | Address | Click here |
Links for pre-trained models
Useful links
How to reach us
Please see https://k2-fsa.github.io/sherpa/social-groups.html for the Next-gen Kaldi WeChat and QQ discussion groups.
Projects using sherpa-onnx
Streaming ASR and TTS based on FastAPI
It shows how to use the ASR and TTS Python APIs with FastAPI.
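It uses streaming ASR in C# with a graphical user interface.
Video demo in Chinese: 【开源】Windows实时字幕软件(网课/开会必备) ("[Open source] Real-time subtitle software for Windows, a must-have for online classes and meetings")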
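It uses the JavaScript API of sherpa-onnx along with Electron.
Video demo in Chinese: 爆了!炫神教你开打字挂!真正影响胜率的英雄联盟工具!英雄联盟的最后一块拼图!和游戏中的每个人无障碍沟通! ("It blew up! Xuanshen teaches you to use a typing hack! The League of Legends tool that really affects your win rate! The last piece of the League of Legends puzzle! Communicate without barriers with everyone in the game!")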