k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, and Flutter.
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0
2.03k stars, 261 forks
Topics: aarch64, android, arm32, asr, cpp, csharp, dotnet, ios, linux, macos, mfc, onnx, openkylin, raspberry-pi, risc-v, speech-to-text, text-to-speech, vits, windows

Supported functions

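| Speech recognition | Speech synthesis | Speaker verification | Speaker identification |
|--------------------|------------------|----------------------|------------------------|
| ✔️ | ✔️ | ✔️ | ✔️ |

| Spoken language identification | Audio tagging | Voice activity detection | Keyword spotting |
|--------------------------------|---------------|--------------------------|------------------|
| ✔️ | ✔️ | ✔️ | ✔️ |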

Supported platforms

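| Architecture | Android | iOS | Windows | macOS | Linux |
|--------------|---------|-----|---------|-------|-------|
| x64 | ✔️ | | ✔️ | ✔️ | ✔️ |
| x86 | ✔️ | | ✔️ | | |
| arm64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| arm32 | ✔️ | | | | ✔️ |
| riscv64 | | | | | ✔️ |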

Supported programming languages

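| C++ | C | Python | C# | Java | JavaScript | Kotlin | Swift | Go | Dart |
|-----|---|--------|----|------|------------|--------|-------|----|------|
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |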

It also supports WebAssembly.

Introduction

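This repository supports running the following functions locally: speech recognition, speech synthesis, speaker verification, speaker identification, spoken language identification, audio tagging, voice activity detection, and keyword spotting.

It runs on the architectures and operating systems listed under Supported platforms, with APIs for the languages listed under Supported programming languages.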

Links for pre-built Android APKs

| Description | URL | Users in China |
|-------------|-----|----------------|
| Streaming speech recognition | Address | Here |
| Text-to-speech | Address | Here |
| Voice activity detection (VAD) | Address | Here |
| VAD + non-streaming speech recognition | Address | Here |
| Two-pass speech recognition | Address | Here |
| Audio tagging | Address | Here |
| Audio tagging (WearOS) | Address | Here |
| Speaker identification | Address | Here |
| Spoken language identification | Address | Here |
| Keyword spotting | Address | Here |

Links for pre-built Flutter apps

| Description | URL | Users in China |
|-------------|-----|----------------|
| Streaming speech recognition | Address | Here |

Links for pre-trained models

| Description | URL |
|-------------|-----|
| Speech recognition (speech-to-text, ASR) | Address |
| Text-to-speech (TTS) | Address |
| VAD | Address |
| Keyword spotting | Address |
| Audio tagging | Address |
| Speaker identification (Speaker ID) | Address |
| Spoken language identification (Language ID) | See the multilingual Whisper ASR models under Speech recognition |
| Punctuation | Address |

Useful links

How to reach us

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the next-gen Kaldi WeChat and QQ discussion groups.