k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0
3.7k stars, 431 forks

fast-beam-search support for sherpa-onnx #347

Open uni-manjunath-ke opened 1 year ago

uni-manjunath-ke commented 1 year ago

I found that fast-beam-search decoding is not currently supported in sherpa-onnx. Is support planned for the future? If so, when can it be expected (timeline)?

Specifically, do you plan to support fast-beam-search-with-lg? Currently, fast-beam-search-with-lg is not supported by zipformer/streaming-decode.py. Thanks.

csukuangfj commented 1 year ago

fast_beam_search depends on k2. If you want to use fast_beam_search, please use k2-fsa/sherpa.

We are going to support HLG decoding in sherpa-onnx. If you want to use a lexicon and/or an n-gram LM in decoding, then you may find HLG decoding interesting.

Please also have a look at https://github.com/k2-fsa/icefall/pull/1275

uni-manjunath-ke commented 1 year ago

Sure, thanks. Yes, we are interested in HLG. Hoping to see the HLG code at https://github.com/k2-fsa/icefall/pull/1275 merged soon and available for use.

uni-manjunath-ke commented 1 year ago

Hi @csukuangfj, I just wanted to confirm whether the HLG code at https://github.com/k2-fsa/icefall/pull/1275 is only for CTC, or whether it can be used with zipformers as well. We are interested in Zipformer models for now.

uni-manjunath-ke commented 1 year ago

Currently, I see in https://github.com/k2-fsa/icefall/pull/1275 that HLG support has been added to icefall. When can we expect this support to be ported to sherpa-onnx? Or has it already been ported? Thanks.

csukuangfj commented 1 year ago

> Currently, I see in k2-fsa/icefall#1275 that HLG support is added for ICefall, When can we expect this support to be ported for sherpa-onnx? or Is that already ported to Sherpa-onnx? Thanks

Please see https://github.com/k2-fsa/sherpa-onnx/pull/349

We will finish it in two weeks.


> if the HLG code at https://github.com/k2-fsa/icefall/pull/1275 is only for CTC

Yes, you are right. It is only for CTC.


> it can be used with zipformers as well

Zipformer is a kind of neural network, while CTC is a kind of loss function. They are two different things. If you train a Zipformer using CTC loss, then you can use HLG decoding with it. If you train a Zipformer using transducer loss, then you cannot use HLG decoding with it.
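
To illustrate the distinction: a model trained with CTC loss emits one score vector per frame, and those scores do not depend on previously emitted tokens, which is what allows a fixed decoding graph such as HLG to be searched over them. The following is a minimal sketch in plain Python that uses no sherpa-onnx, icefall, or k2 calls. The function name, the blank ID of 0, and the toy scores are assumptions made for illustration, and simple greedy decoding stands in for the graph search that HLG decoding would perform.

```python
from typing import List, Sequence

BLANK_ID = 0  # assumption: the CTC blank token has ID 0


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = BLANK_ID
) -> List[int]:
    """Toy CTC greedy decoding over per-frame token scores.

    Each frame is scored independently of earlier outputs. A graph-based
    decoder would search the same per-frame scores under lexicon and LM
    constraints instead of taking the per-frame argmax.
    """
    # Pick the highest-scoring token ID for every frame.
    best_per_frame = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]

    # Collapse consecutive repeats, then drop blanks.
    tokens: List[int] = []
    previous = None
    for token in best_per_frame:
        if token != previous and token != blank_id:
            tokens.append(token)
        previous = token
    return tokens


if __name__ == "__main__":
    # Five frames over a toy vocabulary of three tokens (0 is blank).
    scores = [
        [0.1, 0.8, 0.1],
        [0.1, 0.8, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.8, 0.1],
        [0.1, 0.1, 0.8],
    ]
    print(ctc_greedy_decode(scores))
```

A transducer model, by contrast, conditions each output on the tokens emitted so far, so there is no standalone sequence of per-frame scores for the graph to be applied to.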

csukuangfj commented 1 year ago

> Currently, I see in k2-fsa/icefall#1275 that HLG support is added for ICefall, When can we expect this support to be ported for sherpa-onnx? or Is that already ported to Sherpa-onnx? Thanks

@uni-manjunath-ke

https://github.com/k2-fsa/sherpa-onnx/pull/349

The C++ part is usable now.

uni-manjunath-ke commented 1 year ago

Thank you.