k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0

Hi! Is there any example implementation of streaming for this model: https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 #178

Closed. Caet-pip closed this issue 1 year ago.

Caet-pip commented 1 year ago

Hi, I saw this model trained for streaming: https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04. Is there any example implementation, or is it the same as implementing the sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model?

Are the steps used for that model reproducible for the icefall libri-giga model?

csukuangfj commented 1 year ago

Please see https://github.com/k2-fsa/icefall/pull/984 if you want to train it by yourself.

If you want to use it in sherpa-onnx, please follow https://k2-fsa.github.io/icefall/model-export/export-onnx.html to export the model. You can find export-onnx.py from https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx.py

Caet-pip commented 1 year ago

Thanks a lot! Can this be used for real-time streaming with a microphone, like the sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model?

I had previously tried icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04 for streaming, but its output is not real-time streaming and it requires input to start and stop the recording. The output quality is really good, however.

Please let me know if I am approaching this in the right way, thanks in advance!

csukuangfj commented 1 year ago

Can this be used for real-time streaming with a microphone, like the sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model?

Yes, you can.

The URL name contains streaming, so you can use it for streaming purposes.
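
The naming rule of thumb above can be captured in a small helper for sanity-checking a model before wiring it into the microphone example. This is only a sketch: `is_streaming_model` is a hypothetical name, and the substring heuristic is just the convention described in this thread, not an official sherpa-onnx API.

```python
def is_streaming_model(name: str) -> bool:
    """Heuristic from this thread: streaming exports carry "streaming" in
    their name/URL, while offline (non-streaming) exports do not."""
    name = name.lower()
    return "streaming" in name and "offline" not in name


# The two models discussed in this thread:
print(is_streaming_model(
    "icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04"))  # True
print(is_streaming_model(
    "icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04"))      # False
```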

Caet-pip commented 1 year ago

Okay, I had used the icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04 model with speech-recognition-from-microphone.py in sherpa-onnx for real-time streaming transcription, without the need to press enter to start/stop recording. When I replace the encoder, decoder, and joiner with the multidataset model, I get the following error:

(base) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % python3 ./python-api-examples/speech-recognition-from-microphone.py \
    --tokens=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/data/lang_bpe_500/tokens.txt \
    --encoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/encoder-epoch-30-avg-4.onnx \
    --decoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/decoder-epoch-30-avg-4.onnx \
    --joiner=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/joiner-epoch-30-avg-4.onnx

0 MacBook Pro Microphone, Core Audio (1 in, 0 out)
< 1 MacBook Pro Speakers, Core Audio (0 in, 2 out)
Use default device: MacBook Pro Microphone
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-zipformer-transducer-model.cc:InitEncoder:99 encoder_dims does not exist in the metadata

Is this because the model was not made for real-time streaming inference?

csukuangfj commented 1 year ago

Is this because the model was not made for real-time streaming inference?

The reason is that you didn't export the model correctly.

Could you describe how you exported the model in detail?

Caet-pip commented 1 year ago

I am using this model https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#icefall-asr-multidataset-pruned-transducer-stateless7-2023-05-04-englis

This is already exported to be used with ONNX right?

I wanted to use it with the speech-recognition-from-microphone.py Python file, so I replaced the command given in the example

cd /path/to/sherpa-onnx

python3 ./python-api-examples/speech-recognition-from-microphone.py \
    --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \
    --encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \
    --decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \
    --joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx

with the icefall command:

(base) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % python3 ./python-api-examples/speech-recognition-from-microphone.py \
    --tokens=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/data/lang_bpe_500/tokens.txt \
    --encoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/encoder-epoch-30-avg-4.onnx \
    --decoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/decoder-epoch-30-avg-4.onnx \
    --joiner=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/joiner-epoch-30-avg-4.onnx

I exported the model as described in the website

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/yfyeung/icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04
cd icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp
git lfs pull --include "*.onnx"

But I want to use the model with the speech-recognition-from-microphone.py file for real-time ASR, without having to press the 'enter' key.

The icefall model is a transducer model, right? I wanted to use it with speech-recognition-from-microphone.py.

csukuangfj commented 1 year ago

But I want to use the model with the speech-recognition-from-microphone.py file for real time ASR without having to press 'enter' key

Previously, you were asking

Hi I saw this model trained for streaming https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 is there any example implementation

so I thought you wanted to use it for streaming recognition, and I replied yes:

Yes, you can. The URL name contains streaming, so you can use it for streaming purpose.

But now you are switching to a different model:

https://huggingface.co/yfyeung/icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04

The above model is not a streaming model (there is no streaming in the URL, so it is a non-streaming model); thus you cannot use it for streaming purposes.

csukuangfj commented 1 year ago

I am using this model https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#icefall-asr-multidataset-pruned-transducer-stateless7-2023-05-04-englis

There is an offline in the URL, which means it is a non-streaming model.

Caet-pip commented 1 year ago

Okay got it, sorry for the miscommunication. Thanks for clearing it up.

I wanted a model for real-time streaming ASR, and the icefall-asr-multidataset model was very good; hence I asked about that.

As you said that the icefall-libri-giga-pruned-transducer-stateless7-streaming model is for streaming, I will use it for my purpose. But as the model hasn't been exported to ONNX, I wanted to know if the other icefall model could be used.

I was looking at ways to export the icefall-libri-giga-pruned model to ONNX, but I wanted a solution in the meantime; hence I started looking at other models. My bad for not noticing the offline.

Could you let me know of any ONNX model that is as good as the icefall model and that I can use for real-time streaming immediately? Thanks!

csukuangfj commented 1 year ago

As you said that the icefall-libri-giga-pruned-transducer-stateless7-streaming model is for streaming I will use it for my purpose. But as the model hasn't been exported as onnx

Then please export it by yourself by following https://github.com/k2-fsa/sherpa-onnx/issues/178#issuecomment-1595996343

If you have any issues during export, we can help you.

Caet-pip commented 1 year ago

Okay sure, thanks I will do that

I wanted to know, do I need to install Icefall for exporting model to ONNX?

csukuangfj commented 1 year ago

Okay sure, thanks I will do that

I wanted to know, do I need to install Icefall for exporting model to ONNX?

Yes, please follow the icefall installation doc to set up the environment.

Caet-pip commented 1 year ago

Do I need to follow step (0) Install CUDA toolkit and cuDNN for running icefall on a macOS ARM chip?

csukuangfj commented 1 year ago

No, you don't have to.

You can install a CPU version of PyTorch and k2 for exporting models from icefall.

Caet-pip commented 1 year ago

I followed the steps to install icefall: first I installed PyTorch CPU, then k2 using requirements.txt, and lhotse, all in a virtual environment. But after installing icefall with its requirements and trying to run the test, I get this error:

(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./prepare.sh
2023-06-19 13:47:01 (prepare.sh:27:main) dl_dir: /Users/fawazahamedshaik/icefall/egs/yesno/ASR/download
2023-06-19 13:47:01 (prepare.sh:30:main) Stage 0: Download data
/Users/fawazahamedshaik/icefall/egs/yesno/ASR/download/waves_yesno.tar.gz: 100%|██████████████████████████████████████████████| 4.70M/4.70M [00:01<00:00, 2.42MB/s]
2023-06-19 13:47:13 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-19 13:47:14 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/yesno/ASR/./local/compute_fbank_yesno.py", line 18, in <module>
    from icefall.utils import get_executor
ModuleNotFoundError: No module named 'icefall'

danpovey commented 1 year ago

You are probably inside some virtual environment where icefall is not installed? Anyway, it is not really necessary to install it; in your case you can just do: export PYTHONPATH=$PYTHONPATH:/Users/fawazahamedshaik/icefall (or put it in your .bashrc, or edit your virtual environment's activate script, or whatever). That is more convenient anyway in case you change the code or change the branch; no installation is needed.
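
The same effect as the `export PYTHONPATH=...` line can be sketched from inside Python, before any icefall import. The clone location below is an assumption (the path from this thread maps to `~/icefall`); adjust it to wherever you cloned the repo.

```python
import os
import sys

# Point Python at the icefall checkout instead of installing the package.
# This mirrors: export PYTHONPATH=$PYTHONPATH:/Users/fawazahamedshaik/icefall
icefall_root = os.path.expanduser("~/icefall")  # assumed clone location; adjust
if icefall_root not in sys.path:
    sys.path.insert(0, icefall_root)

# After this, `import icefall` resolves against the checkout (if it exists
# there), which also picks up any code/branch changes without reinstalling.
```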

Caet-pip commented 1 year ago

Hello, I did the export PYTHONPATH=$PYTHONPATH:/Users/fawazahamedshaik/icefall and this solved the 'not installed' issue, but I am getting a new issue when running ./prepare.sh

logs:

(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % export PYTHONPATH=$PYTHONPATH:/Users/fawazahamedshaik/icefall
(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./prepare.sh
2023-06-19 15:12:54 (prepare.sh:27:main) dl_dir: /Users/fawazahamedshaik/icefall/egs/yesno/ASR/download
2023-06-19 15:12:54 (prepare.sh:30:main) Stage 0: Download data
2023-06-19 15:12:54 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-19 15:12:57 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/yesno/ASR/./local/compute_fbank_yesno.py", line 18, in <module>
    from icefall.utils import get_executor
  File "/Users/fawazahamedshaik/icefall/icefall/__init__.py", line 3, in <module>
    from . import (
  File "/Users/fawazahamedshaik/icefall/icefall/decode.py", line 20, in <module>
    import k2
  File "/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/k2/__init__.py", line 23, in <module>
    from _k2 import DeterminizeWeightPushingType
ImportError: dlopen(/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so, 0x0002): Symbol not found: __ZN2at4_ops10select_int4callERKNS_6TensorExx
  Referenced from: /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so
  Expected in: <89972BE7-3028-34DA-B561-E66870D59767> /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/torch/lib/libtorch_cpu.dylib

Is this related to PyTorch?

csukuangfj commented 1 year ago

What is the output of

python3 -m torch.utils.collect_env

Caet-pip commented 1 year ago

This is the output

(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 2.0.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.3.1 (x86_64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.26.4
Libc version: N/A

Python version: 3.9.13 (main, Aug 25 2022, 18:29:29) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU: Apple M2

Versions of relevant libraries:
[pip3] k2==1.24.3.dev20230619+cpu.torch2.0.1
[pip3] numpy==1.22.4
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchvision==0.15.2
[conda] k2 1.24.3.dev20230619+cpu.torch1.13.1 pypi_0 pypi
[conda] mkl 2023.1.0 h59209a4_43558
[conda] mkl-service 2.4.0 py39h6c40b1e_1
[conda] numpy 1.22.4 pypi_0 pypi
[conda] numpydoc 1.5.0 py39hecd8cb5_0
[conda] pytorch 1.13.1 cpu_py39h9e40b02_0
[conda] pytorch-lightning 1.9.4 pypi_0 pypi
[conda] pytorch-wpe 0.0.1 pypi_0 pypi
[conda] torch-complex 0.4.3 pypi_0 pypi
[conda] torchaudio 0.13.1 pypi_0 pypi
[conda] torchmetrics 0.11.4 pypi_0 pypi
[conda] torchvision 0.14.0 pypi_0 pypi

csukuangfj commented 1 year ago

Please read the output carefully.

You have installed two versions of k2, each of which is compiled with a different version of PyTorch, i.e., torch 1.13.1 and torch 2.0.1.

Please don't do that.

Please make sure there is only one version of k2 in your current environment.
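
The mismatch is visible directly in the k2 version strings above, which embed the PyTorch version each wheel was compiled against. A hedged sketch of checking this (the parsing is ad hoc, based only on the version strings shown in this thread; `torch_version_in_k2_tag` is a hypothetical helper, not a k2 API):

```python
import re


def torch_version_in_k2_tag(k2_version):
    """Extract the PyTorch version a k2 wheel was built against, e.g.
    '1.24.3.dev20230619+cpu.torch2.0.1' -> '2.0.1'. Returns None if the
    version string carries no torch tag."""
    m = re.search(r"torch([\d.]+)", k2_version)
    return m.group(1) if m else None


# The two k2 builds from the collect_env output above: one expects
# torch 2.0.1, the other torch 1.13.1, so only one of them can work.
print(torch_version_in_k2_tag("1.24.3.dev20230619+cpu.torch2.0.1"))   # 2.0.1
print(torch_version_in_k2_tag("1.24.3.dev20230619+cpu.torch1.13.1"))  # 1.13.1
```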

csukuangfj commented 1 year ago

I suggest that if you're not familiar with conda, you switch to pip install and don't use conda install. Most, if not all, users who are having issues are using conda.

Caet-pip commented 1 year ago

Okay, I will uninstall the one installed with conda. I cannot see icefall in the installed libraries; is that fine?

Caet-pip commented 1 year ago

I followed the installation guide on the icefall website and only used pip. I also deactivated the conda base env, but I still get this when I run ./prepare.sh:

(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./prepare.sh
2023-06-19 22:28:06 (prepare.sh:27:main) dl_dir: /Users/fawazahamedshaik/icefall/egs/yesno/ASR/download
2023-06-19 22:28:06 (prepare.sh:30:main) Stage 0: Download data
2023-06-19 22:28:06 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-19 22:28:09 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/yesno/ASR/./local/compute_fbank_yesno.py", line 18, in <module>
    from icefall.utils import get_executor
  File "/Users/fawazahamedshaik/icefall/icefall/__init__.py", line 3, in <module>
    from . import (
  File "/Users/fawazahamedshaik/icefall/icefall/decode.py", line 20, in <module>
    import k2
  File "/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/k2/__init__.py", line 23, in <module>
    from _k2 import DeterminizeWeightPushingType
ImportError: dlopen(/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so, 0x0002): Symbol not found: __ZN2at4_ops10select_int4callERKNS_6TensorExx
  Referenced from: /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so
  Expected in: <89972BE7-3028-34DA-B561-E66870D59767> /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/torch/lib/libtorch_cpu.dylib

One thing I noticed: when installing k2, I see this in the logs:

(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro k2 % export K2_MAKE_ARGS="-j6"
(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro k2 % python3 setup.py install

CMake Warning (dev) at /Users/fawazahamedshaik/opt/anaconda3/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/mkl.cmake:1 (find_package):
  Policy CMP0074 is not set: find_package uses <PackageName>_ROOT variables.
  Run "cmake --help-policy CMP0074" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.

  CMake variable MKL_ROOT is set to:

    /Users/fawazahamedshaik/opt/anaconda3

  For compatibility, CMake is ignoring the variable.

I assume the problem is with CMake

when I run python -m torch.utils.collect_env I get:

(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro k2 % python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 2.0.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.3.1 (x86_64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.26.4
Libc version: N/A

Python version: 3.9.13 (main, Aug 25 2022, 18:29:29) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU: Apple M2

Versions of relevant libraries:
[pip3] k2==1.24.3.dev20230620+cpu.torch2.0.1
[pip3] numpy==1.22.4
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchvision==0.15.2
[conda] mkl 2023.1.0 h59209a4_43558
[conda] mkl-service 2.4.0 py39h6c40b1e_1
[conda] numpy 1.22.4 pypi_0 pypi
[conda] numpy-base 1.22.3 py39he782bc1_0
[conda] numpydoc 1.5.0 py39hecd8cb5_0
[conda] pytorch 1.13.1 cpu_py39h9e40b02_0
[conda] pytorch-lightning 1.9.4 pypi_0 pypi
[conda] pytorch-wpe 0.0.1 pypi_0 pypi
[conda] torch-complex 0.4.3 pypi_0 pypi
[conda] torchaudio 0.13.1 pypi_0 pypi
[conda] torchmetrics 0.11.4 pypi_0 pypi
[conda] torchvision 0.14.0 pypi_0 pypi

Here I notice that mkl and mkl-service are both installed with conda; is it because of this that I am getting the error?

Sorry for the repeated errors; I want to know what I am doing wrong in the install process.

csukuangfj commented 1 year ago

Here I notice that mkl and mkl-service are both installed with conda; is it because of this that I am getting the error?

Please make sure you have deactivated conda completely.

There are multiple versions of PyTorch in your current environment, i.e.,

[pip3] torch==2.0.1
[conda] pytorch 1.13.1 cpu_py39h9e40b02_0

Please don't do that.


Please see https://github.com/k2-fsa/sherpa-onnx/issues/178#issuecomment-1597926968

I suggest that if you're not familiar with conda, please switch to pip install and don't use conda install. Most if not all users who are having issues are using conda.

Make sure you only have one version of PyTorch in your current environment.

Caet-pip commented 1 year ago

Hello, I was able to export the model following the demo

https://k2-fsa.github.io/icefall/model-export/export-onnx.html#export-the-model-to-onnx

Finally, it says that the exported files are in $repo/exp: "It will generate the following 3 files in $repo/exp"

encoder-epoch-99-avg-1.onnx
decoder-epoch-99-avg-1.onnx
joiner-epoch-99-avg-1.onnx

but I cannot find this directory or the files.

csukuangfj commented 1 year ago

Could you post the last few lines of the export log or post the export command you are using?

The exported files are in the exp directory you specified.

Caet-pip commented 1 year ago

Okay, I followed the guide and exported as per the steps in it.

It will generate the following 3 files in $repo/exp

this is my command:

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./pruned_transducer_stateless7_streaming/export-onnx.py \
    --bpe-model $repo/data/lang_bpe_500/bpe.model \
    --use-averaged-model 0 \
    --epoch 99 \
    --avg 1 \
    --decode-chunk-len 32 \
    --exp-dir $repo/exp/

So it must be in $repo/exp. Can I change this to another directory?

when I use my directory I get this error:

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./pruned_transducer_stateless7_streaming/export-onnx.py \
    --bpe-model $repo/data/lang_bpe_500/bpe.model \
    --use-averaged-model 0 \
    --epoch 99 \
    --avg 1 \
    --decode-chunk-len 32 \
    --exp-dir $Users/fawazahamedshaik/icefall

Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/librispeech/ASR/./pruned_transducer_stateless7_streaming/export-onnx.py", line 669, in <module>
    main()
  File "/Users/fawazahamedshaik/my_env/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/fawazahamedshaik/icefall/egs/librispeech/ASR/./pruned_transducer_stateless7_streaming/export-onnx.py", line 480, in main
    setup_logger(f"{params.exp_dir}/log-export/log-export-onnx")
  File "/Users/fawazahamedshaik/icefall/icefall/utils.py", line 138, in setup_logger
    os.makedirs(os.path.dirname(log_filename), exist_ok=True)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/os.py", line 225, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/fawazahamedshaik'

Can you also guide me on how to use this in sherpa-onnx, i.e., how I should modify the command to run it there?
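
The OSError above is most likely a shell-expansion problem: $Users is not a defined shell variable, so $Users/fawazahamedshaik/icefall expands to /fawazahamedshaik/icefall, and the script tries to create a log directory at the filesystem root. A small sketch of the failure and of building the path safely instead (the paths are this thread's; adjust to your own layout):

```python
from pathlib import Path

# "$Users" is undefined in the shell, so it expands to an empty string,
# turning "/Users/fawazahamedshaik/icefall" into a root-level path.
users = ""  # what the shell substituted for "$Users"
path_seen_by_script = users + "/fawazahamedshaik/icefall"
print(path_seen_by_script)  # /fawazahamedshaik/icefall

# A safer way to build an exp dir under the home directory:
exp_dir = Path.home() / "icefall" / "exp"  # assumed layout; adjust as needed
print(exp_dir)
```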

csukuangfj commented 1 year ago
this is my code:
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./pruned_transducer_stateless7_streaming/export-onnx.py \
--bpe-model $repo/data/lang_bpe_500/bpe.model

What is the value of $repo?


Okay, I followed the guide and exported as per the lines in the guide

Could you please post the link to the guide you are following? Make sure you don't miss any command in the guide.

Caet-pip commented 1 year ago

I followed the guide in documentation, link: https://k2-fsa.github.io/icefall/model-export/export-onnx.html#export-the-model-to-onnx

$repo gives me (my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % $repo
zsh: command not found: icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29

I found the files in the cloned repo, in the exp folder of icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29.

But there are 6 files generated; here is the photo:

[Screenshot 2023-06-20 at 10:54:55 PM]

Which files should I use, and can you explain how I can use them in sherpa-onnx?

csukuangfj commented 1 year ago
$repo gives me
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % $repo

Please show the output of

echo $repo
csukuangfj commented 1 year ago

Which files should I use, and can you explain how I can use them in sherpa-onnx

So you have managed to find the generated files, congratulations!

Please refer to https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-02-21-english

[Screenshot 2023-06-21 at 11:03:21]

Caet-pip commented 1 year ago

Do I need to move these files to the sherpa-onnx directory?

the generated files are currently in the icefall directory

and I am planning to export the https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 next, will the output be the same?

I wanted to know the syntax of the command I need to create with the generated files. As I can see, the encoder, decoder, and joiner are present in the syntax; do I need to replace them with the generated ONNX files?

csukuangfj commented 1 year ago

do I need to replace them with the generated onnx files?

Yes, please use absolute path names if you are not sure.

Do I need to move these files to sherpa-onnx directory?

No, you don't need to do that. You can place them anywhere, as long as you pass the correct paths to ./build/bin/sherpa-onnx

csukuangfj commented 1 year ago

and I am planning to export the https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 next, will the output be the same?

Yes, it should be the same.

csukuangfj commented 1 year ago

@Caet-pip

I have exported the model to ONNX. Please see https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-06-21-english

[Screenshot 2023-06-21 at 15:49:40]

Caet-pip commented 1 year ago

Thank you so much!

I was also looking forward to using my exported model, so I will try that as well. I have a problem when running the model: it says file not found, but the files are in the directory.

command for microphone ASR:

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % ./build/bin/sherpa-onnx-microphone \
    --tokens=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt \
    --encoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder-epoch-99-avg-1.onnx \
    --decoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder-epoch-99-avg-1.onnx \
    --joiner=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner-epoch-99-avg-1.onnx

logs with error:

OnlineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OnlineTransducerModelConfig(encoder_filename="--encoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder-epoch-99-avg-1.onnx", decoder_filename="--decoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder-epoch-99-avg-1.onnx", joiner_filename="--joiner=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner-epoch-99-avg-1.onnx", tokens="--tokens=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt", num_threads=2, provider="cpu", debug=False), lm_config=OnlineLMConfig(model="", scale=0.5), endpoint_config=EndpointConfig(rule1=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=2.4, min_utterance_length=0), rule2=EndpointRule(must_contain_nonsilence=True, min_trailing_silence=1.2, min_utterance_length=0), rule3=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=0, min_utterance_length=300)), enable_endpoint=True, max_active_paths=4, decoding_method="greedy_search")
/Users/fawazahamedshaik/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model-config.cc:Validate:29 --tokens=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt does not exist
Errors in config!

checking if files are present in the dir:

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % cd ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro exp % ls
cpu_jit.pt                        epoch-30.pt                      joiner_jit_trace.pt
decode.sh                         epoch-99.pt                      log
decoder-epoch-99-avg-1.int8.onnx  export.sh                        pretrained.pt
decoder-epoch-99-avg-1.onnx       jit_pretrained.sh                pretrained.sh
decoder_jit_trace.pt              jit_trace_export.sh              tensorboard
encoder-epoch-99-avg-1.int8.onnx  jit_trace_pretrained.sh          tokens.txt
encoder-epoch-99-avg-1.onnx       joiner-epoch-99-avg-1.int8.onnx  train.sh
encoder_jit_trace.pt              joiner-epoch-99-avg-1.onnx

As you can see, tokens.txt is present in the given dir, but it says it is missing.

csukuangfj commented 1 year ago
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % ls -lh ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt

What does it output?

Caet-pip commented 1 year ago

This is the output

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % ls -lh ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt
-rw-r--r--  1 fawazahamedshaik  staff  4.9K Jun 19 05:47 ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt

Caet-pip commented 1 year ago

I got this tokens.txt file from icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/data/lang_bpe_500 and I moved it to the main directory along with the decoder, encoder, and joiner files.

csukuangfj commented 1 year ago

I see.

Please run

./build/bin/sherpa-onnx-microphone

and read the output carefully.

You don't need to use --tokens, --encoder, etc. Please pass the paths directly.
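
The config dump above hints at why: judging from it, the binary took each whole token, including the `--tokens=` prefix, as a filename (e.g. `tokens="--tokens=./…/tokens.txt"`), so the existence check failed even though tokens.txt was really there. A stdlib sketch of that failure mode (the path is the one from this thread):

```python
import os

# What the binary apparently received as its "tokens" argument:
flag_arg = ("--tokens=./icefall-asr-librispeech-pruned-transducer-"
            "stateless7-streaming-2022-12-29/exp/tokens.txt")

# Taken literally as a path, the flag prefix becomes part of the filename,
# so the existence check fails even though the real file exists.
print(os.path.exists(flag_arg))  # False

# Stripping the flag prefix recovers the path the user intended to pass:
recovered = flag_arg.split("=", 1)[1]
print(recovered)
```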

Caet-pip commented 1 year ago

This worked Thanks a lot!!!

And thank you so much for exporting the giga libri pruned transducer model! I was planning to do that next.

I am planning to use other models next, namely Nvidia NeMo streaming models (I'm not sure if they have those, but I'll look), by converting them to ONNX. Hopefully that goes well.

Thanks again!

csukuangfj commented 1 year ago

I am planning to use other models, namely Nvidia NeMo streaming models

Is there a link to the NeMo streaming model? Is it trained by CTC or transducer loss?

Caet-pip commented 1 year ago

I am checking, but so far I only see models which decode audio files (wav files).

Caet-pip commented 1 year ago

It appears that they are using Conformer models for cache-aware streaming, as per this link:

https://github.com/NVIDIA/NeMo/blob/main/examples/asr/asr_cache_aware_streaming/speech_to_text_cache_aware_streaming_infer.py

They also have buffered ASR / chunked inference for both Conformer and RNN-T models at this link, but I do not think this is streaming:

https://github.com/NVIDIA/NeMo/tree/main/examples/asr/asr_chunked_inference

Caet-pip commented 1 year ago

After some digging, I found that the QuartzNet15x5Base-En model, which is an EncDecCTCModel, can be implemented for streaming ASR with a mic, as per their demo notebook: https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_ASR_Microphone_Demo.ipynb

Model details are in: https://catalog.ngc.nvidia.com/orgs/nvidia/models/nemospeechmodels

I think other NeMo EncDecCTCModels can be used for streaming ASR. Do you think these can be exported to ONNX and used in sherpa-onnx?

csukuangfj commented 1 year ago

If you can find a way to export it to ONNX, we can change sherpa-onnx to support that.

Support for streaming CTC models is on the roadmap.