Closed alidabaghi123 closed 1 year ago
Yes. Of course.
Please refer to https://k2-fsa.github.io/icefall/model-export/index.html for how to export models from icefall.
Specifically, for streaming zipformer, please see
For non-streaming zipformer, please see
Also, please see examples at
thank you very much.
May I ask when the online websocket server can support zipformer? It seems that it does not support it now. @csukuangfj
It is supported in C++ websocket server.
Please have a look at our documentation https://k2-fsa.github.io/sherpa/cpp/pretrained_models/online_transducer.html
Do we have a Python server setup for streaming zipformer models? streaming_server.py doesn't seem to work with zipformer streaming models.
Thanks in advance, Sagar
Currently, no. We have a C++ websocket server that performs better than its Python counterpart.
Yes, I can't run it for zipformer either.
@csukuangfj In sherpa-online-websocket-server there are no args supported to load jit models. Am I missing something here?
Please run
sherpa-online-websocket-server --help
to view the help messages.
Thanks for the help, I missed the obvious part.
Also, I am now facing an issue when running the online websocket server with GPU. I get the following error:
`terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/zipformer.py", line 2809, in forward
_85 = annotate(number, torch.add(_84, CONSTANTS.c1))
cum_mask = torch.arange(1, _85, dtype=None, layout=None, device=torch.device("cpu"), pin_memory=False)
_86 = torch.add(torch.unsqueeze(cum_mask, 1), torch.unsqueeze(cached_len4, 0))
_87 = torch.mul(torch.reciprocal(_86), CONSTANTS.c2)
pooling_mask = torch.unsqueeze(_87, 2)`
Is this because of some jit tracing mistake we have done? We are using [jit_trace_export.py](https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/jit_trace_export.py)
Which version of PyTorch are you using? Also, are you using our provided pre-trained streaming zipformer or are you exporting the model by yourself?
The torch version is 1.13.1; I am using the standard sherpa docker.
As for the zipformer model, we exported it ourselves after training it with icefall.
We have never seen this error before. Does it work with our pre-trained model listed in the doc?
Nope, it does not work with the pretrained model either. Please note that I am enabling --use-gpu=true, and this is what causes the issue.
I see. So it works with --use-gpu=false?
Please export a CUDA version of the traced model if you want to use --use-gpu=true.
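The traceback quoted earlier contains `device=torch.device("cpu")` inside the serialized code, which suggests that the CPU device was recorded when the model was traced. As a rough diagnostic, the sketch below scans such text for hard-coded device constants. The helper name `find_hardcoded_devices` is hypothetical and is not part of sherpa or icefall; the sample text is copied from the traceback in this thread.

```python
import re
from typing import List, Tuple

# Matches device constants as they appear in serialized TorchScript code,
# for example: device=torch.device("cpu")
DEVICE_RE = re.compile(r'device=torch\.device\("([^"]+)"\)')


def find_hardcoded_devices(serialized_code: str) -> List[Tuple[int, str]]:
    """Return (line_number, device_name) for every hard-coded device found.

    Hypothetical helper for illustration only.
    """
    hits = []
    for lineno, line in enumerate(serialized_code.splitlines(), start=1):
        for match in DEVICE_RE.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Lines taken from the traceback posted above.
snippet = '''_85 = annotate(number, torch.add(_84, CONSTANTS.c1))
cum_mask = torch.arange(1, _85, dtype=None, layout=None, device=torch.device("cpu"), pin_memory=False)
_86 = torch.add(torch.unsqueeze(cum_mask, 1), torch.unsqueeze(cached_len4, 0))'''

print(find_hardcoded_devices(snippet))
```

A hit naming `cpu` in a model that is meant to run with --use-gpu=true is a hint that the model should be re-exported on CUDA.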
What is the CUDA version of the pretrained librispeech model? Also, how would a CUDA export make a difference here? I checked, and torch can actually access the CUDA devices.
Please remove https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/jit_trace_export.py#L290 when exporting the model to CUDA using jit trace.
I tried this, with no success. Do I need to explicitly move the model onto the GPU while tracing? There are multiple places where the model is moved to a device.
Did you fail to export the model with CUDA or did you fail to run the cuda-exported model with sherpa?
Could you please post the error logs?
I failed to export the model with CUDA in the first place.
`[I] /workspace/sherpa/sherpa/cpp_api/online-recognizer.cc:403:void sherpa::OnlineRecognizer::OnlineRecognizerImpl::WarmUp() 2023-04-12 18:13:21.070 WarmUp begins
terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/zipformer.py", line 2809, in forward
_85 = annotate(number, torch.add(_84, CONSTANTS.c1))
cum_mask = torch.arange(1, _85, dtype=None, layout=None, device=torch.device("cpu"), pin_memory=False)
_86 = torch.add(torch.unsqueeze(cum_mask, 1), torch.unsqueeze(cached_len4, 0))
~~~~~ <---HERE
_87 = torch.mul(torch.reciprocal(_86), CONSTANTS.c2)
pooling_mask = torch.unsqueeze(_87, 2)`
logs:
`/mnt/efs/dspavankumar/tools/icefall/egs/en-us/pruned_transducer_stateless7_streaming/jit_trace_export_gpu.py(104): <module>
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!`
You also need to move
x = torch.zeros(1, T, 80, dtype=torch.float32)
x_lens = torch.full((1,), T, dtype=torch.int32)
states = encoder_model.get_init_state(device=x.device)
to CUDA.
(I thought you would be able to fix these yourself.)
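Creating the dummy trace inputs directly on the target device, as advised above, might look like the following minimal sketch. The function name `make_trace_inputs`, the value `T=32`, and the CPU fallback are assumptions made for illustration; they are not taken from icefall's export script.

```python
import torch


def make_trace_inputs(T: int, device: torch.device):
    """Create the dummy encoder inputs on `device` (hypothetical helper)."""
    # Passing device= at creation time avoids mixing CPU and CUDA tensors.
    x = torch.zeros(1, T, 80, dtype=torch.float32, device=device)
    x_lens = torch.full((1,), T, dtype=torch.int32, device=device)
    return x, x_lens


# Fall back to CPU so that the sketch also runs on machines without a GPU.
device = torch.device("cuda", 0) if torch.cuda.is_available() else torch.device("cpu")

# T=32 is an arbitrary value chosen for this example.
x, x_lens = make_trace_inputs(T=32, device=device)

# In the export script, the states would then follow the same device:
#   states = encoder_model.get_init_state(device=x.device)
# Assumption: the model itself must also have been moved to `device`.
print(x.device, x_lens.device)
```

Both tensors report the same device, which is the condition that the "Expected all tensors to be on the same device" error complains about.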
Please use the latest export.py from icefall, i.e., with the changes from
https://github.com/k2-fsa/icefall/pull/1005
We now support passing cpu_jit.pt. Please also see
https://github.com/k2-fsa/sherpa/pull/365
and our updated doc
https://k2-fsa.github.io/sherpa/cpp/pretrained_models/online_transducer.html#icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29
Closing via #365
Hello. Thanks for your efforts. Do you support zipformer in the sherpa framework? I can export zipformer for sherpa-onnx but cannot export it for the sherpa framework.