Closed — tiefucai closed this issue 1 month ago
Hi @tiefucai We do have a streaming response API, which we have tested only with LLMs: https://github.com/pytorch/serve/blob/96450b9d0ab2a7290221f0e07aea5fda8a83efaf/docs/inference_api.md#curl-example-1
Could you please try this with ASR and, if it works, send a PR?
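For reference, a streaming response like the one in the linked curl example can be consumed incrementally from Python. This is a minimal sketch, not TorchServe's official client API: the function just accumulates an iterable of byte chunks, so with `requests` you would pass it something like `requests.post(url, data=payload, stream=True).iter_content(chunk_size=None)`; here a plain list of byte strings stands in for the network stream.

```python
def consume_stream(chunks):
    """Accumulate a streamed response delivered as an iterable of byte chunks.

    `chunks` can be any iterable of bytes, e.g. the chunk iterator of a
    streaming HTTP response. A list of byte strings works the same way
    for demonstration purposes.
    """
    parts = []
    for chunk in chunks:
        if chunk:  # skip empty keep-alive chunks
            parts.append(chunk.decode("utf-8"))
    return "".join(parts)

# Simulated streamed reply, e.g. tokens from an LLM or partial ASR transcripts.
simulated = [b"hel", b"lo ", b"world"]
print(consume_stream(simulated))  # -> hello world
```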
Hi @tiefucai For your use case, the streaming API with stateful inference should be applicable, since you want to send a stream of chunks of data to the same model. See this example for details and let me know if you need any help with it. The recent SAM2 demo on meta.ai used this for its video segmentation.
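The chunking side of this pattern can be sketched as below. Splitting the audio is the only part shown as runnable code; the commented-out POST loop and the `x-sequence-id` header name are illustrative assumptions, not TorchServe's actual stateful-inference protocol — see the linked example for the real request format.

```python
def chunk_audio(pcm_bytes, chunk_size):
    """Split raw audio bytes into fixed-size chunks for sequential inference.

    The last chunk may be shorter than chunk_size.
    """
    return [pcm_bytes[i:i + chunk_size]
            for i in range(0, len(pcm_bytes), chunk_size)]

audio = bytes(10)  # stand-in for real PCM data
chunks = chunk_audio(audio, 4)
print([len(c) for c in chunks])  # -> [4, 4, 2]

# Each chunk would then be POSTed to the same model, reusing a sequence id so
# the backend can route all chunks of one stream to the same worker, e.g.
# (endpoint and header name are hypothetical):
#
# for c in chunks:
#     requests.post(url, data=c, headers={"x-sequence-id": seq_id})
```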
Closing this issue, as we do support streaming input data to the same model. Feel free to reopen or join our Slack channel if you have further questions.
🚀 The feature
I saw issue #2447 asking for WebSocket support; that issue has been tagged as an enhancement. I am wondering whether TorchServe supports WebSockets now?
Motivation, pitch
Using Torch models for ASR (as an example) requires sending many HTTP requests with chunks of audio (for real-time streaming). With WebSockets, the connection would be persistent: chunks would be processed and replies sent back to the client over the same connection.
Alternatives
No response
Additional context
No response