Closed lionsheep0724 closed 8 months ago
Hi, do you have any news on this?
Triton Inference Server offers robust features for handling inference requests, and while it excels in certain areas, there are nuances to consider when dealing with specific use cases like streaming data. Let's clarify these aspects:
Streaming Data vs. Batching: Triton's dynamic batcher is designed to group many independent requests into a single batch. A live audio stream is the opposite case: a long-lived, stateful sequence of chunks that cannot be freely regrouped across requests.
ASR Specifics: Streaming ASR consumes audio incrementally and emits partial transcripts as it goes, so a single logical request naturally produces many responses over time.
Decoupled Models and Streaming Outputs: The decoupled model transaction policy lets a model send zero, one, or many responses per request, which matches the shape of streaming ASR output. Clients consume these responses over Triton's gRPC streaming API.
Sequence Handling: The sequence batcher routes every request carrying the same correlation ID to the same model instance, so per-stream state (for example, the decoder context) survives across chunks.
Architecture Considerations: The client application still has to chunk the audio, attach correlation IDs and start/end flags, and reassemble the partial transcripts; Triton does not do this for you.
Internal Batching in Decoupled Models: Batching of concurrent streams is typically handled inside the model itself, or through the sequence batcher's scheduling, rather than by the server-side dynamic batcher.
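For reference, the decoupled and sequence-handling pieces above come together in the model's config.pbtxt. A minimal sketch (the model name, tensor names, and dims are placeholders, not from this thread) might look like:

```
name: "streaming_asr"
backend: "python"
max_batch_size: 0

# Allow the model to send zero, one, or many responses per request
model_transaction_policy {
  decoupled: true
}

# Route all chunks carrying the same correlation ID to the same
# model instance so decoder state is preserved across chunks
sequence_batching {
  max_sequence_idle_microseconds: 60000000
}

input [
  {
    name: "AUDIO_CHUNK"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
output [
  {
    name: "PARTIAL_TRANSCRIPT"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
```

With `decoupled: true` the model is free to emit a partial transcript per chunk instead of exactly one response per request.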
In conclusion, while Triton Inference Server provides robust features for batch processing and can handle decoupled model outputs effectively, integrating it with streaming data sources like ASR might require careful consideration and potentially custom application logic. It excels in scenarios with numerous small inference requests but might not be the most efficient for continuous, real-time data streams. As always, the best approach depends on the specific requirements and constraints of your use case.
I have some questions about the decoupled model in 0.5.0. Its documentation says it is specifically useful for Automatic Speech Recognition (ASR), but I don't understand why. Here are my questions.