substratusai / lingo

Lightweight ML model proxy and autoscaler for kubernetes
https://www.substratus.ai
Apache License 2.0
102 stars 6 forks source link

Support Streaming #56

Closed nstogner closed 4 months ago

nstogner commented 6 months ago

Lingo should work with streaming. We should check to see if it does today and if not, update lingo. Also requires validating that our example backends support streaming.

https://platform.openai.com/docs/api-reference/streaming

samos123 commented 6 months ago

vLLM supports streaming with the openai compatible backend that we serve. I can't remember if I tested this already or not, but agree we should test and ensure it works. Ideally we write a system test case for it?

samos123 commented 4 months ago

I tested this a while ago, it works :partying_face: