-
I'm not sure if this is a Yesod question or a Conduit question, so forgive me if I'm asking in the wrong place. I have a handler that builds a zip archive in a temp directory, streams the archive as t…
-
I assume that the LLM App API client wrapper for the OpenAI API does not currently support the streaming completions feature.
It would be nice to have, so that we can stream ChatGPT's final responses into Pa…
-
Hi 👋🏻
Loving ollama always ❤️
I'm eager to use the newly released structured outputs with ollama, but it looks like ollama doesn't have compatibility yet, so I can just set base_url and I'll get res…
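For reference, a minimal sketch of what a structured-output request body might look like against ollama's native `/api/chat` endpoint, where the top-level `format` field carries a JSON schema (the model name and the country schema below are illustrative placeholders, not part of the question):

```python
import json

def build_structured_chat_request(model: str, prompt: str, schema: dict) -> dict:
    """Build a request body for ollama's /api/chat endpoint that asks the
    model to constrain its output to the given JSON schema via the
    top-level "format" field (per ollama's structured outputs feature)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "format": schema,   # JSON schema the response must conform to
        "stream": False,
    }

# Hypothetical schema for a country lookup
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "capital": {"type": "string"},
    },
    "required": ["name", "capital"],
}

body = build_structured_chat_request("llama3.1", "Tell me about Canada.", schema)
print(json.dumps(body, indent=2))
```

Whether a given client library forwards such a schema when pointed at an ollama `base_url` depends on that library's OpenAI-compatibility layer.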
-
Hello,
I am using Spring Cloud Gateway MVC: 4.1.5
When making requests that expect a streaming chunked response, where each chunk is returned to the API caller as soon as it is ready.
Th…
-
## Feature Request
Love the project! Just one request on response streaming.
Most gRPC implementations (I haven't used all of them) pass a `sink/writer` handle into the handler stub. In…
-
The streaming `superPool` API is great, but in order to make it work right, you must ensure that elements continue coming through the stream at a minimum rate. If the stream slows down too much, respo…
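One common workaround for a minimum-rate requirement is to interleave keep-alive elements whenever the source stalls. A sketch of that idea using a background thread and a queue (generic Python, not the `superPool` API itself; `with_keepalive` is an illustrative name):

```python
import queue
import threading

def with_keepalive(source, interval, keepalive):
    """Yield items from `source`, inserting `keepalive` whenever no item
    arrives within `interval` seconds, so downstream consumers keep
    receiving elements at a minimum rate."""
    q = queue.Queue()
    _DONE = object()

    def pump():
        # Move items from the source into the queue as fast as they come.
        for item in source:
            q.put(item)
        q.put(_DONE)

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            item = q.get(timeout=interval)
        except queue.Empty:
            yield keepalive  # stream stalled: emit a filler element
            continue
        if item is _DONE:
            return
        yield item

# Fast source: no keepalives are needed, items pass through unchanged
out = list(with_keepalive(iter([1, 2, 3]), interval=0.5, keepalive=0))
print(out)  # [1, 2, 3]
```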
-
See the discussions over at https://github.com/whatwg/fetch/pull/425#issuecomment-531680634
The idea is, I think, that the body of a request would be transmitted over the network, and at the same t…
-
Hi,
this an awesome project.
It would be nice to support chunked encoding for streaming outputs, so we can return content of arbitrary size without wasting memory.
Jan
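For context, HTTP/1.1 chunked transfer encoding frames each chunk as a hex length, CRLF, the bytes, CRLF, with a zero-length chunk terminating the body (RFC 9112, section 7.1). A minimal encoder sketch:

```python
def encode_chunked(chunks):
    """Yield an HTTP/1.1 chunked-encoded body for an iterable of byte
    chunks: <hex length>\r\n<data>\r\n per chunk, then a terminating
    0-length chunk."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would prematurely end the body
            yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    yield b"0\r\n\r\n"

body = b"".join(encode_chunked([b"hello", b", ", b"world"]))
print(body)  # b'5\r\nhello\r\n2\r\n, \r\n5\r\nworld\r\n0\r\n\r\n'
```

Because the encoder is a generator over a generator, each chunk can be written to the socket as soon as it is produced, so total memory stays proportional to one chunk rather than the whole response.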
-
Particularly in the case of responses streamed from a Publisher it would be handy to know the overall response time (or the time of the last-published chunk) in addition to the (currently available) t…
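One way to capture the last-chunk time alongside the first is to wrap the chunk stream and record a timestamp per emission. A generic sketch, not tied to any particular Publisher API (`TimedStream` is an illustrative name):

```python
import time

class TimedStream:
    """Wrap an iterable of chunks, recording when the first and the last
    chunk were emitted so the overall streaming time can be reported."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.first_chunk_at = None
        self.last_chunk_at = None

    def __iter__(self):
        for chunk in self._chunks:
            now = time.monotonic()
            if self.first_chunk_at is None:
                self.first_chunk_at = now  # time-to-first-chunk marker
            self.last_chunk_at = now       # updated on every chunk
            yield chunk

    @property
    def streaming_seconds(self):
        """Seconds between the first and the last published chunk."""
        if self.first_chunk_at is None:
            return None
        return self.last_chunk_at - self.first_chunk_at

stream = TimedStream(iter(["a", "b", "c"]))
consumed = list(stream)
print(consumed, stream.streaming_seconds)
```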
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
chat_engine = index.as_chat_engine(
chat_mode="condense_plus_context",
memory=…