carlos-verdes opened 1 year ago
This doesn't answer your question, but have you considered using the newer Streams API instead of chunked transfer encoding? https://developer.mozilla.org/en-US/docs/Web/API/Streams_API
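For reference, a minimal sketch of consuming a streamed response with that API (the /search endpoint and #results element here are hypothetical):

```js
// Read the response body incrementally and render each chunk as it
// arrives, instead of waiting for the whole document.
const response = await fetch("/search?q=htmx");
const reader = response.body.getReader();
const decoder = new TextDecoder();
const target = document.getElementById("results");
let html = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  html += decoder.decode(value, { stream: true });
  target.innerHTML = html; // re-render with everything received so far
}
```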
You might also consider using a service worker to fix this problem sooner than htmx might get around to it: a service worker intercepts all network requests, so you could handle the streaming yourself as needed (as well as do LOTS more).
I use that API to read from the server when doing SPA development; that's why I'm asking whether HTMX will support this feature or not.
The good thing about it is that you don't need to change the protocol: most of HTTP, and in fact HTMX itself, just works. The only problem is that the behavior is not as expected (HTMX waits for the full response to be sent before rendering anything back to the user).
I really need this feature (http streaming).
It would be lovely to see this supported
+1
Example use case: streaming search results. Instead of complex "infinite scroll" or other convoluted result-delivery schemes, we could just start pushing results to the client as soon as we get the first hits from the database; the client could start loading and rendering images etc. for the first entries in the listing right away. Ohh, it would be so straightforward and beautiful, so old school in the bestest of ways.
I send all collections from my backend using streams, partly to avoid memory pressure with big collections, so for me it's a natural thing to do.
Had to close the above PR but I think the existing extension mechanism should be more than sufficient to implement this. Would love if someone wanted to take that on.
@alexpetros I'm on it ;)
Are you working on an extension for this, @douglasduteil? Can you share a link to the PR when it's ready?
To @carlos-verdes: it's Christmas time again 🎄
```
$ npm install htmx.ext...chunked-transfer
```

```html
<script src="https://unpkg.com/htmx.ext...chunked-transfer/dist/index.js"></script>
<body hx-ext="chunked-transfer">
  ...
</body>
```
:warning: It's a very early version that I'm not using myself
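If I understand the extension correctly, the point is that a plain htmx request inside that body streams into its target as chunks arrive. A hypothetical example (the /slow-report endpoint and #report id are made up):

```html
<!-- With hx-ext="chunked-transfer" active on an ancestor, the target
     is updated on each received chunk instead of once at the end. -->
<button hx-get="/slow-report" hx-target="#report">Load report</button>
<div id="report"></div>
```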
I don't know if there is a plan to add this into the base of HTMX or to rely on extensions, but I thought this is an example of the functionality folks are looking for: https://livewire.laravel.com/docs/wire-stream. The ability to append vs. replace is a nice addition as well.
I would bet a good deal of this is around streaming back AI-based content. While SSE or websockets are an option, they add complexity depending on the surrounding infrastructure. Chunked transfer encoding feels clean: once all the data is sent, the connection is closed, whereas an SSE connection has to be torn down and replaced with no real "polite close", and websockets are a lot to deal with in general.
The other simplicity comes from not having to manage per-client channels on the server, as SSE or websockets require; when you only want to send back to the sender, the websocket/SSE solutions feel heavyweight.
I have an ES5 version of an extension that supports the chunked encoding that I've been using internally at the company I work for.
https://github.com/JEBailey/htmx/blob/master/src/ext/chunked.js
So I originally closed #2101 because 2.0 was coming up and we weren't going to make that happen in time. I'm seeing some compelling use-cases and @douglasduteil's extension looks like it's been working. Are people using it? What's the case for including this in core?
One use case would be a fairly simple chatbot app that supports streaming: no need for websockets, no need for server-sent events, no need for keeping a connection to the server.
You could simply make a request to the server, the server sends a Transfer-Encoding: chunked response, and that response is incrementally swapped/appended into the respective chat bubble.
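On the server side this needs nothing exotic. A sketch, assuming Node/Express and a hypothetical generateReply() async iterator that yields tokens from a language model:

```js
import express from "express";

const app = express();
app.use(express.urlencoded({ extended: true }));

// Hypothetical: generateReply() yields tokens from a language model;
// everything else here is standard Express.
app.post("/chat", async (req, res) => {
  res.setHeader("Content-Type", "text/html");
  // No Content-Length is set, so Node uses Transfer-Encoding: chunked
  // and each write() below is flushed to the client as its own chunk.
  for await (const token of generateReply(req.body.message)) {
    res.write(token);
  }
  res.end(); // all data sent: the connection simply closes
});

app.listen(3000);
```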
> So I originally closed #2101 because 2.0 was coming up and we weren't going to make that happen in time. I'm seeing some compelling use-cases and @douglasduteil's extension looks like it's been working. Are people using it? What's the case for including this in core?
Hi @alexpetros,
Yes, I'm using @douglasduteil's extension in https://github.com/runeksvendsen/haskell-function-graph and it's working for me. Thank you @douglasduteil!
The case for including it in core is that it solves a very generic problem: you don't want the user to have to wait for the very last part of the page to be received before the first part is shown. The larger the time difference between the backend having the first and the last result available, the worse this problem is.
In my case, I have a page that includes, in order: (1) a list of results, where the first result is usually available to the backend very quickly (within a few milliseconds) and the last result can take an additional ~second to become available; followed by (2) an SVG graph that's slow to generate, because it calls out to a CLI executable. Without this extension, the user has to wait around two seconds to see the first results, even though they're available to the backend (and sent to the client) almost immediately.
@alexpetros a lot of our users are asking for chunked transfer support to make it easier to stream responses from language models and similar tasks.
@jph00 have you tried the extension? I would like to get some feedback on how the extension is working for people
Could someone please explain how this would be useful as opposed to using the existing SSE extension? I'm surely missing something fundamental about how they work
> Could someone please explain how this would be useful as opposed to using the existing SSE extension? I'm surely missing something fundamental about how they work

I think basically the benefit is that it's simpler.
Incidentally, I have a couple of pages where the user can (potentially) load a few thousand rows' worth of data on-screen.
How can I set up the extension so that it streams, say, 50 rows at a time?
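For what it's worth, chunk boundaries are decided by whoever writes the response, so one approach is to buffer server-side and flush every 50 rows. A sketch, assuming a Node server and a hypothetical fetchRows() async iterator over database results:

```js
import http from "node:http";

// Hypothetical: stream table rows in batches of 50. fetchRows() stands
// in for whatever async iterator your database driver provides.
http.createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "text/html" });
  let batch = [];
  for await (const row of fetchRows()) {
    batch.push(`<tr><td>${row.name}</td></tr>`);
    if (batch.length === 50) {
      res.write(batch.join("")); // each flush becomes one chunk
      batch = [];
    }
  }
  if (batch.length > 0) res.write(batch.join("")); // flush the remainder
  res.end();
}).listen(3000);
```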
I'm using the extension and it works well for my use case. There are a few points that break, though: for example, using templates with declarative shadow DOM doesn't seem to render the initial components (https://templ.guide/server-side-rendering/streaming). That may have to do with how swapping works; I read somewhere that it doesn't work with innerHTML calls, but I may be wrong.
Having something official would just mean one less extension, potentially more hooks, and less chance of breaking with future updates. I'm also using it with LLM streaming like others!
> Could someone please explain how this would be useful as opposed to using the existing SSE extension? [...]

> I think basically the benefit is that it's simpler.

Also:

- Much better backwards compatibility (only requires HTTP/1.1)
- Doesn't require JavaScript
- Isn't broken over HTTP/1.1, unlike SSE (see the warning at https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
> I think basically the benefit is that it's simpler. Also: [...]

Thanks! Can you do all the same things with chunked transfer encoding as you can with SSE? Specifically, can you:

- keep a long-lived connection open and periodically send messages to the browser?
- allow JavaScript (e.g. htmx or any other script, a service worker, etc.) to initiate and receive/process the messages?
I don't believe you can keep it open for a long time (or at least it's not the norm).
Connections are initiated with a standard PUT / POST / GET etc., which is really nice vs. SSE's GET-only setup. For LLMs specifically, it's almost always a non-GET request.
> Connections are initiated with a standard PUT / POST / GET etc., which is really nice vs. SSE's GET-only setup.

You can initiate SSE with POST etc. Here are two libraries that make it easy:
https://github.com/rexxars/eventsource-client
https://github.com/Azure/fetch-event-source
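For example, a sketch using @microsoft/fetch-event-source (treat the option names as illustrative; see the library's README for the exact API):

```js
// SSE over a POST request, which the native EventSource cannot do.
import { fetchEventSource } from "@microsoft/fetch-event-source";

await fetchEventSource("/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ message: "hello" }),
  onmessage(ev) {
    // Each server-sent event arrives here as it is pushed.
    console.log(ev.data);
  },
});
```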
> Thanks! Can you do all the same things with chunked transfer encoding as you can with SSE?

No, you definitely cannot. Chunked transfer encoding is just a way to transfer data from a server to a client (and not the other way around). That's why it doesn't require JavaScript and supports ancient browsers.
> That's why it doesn't require JavaScript and supports ancient browsers.

Which is exactly why it should be supported by default 🙏
But htmx IS JavaScript... And given that some have confirmed that SSE is a more robust and functional protocol than chunked transfers, what's the point of this request?
The only thing I can perhaps think of is that chunked transfers can return binary streams whereas SSE is text-only. But no one has brought that up as a use case.
And it's not like SSE is complicated to set up; it, too, is part of the HTTP protocol, so you literally just use a different header.
I'm just very perplexed, but quite open to being educated on what I'm missing about why this is useful, let alone needed.
SSE, just like websockets, establishes a long-lived connection to the server.
Let's take the example of a simple chatbot with a backend deployed on a serverless platform (e.g. AWS Lambda), where the backend runs for just a couple of milliseconds at a time (up to a couple of minutes at most).
If we want a streaming response, the simplest approach would be a chunked transfer of the content. That means no persistent connection (websockets or SSE) between the client and server.
E.g. it would be a simple POST request with the message from the client side and a chunked transfer of the response from the server.
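On the client, a hedged sketch of what that could look like with htmx (the /chat endpoint and element ids are made up, and incremental rendering assumes chunked-transfer support like the extension above):

```html
<!-- Hypothetical markup: POST the message, then append the reply into
     the chat log as chunks arrive. -->
<form hx-post="/chat" hx-target="#chat-log" hx-swap="beforeend">
  <input type="text" name="message">
  <button type="submit">Send</button>
</form>
<div id="chat-log"></div>
```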
Thanks - what I was missing was that the use case is not long-lived server push. It's just looking for a way to stream a response to a user-initiated request and then close the connection when done. The streaming part of this is what had me focusing on the SSE/WS stuff, but it's more a matter of push vs. pull.
Yes! And it also reduces complexity when the only thing you want is a streaming response. SSE and websockets can be quite intimidating at first, I think.
> But htmx IS JavaScript... And given that some have confirmed that SSE is a more robust and functional protocol than chunked transfers, what's the point of this request?
> [...]
> I'm just very perplexed, but quite open to being educated on what I'm missing about why this is useful, let alone needed.
The main reason I think it should be included in HTMX core is that not including it breaks progressive enhancement for chunked transfer encoding.
As I explain in my duplicate issue (https://github.com/bigskysoftware/htmx/issues/2789), the degree of brokenness varies depending on how long the server takes to close the connection, but in the worst case it's the difference between a boosted link never displaying any content when HTMX is enabled and the content displaying immediately when it's disabled.
I found another reason. Someone said you can achieve the same feature with SSE; however, an SSE connection is meant to stay open for as long as the component is rendered.
In a classic search example, where you want to send results to the browser as they are found (for user experience, and to avoid memory pressure on the backend), using SSE causes the browser to reconnect once the last result is sent (as the channel is closed on the server), making it look like "infinite results" keep arriving when in reality it's just the same query executed again and again.
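To illustrate: EventSource reconnects automatically when the server closes the stream, so the client has to close explicitly on some sentinel. A sketch (the /search endpoint and the "done" event name are hypothetical; the server would have to emit that event itself):

```js
// EventSource automatically reconnects when the server closes the
// stream, so a finite result set replays forever unless the client
// closes the connection itself.
const results = document.getElementById("results");
const es = new EventSource("/search?q=htmx");
es.onmessage = (ev) => results.insertAdjacentHTML("beforeend", ev.data);
// Hypothetical "done" event the server sends after the last result:
es.addEventListener("done", () => es.close()); // otherwise: "infinite results"
```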
I found another HTMX extension that is similar to chunked transfer encoding: https://github.com/alarbada/htmx-stream. Thanks to @alarbada, it's very simple to display progressively generated content over a single GET or POST request. I would really love it if this HTTP/1.1 functionality were supported by HTMX by default.
Hello @wjkoh. My very-much-alpha project works somewhat similarly to SSE, but I did hit the difficulties that @carlos-verdes described. In a way it is much easier to use, as you don't need to keep any SSE connection open.
You could think of it as React server components: new HTML is streamed from the server. I did find that replacing the whole thing instead of appending chunks is easier to reason about from the server side.
I did not continue with that project, as I'm not using htmx anymore; if anybody is interested, feel free to fork.
Hi, I have a server that responds with an HTTP chunked response to load data as it becomes available (this also reduces memory consumption on the server): https://en.wikipedia.org/wiki/Chunked_transfer_encoding
When I make a direct call with my browser, I can see the results pop up as they are flushed from the server (as expected). However, if I use `hx-get`, I don't see anything on the screen until the full response is sent (not the expected behavior). I see there is support for SSE and WebSockets; is there a plan to support this feature as well?
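A minimal sketch of the kind of endpoint being described, for anyone who wants to reproduce (port, timing, and markup are arbitrary):

```js
import http from "node:http";

// Three fragments flushed one second apart. Opening this directly in a
// browser shows them appear one by one; fetching it via hx-get renders
// nothing until the final chunk has arrived.
http.createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "text/html" });
  const parts = ["<p>first</p>", "<p>second</p>", "<p>third</p>"];
  let i = 0;
  const timer = setInterval(() => {
    res.write(parts[i]);
    i += 1;
    if (i === parts.length) {
      clearInterval(timer);
      res.end();
    }
  }, 1000);
}).listen(8080);
```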