Open alficles opened 4 years ago
I think PUSH is fascinating. One thing that comes up for me is that PUSH is useful for sending content before it is requested, but not necessarily sending updates to content after it has been requested.
Most of what we are doing is the latter. We do this by holding open GET requests, and streaming new versions there.
But it would also be useful for a server to PUSH a subscription to a resource that it knows the client will want, before the client issues the request. We could totally push an entire GET+Subscribe response, which pushes a stream of updates like this, before requested.
I imagine we will want to standardize this later. It's been a pretty glaring hole in H2 PUSH that there aren't any semantics for when the server should push. This semantic needs standardization. I'd love for us to tackle it.
It seems to me that the underlying semantic people want is something like "give me all the other documents I'm going to want to request that are linked from this document." And I have noticed that this is a problem that GraphQL solves. I imagine we will want something similar to that, and have been sketching out some potential solutions myself.
My personal efforts have been prioritizing synchronization before this GraphQL-over-PUSH work, but I am a big fan of the concept.
Another use for PUSH would be a bit radical: enabling a server, and not just a client, to initiate a request.
This could make the web more P2P.
But when we do this, it actually becomes a lot more elegant to let go of the request/response model entirely, because a server initiating a PUT is semantically equivalent to responding to a GET: both communicate "here's the new state of the resource".
This was a concept in the old draft toomim-braid-00 that we haven't found a home for in the new draft:
4.2. Generalized request/response
In HTTP, a client sends a *request* to the server, and that request is
met with a *response*. By contrast, a Braid connection is two-way, so
messages can be initiated by either party. Rather than giving a
response to a message, a Braid server sends a separate message that
acts as the response. It turns out that a GET response message has
the same effect on a peer as a SET request message-- both set the
state on the recipient.
--------------------------------------------------------------
| HTTP | Braid | Meaning |
| ------------ | ----------- | ----------------------------- |
| Get Request | Get message | "I want this" |
| Get Response | Set message | "This is the current version" |
| Put Request | Set message | "This is the current version" |
| Put Response | Ack message | "I accept this version" |
--------------------------------------------------------------
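A minimal sketch of the table above, to make the "GET response and PUT request are the same message" point concrete. The message shape (`type`, `path`, `body`), the `receive` function, and the use of a `Map` as peer state are all assumptions of this sketch and are not taken from the draft. Acknowledging every set is also a simplification.

```javascript
// Mapping from the table: HTTP message kinds -> Braid message kinds.
const HTTP_TO_BRAID = {
  "GET request": "get",
  "GET response": "set",
  "PUT request": "set",
  "PUT response": "ack",
};

// Apply one Braid-style message to a peer's local state.
// `state` is a Map from resource path to current value (hypothetical).
function receive(state, message) {
  switch (message.type) {
    case "get":
      // "I want this": reply with a set carrying the current version.
      return { type: "set", path: message.path, body: state.get(message.path) };
    case "set":
      // "This is the current version": identical handling whether HTTP
      // would have called it a GET response or a PUT request.
      state.set(message.path, message.body);
      return { type: "ack", path: message.path };
    case "ack":
      // "I accept this version": nothing further to do.
      return null;
    default:
      throw new Error(`unknown message type: ${message.type}`);
  }
}

// Either peer can start the exchange; here the client asks and the server answers.
const server = new Map([["/feed", [{ link: "/message/1" }]]]);
const client = new Map();
const reply = receive(server, { type: "get", path: "/feed" });
const ack = receive(client, reply);
```

Because both peers run the same `receive`, nothing in the sketch depends on which side opened the connection.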
So, PUSH is a useful way for servers to answer questions clients didn't know they had yet. Right now, that's mostly the answer to the question "what's at this resource" when you just handed them a link to the resource that they haven't had a chance to process.
Here, though, the question is "what is the update for this resource" when the client can't have known it was even updated.
As a practical matter, I really like PUSH here, because the processing is the same whether it's pushed or comes as a response to a request. You receive the object, determine whether it's a normal object (store in the normal way, typically by caching it somewhere) or an update (update the object in the normal way).
It also allows you to use PUSH optimistically, as long as the client hasn't disabled PUSH. If you've got an update and the connection isn't dead yet, go ahead and send it. If the client doesn't understand the content-type, it will be ignored.
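A rough sketch of the "same processing either way" argument, assuming a client that dispatches only on content type and never looks at whether the object was pushed or requested. The update media type, the `handleIncoming` name, and the shallow merge used as a stand-in for real patch application are all hypothetical.

```javascript
// Hypothetical media type for updates; the thread does not name one.
const UPDATE_TYPE = "application/x-example-update+json";

// Handle one received object. The caller does not say whether it arrived
// via PUSH or as a response, because the handling does not depend on it.
function handleIncoming(cache, { path, contentType, body }) {
  if (contentType === "application/json") {
    // A normal object: store it in the normal way.
    cache.set(path, body);
    return "stored";
  }
  if (contentType === UPDATE_TYPE) {
    // An update: apply it to what we already hold.
    // A shallow merge stands in for a real patch format here.
    const current = cache.get(path) ?? {};
    cache.set(path, { ...current, ...body });
    return "updated";
  }
  // A content type this client does not understand is simply ignored,
  // which is what makes optimistic pushing safe.
  return "ignored";
}
```

A server that pushes optimistically only risks wasted bytes: a client that does not know the update type takes the last branch and its cache stays unchanged.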
Ok, I appreciate these points. You've made clear that we could send updates over PUSH instead of over a GET stream, and that this could possibly simplify the code for H2 clients. However, since PUSH doesn't work for H1, we'd need two different subscription mechanisms in the protocol, one for H1 and another for H2, and I believe that could end up requiring more code in the end.
The current protocol works for both H1 and H2. A client can be implemented with the same JavaScript for both, with one minor caveat that might go away in the next draft.
I think the way to evaluate this is to try implementing both versions of the protocol and then evaluate the code. We should have a reference client (without PUSH) passing fuzz tests within the next couple of months. We could then try a PUSH version and see which one we like better. Does this sound right to you?
You're going to need two different mechanisms for H1 and H2+ anyway. H2+ does not have any way to handle per-hop behaviour via headers. As it stands, an H2 proxy will be incapable of behaving properly with the existing proposal. This is the issue being discussed in #62.
I'm sorry, I think I closed this issue prematurely. Re-opening.
OK, just to be clear, HTTP2 PUSH does not allow a server to send the client data that the client is not aware of (e.g., a message of type "hey, there was an update to document X"). HTTP2 PUSH only allows a server to pre-populate a special client cache with data that the client then requests, at which point the data is magically already there. It is much less useful than it looks.
SSE or WebSockets is probably what we really want here.
This was well stated:
HTTP2 PUSH only allows a server to pre-populate a special client cache with data that the client then requests, at which point the data is magically already there.
^^ This is a use-case we want, though. I run into this need all the time, with real apps I'm building on Statebus.
Imagine I'm fetching a feed of messages represented with linked JSON:
/feed:
[
{ link: '/message/1' },
{ link: '/message/2' },
...
]
/message/1:
{
author: '/user/mike',
message: 'Hi guys!',
date: 1649719249
}
...
The client will want to fetch the list of messages, and then each message within, but it won't know the URLs of each inner message until it fetches the outer feed. This is a perfect use-case for HTTP2 Push!
In fact, I've been hoping to propose a new type of Range Request to let us traverse Linked JSON like a Graph, and achieve the equivalent of a GraphQL query using regular HTTP headers + HTTP2 push. The client's request could look like this:
GET /feed
Subscribe: true
Range: linked-json=[**]
This Range: header uses a hypothetical new Range Unit called linked-json, where [*] means "give me all items in the array", and [**] means "give me all items in the array, and descend recursively across links to give me everything within the links."
Now, I'm not claiming that this is actually the right syntax. We might just want to re-use GraphQL syntax. I'm only illustrating this to make concrete the situation in which HTTP2 push becomes very useful for Braid apps, by reducing the round-trips required for a client to fetch nested linked data.
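To illustrate what a server might do with the hypothetical `[**]` range, here is a small sketch that walks linked JSON and lists the resources worth pushing alongside `/feed`. Several things are assumptions of the sketch and do not come from any spec: the in-memory `store`, the rule that any string value starting with `/` is a link, and the sample contents of `/message/2` and `/user/mike`, which are made up for illustration.

```javascript
// In-memory stand-in for the server's resources (sample data is illustrative).
const store = {
  "/feed": [{ link: "/message/1" }, { link: "/message/2" }],
  "/message/1": { author: "/user/mike", message: "Hi guys!", date: 1649719249 },
  "/message/2": { author: "/user/mike", message: "Second post", date: 1649719300 },
  "/user/mike": { name: "Mike" },
};

// Collect every link inside a JSON value.
// Assumption: a link is any string value that starts with "/".
function linksIn(value) {
  if (typeof value === "string") return value.startsWith("/") ? [value] : [];
  if (value === null || typeof value !== "object") return [];
  return Object.values(value).flatMap(linksIn);
}

// Decide which extra resources to push for a request on `root`.
// "[*]" asks only for the array items, so nothing extra is pushed;
// "[**]" follows links breadth-first, visiting each resource once.
function resourcesToPush(store, root, range) {
  if (range !== "[**]") return [];
  const seen = new Set([root]);
  const queue = [root];
  const pushed = [];
  while (queue.length > 0) {
    const path = queue.shift();
    for (const link of linksIn(store[path])) {
      if (seen.has(link) || !(link in store)) continue;
      seen.add(link);
      pushed.push(link);
      queue.push(link);
    }
  }
  return pushed;
}

console.log(resourcesToPush(store, "/feed", "[**]"));
// → [ '/message/1', '/message/2', '/user/mike' ]
```

Without the pushes, the client needs one round-trip for the feed, another for the messages, and a third for the author. With them, it needs one.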
But to be clear, Mitar's main point is spot-on: HTTP2 push is not a good way to send updates to subscriptions. We can do that more directly on top of HTTP2 frames themselves.
The client will want to fetch the list of messages, and then each message within, but it won't know the URLs of each inner message until it fetches the outer feed. This is a perfect use-case for HTTP2 Push!
It is! This is also what the awesome Vulcain does. This is why I am also in favor of subscriptions themselves being just messages saying "document with ID changed from version Y to version X" without any other payload, after which you make another HTTP request to get the patch between Y and X. Because the subscription already pushed that patch, you would already have it, and that patch request can then be cached easily. (Whereas if the patch is a frame inside a WebSocket, for example, you have a problem.)
The issue, though, is that Chrome is working on removing PUSH support, so this approach would not work (and the alternatives they are proposing do not support this use case).
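A sketch of the notify-then-fetch flow described here, assuming the client keeps a cache keyed by patch URL and that a pushed patch lands in that cache before the notification is processed. The URL scheme in `patchUrl`, the `makeClient` helper, and the synchronous `fetchPatch` callback are hypothetical simplifications.

```javascript
// Hypothetical URL scheme for "the patch between version `from` and `to`".
function patchUrl(id, from, to) {
  return `${id}?from=${from}&to=${to}`;
}

// Build a client whose patch requests go through an ordinary URL-keyed cache.
// `fetchPatch(url)` stands in for a real (cacheable) HTTP GET.
function makeClient(fetchPatch) {
  const cache = new Map();
  let networkRequests = 0;
  return {
    // Called when the server pushes a patch ahead of the request for it.
    acceptPush(url, patch) {
      cache.set(url, patch);
    },
    // Called for a payload-free notification "id changed from Y to X".
    onNotification({ id, from, to }) {
      const url = patchUrl(id, from, to);
      if (!cache.has(url)) {
        // No pushed copy: fall back to a normal request.
        networkRequests += 1;
        cache.set(url, fetchPatch(url));
      }
      return cache.get(url);
    },
    get networkRequests() {
      return networkRequests;
    },
  };
}
```

If PUSH is unavailable, the same client still works. It simply pays one request per notification, and any HTTP cache along the way can serve that request.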
I've been hoping to propose a new type of Range Request to let us traverse Linked JSON like a Graph, and achieve the equivalent of a GraphQL query using regular HTTP headers + HTTP2 push.
Vulcain already defines a set of headers for this. They may be useful as inspiration, or could be adopted as-is.
We can do that more directly on top of HTTP2 frames themselves.
Hm, but why not just use WebSockets over HTTP2 at that point? https://datatracker.ietf.org/doc/html/rfc8441
This would be a pretty big update to the document, perhaps most appropriate as another document entirely. But I think it would be useful for servers to be able to provide object updates via PUSH on an open H/2+ connection. H/1 doesn't have such a concept, so I see why a new protocol is needed there, but H/2+ has all the primitives we need to handle the update propagation.