quicwg / base-drafts

Internet-Drafts that make up the base QUIC specification
https://quicwg.org

Possible HoL blocking due to co-mingling payload and metadata (header) address space. #1606

Closed grmocg closed 5 years ago

grmocg commented 6 years ago

In the 'partially reliable http' use-case, e.g. an interactive (VC-like) 'video protocol', co-mingling the payload and metadata (header) address space on the QUIC stream prevents interpretation of payload data until all metadata has been received, in cases which attempt to use the address space to assert framing (e.g. I know that the data at any location where n % k == 0 is the start of a record).

This method of framing is unavailable (until all headers have been received) with QUIC as it is today, since the size of the metadata the client will receive is unknown to both the client and the sender (it might be encoded by a different compressor, e.g. via a reverse proxy), and since the address spaces of metadata and payload are co-mingled.

In other words, in cases where the client knows the relationship of the requests/responses (for instance with a server push response to a particular request) and so doesn't need the headers immediately, it is advantageous to have an address space for payload that starts with a known, constant value.
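The framing property described above can be sketched concretely. In this illustrative helper (the record size `k` and the function name are hypothetical, not from any draft), a known, constant payload start lets any received offset be classified immediately, while an unknown metadata prefix blocks classification entirely:

```python
RECORD_SIZE = 4096  # hypothetical fixed record size k

def record_boundary(offset, header_len=None):
    """Map a stream offset to (record index, intra-record position),
    if determinable.

    With a payload address space starting at a known constant,
    any offset is immediately classifiable.  If the payload instead
    begins after a metadata prefix of unknown length, classification
    must wait until header_len is known -- the HoL blocking at issue.
    """
    if header_len is None:
        return None  # blocked: can't frame until all headers received
    payload_offset = offset - header_len
    if payload_offset < 0:
        return None  # still inside metadata
    return divmod(payload_offset, RECORD_SIZE)
```

With `header_len=0` (a separate, payload-only address space), `record_boundary` never blocks.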

Consider the following:

  1. A client requests video of a particular codec, size, bitrate, etc.
  2. The server replies by sending a number of server push streams in response to that request, likely within a 'stream group' should such a thing come into existence in QUICv2.
  3. For streams which arrive after the first server push'd stream, the type of the data is known, and the data can be interpreted by the application immediately, even when received out-of-order and/or non-contiguously.

It would be expected in such a use-case to have a large number of server push'd 'responses' referring to byte ranges of files on a server/proxy/cache. The client would likely establish a few requests to which the server would never directly respond (it would instead server push responses to those 'requests'); e.g. the client would request playback of a 5MB/s H.264 bitstream, and the server would push byte ranges of a file containing (at least) the elementary stream.

Note that the headers on the pushed streams will be useful upon receipt for applications running on a framework (e.g. a browser) which may wish to cache/store and which (likely for security reasons) doesn't offer the application the ability to assert the data should be stored with URL X/ETAG Y.

LPardue commented 6 years ago

Would it work to define a new server-initiated stream type on which you send a PUSH_ID stream header and then simply the corresponding DATA frames?
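A minimal sketch of parsing such a stream (the stream type value `0x54` is purely illustrative, not an assigned codepoint; the varint decoder follows QUIC's standard variable-length integer encoding, where the top two bits of the first byte give the length):

```python
def decode_varint(buf, pos=0):
    """Decode a QUIC variable-length integer (RFC 9000, section 16)."""
    first = buf[pos]
    length = 1 << (first >> 6)          # 1, 2, 4, or 8 bytes
    value = first & 0x3F
    for b in buf[pos + 1 : pos + length]:
        value = (value << 8) | b
    return value, pos + length

def parse_push_body_stream(data):
    """Parse a hypothetical push-body unidirectional stream:
    a stream-type varint, then a PUSH_ID varint, then body bytes."""
    PUSH_BODY_STREAM_TYPE = 0x54  # illustrative value only
    stream_type, pos = decode_varint(data, 0)
    assert stream_type == PUSH_BODY_STREAM_TYPE
    push_id, pos = decode_varint(data, pos)
    return push_id, data[pos:]
```

Everything after the two varints is raw body, so any byte past that small prefix has a known payload-relative offset.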

grmocg commented 6 years ago

Sure, that is one potential way to make the address spaces separate (a header stream and an associated push stream).

Note that the intent is to still be HTTP.

ianswett commented 6 years ago

Q: What's the benefit of HTTP here? It seems like you might want RTP and/or WebRTC over QUIC?

martinthomson commented 6 years ago

I'm inclined to suggest that this is something that could be addressed in an extension. Imagine that both peers negotiate a new interaction mode whereby request and response bodies are carried on other streams. (For @LPardue: that new stream couldn't use DATA frames though, or you lose random access, not knowing where the frame headers are inserted.) Establishing which stream maps to which request is tricky, but probably able to be resolved in a way that meets the requirements. I suspect that you can't avoid having at least some metadata, but it might be possible to use a signaling mechanism for that that can be isolated from the worst of the HOLB.
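The random-access point can be shown concretely. In this toy sketch (single-byte lengths and the frame layout are simplified for illustration), the same absolute stream offset is a frame-header byte in one stream and a payload byte in another, so a receiver holding a non-contiguous range cannot classify it:

```python
def frame(payload):
    """A minimal DATA-style frame: type byte, length byte, payload.
    (Real HTTP frames use varints; this is illustration only.)"""
    assert len(payload) < 64
    return bytes([0x00, len(payload)]) + payload

stream_a = frame(b"AAAA") + frame(b"BBBB")
stream_b = frame(b"AAAAAA") + frame(b"BB")
# Both streams carry 8 payload bytes, but the byte at stream offset 6
# is a frame header in stream_a and payload in stream_b: without the
# preceding bytes, offset 6 cannot be interpreted.
```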

The OP suggests that metadata might not be important, which suggests that maybe this isn't as much HTTP as is claimed. In some ways, the more useful this protocol is for this sort of use case, the less it resembles HTTP. We should explore that some more.

I realize that there are benefits to this being the standardized approach for HTTP, but we'd need to resolve the question of similarity first. After that, we need to recognize that the structure of the current design is intentional and addresses real issues (though for some, that depends on how much you believe server push to be relevant).

grmocg commented 6 years ago

Http has explicit control for caching, scales to millions, and is already in use by the application (e.g. for getting the list of videos, thumbnails, etc.), which would otherwise have to deal with contention between multiple connections (something which decreases the accuracy of the bandwidth estimate, and would cause rebuffering if not controlled for).

Http offers a mechanism (via byterange requests on named resources) to 'heal' uploads/broadcasts if there should be an interruption, reconnection, or server restart.

Http offers all of the semantics we want except partial reliability, which is what this is about!

On Wed, Jul 25, 2018, 6:33 PM ianswett notifications@github.com wrote:

Q: What's the benefit of HTTP here? It seems like you might want RTP and/or WebRTC over QUIC?


LPardue commented 6 years ago

@martinthomson thanks for the explanation and feedback. Since it took me time to digest what you said, let me echo back my interpretation: for a more loosely coupled interaction mode, the use of DATA frames prevents random access to the QUIC stream because there'll be HTTP/QUIC frame headers at unknown locations.

One approach, therefore, might be to carry bodies directly in STREAM frames, which starts to sound like gQUIC's approach. The difference being that we might avoid use of stream IDs directly and use some other token to correlate headers and bodies.

In a partial reliability world, with the above design, lost STREAM frames are OK. They result in HTTP body gaps that can be skipped or repaired (say, by a byte-range request, as @grmocg suggests). We implement a design like this today in our multicast QUIC prototype.
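The skip-or-repair flow can be sketched as follows (a simple illustration, not the prototype's actual code; the helper names are hypothetical). Given the byte ranges that actually arrived, compute the gaps and format a byte-range repair request for them:

```python
def find_gaps(received, total_len):
    """Given (offset, length) ranges delivered on a partially
    reliable stream, return the missing inclusive byte ranges."""
    gaps, cursor = [], 0
    for off, ln in sorted(received):
        if off > cursor:
            gaps.append((cursor, off - 1))
        cursor = max(cursor, off + ln)
    if cursor < total_len:
        gaps.append((cursor, total_len - 1))
    return gaps

def range_header(gaps):
    """Format the gaps as an HTTP Range header for a repair request."""
    return "Range: bytes=" + ", ".join(f"{a}-{b}" for a, b in gaps)
```

A player can feed the received ranges onward immediately and issue the repair request (or skip the gaps) asynchronously.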

grmocg commented 6 years ago

@martinthomson

HTTP's headers are important for caching, but may not be important for the initial streaming. This is still interesting for broadcasts-to-millions, since any hiccup on ingestion of the stream causes a hiccup for any follower/viewer of the stream.

I'm assuming we'd likely also be using a stream-group (i.e. a grouping of a bunch of HTTP requests/responses together). Within that stream-group (which is really better discussed in that issue), we can know that we're sending multiple responses for a particular request (for the playback side). Depending on the eventual interaction with stream-group and headers, a great many assumptions may be possible for the requesting client.

Any other client would not be able (unless there is some non-HTTP logic on the server) to make such similar assumptions, and would need more data in headers.

There are certainly a few ways to play with this:

  1. Assume that the headers are below offset X; start payload at offset X.
  2. Transmit a new frame type which specifies the offset of the headers data, until the receiver acks that it has received it.
  3. Use different streams for each (this interacts in fun ways with flow control/max streams). If you assume one of the bits is used to indicate metadata vs. payload, that could be sufficient.

... and more.
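Option 2 above can be sketched as receiver-side state (the frame name and class are illustrative, not part of any draft): once the headers length is announced, payload bytes that arrived out of order become framable immediately.

```python
class PayloadOffsetTracker:
    """Sketch of the 'announce the headers offset' option: a
    hypothetical HEADERS_LENGTH signal tells the receiver where
    metadata ends, so payload received out of order can be framed."""

    def __init__(self):
        self.headers_len = None
        self.pending = []          # (stream_offset, data) awaiting framing

    def on_headers_length(self, n):
        """Record the announced metadata size; release buffered payload
        with payload-relative offsets (metadata bytes are handled elsewhere)."""
        self.headers_len = n
        ready, self.pending = self.pending, []
        return [(off - n, data) for off, data in ready if off >= n]

    def on_stream_data(self, offset, data):
        """Return (payload_offset, data) if framable now, else buffer."""
        if self.headers_len is None:
            self.pending.append((offset, data))
            return None
        return (offset - self.headers_len, data)
```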

krasic commented 6 years ago

@grmocg out of curiosity, is there an issue about stream-groups? ( I don't see it ).

grmocg commented 6 years ago

Yup, though I called it something more generic under the assumption that 'stream group' may not be the only way to solve the issue:

https://github.com/quicwg/base-drafts/issues/1073

krasic commented 6 years ago

To the questions about HTTP applicability, an example might be recent work on low-latency DASH with CMAF. For example, https://www.slideshare.net/Akamaidev/the-road-to-ultra-low-latency, esp. slide 9. Existing variations use HTTP 1.1 chunked transfer, with HTTP chunks aligned to the fMP4 fragments, improving latency over conventional live DASH. As I understand this thread, the idea is to improve further, reducing encoder/transcoder/origin latency closer to the minimum (i.e. one media frame).

MikeBishop commented 6 years ago

The design from one of the unidirectional forks, where DATA frames described on which other stream to expect each chunk, is actually no better for you, because you still need the rest of the header stream to get appropriate correlation information. Basically, for any design which permits the body not to be a single contiguous unit on its own stream, you'll have this problem.

As @martinthomson says, the current design also addresses real issues, and I don't know that we want to rework our stream mapping again at this point. We decided in Paris to go to a single stream because we wanted to be able to provide ordering amongst metadata and body chunks.

LPardue commented 6 years ago

I think the current stream design is good for the conventional HTTP case. I think that more specialised HTTP applications, where the resources are more self-describing, can benefit from alternate designs. To paraphrase some of the above discussion, a request for a DASH initialisation segment can elicit a bunch of PUSH_PROMISEs for media segments (easily generated because they are predictable, so minimal HoL blocking). Soon after, media segment body data is delivered on a new type of HTTP/QUIC server-initiated unidirectional stream that delivers segment body on STREAM frames. Some time later, response headers can be delivered on yet another new stream type, to be used for caching etc.

This allows the media segment data to be fed directly into a player pipeline closer to the rate of delivery.
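The two-stream-type split can be sketched as a receiver-side dispatcher (the stream type values and function are hypothetical, purely to show the routing): segment bodies go straight to the player pipeline, late-arriving headers go to the cache layer.

```python
def dispatch_stream(stream_type, payload, player, cache):
    """Route the hypothetical stream types sketched above:
    media-segment bodies feed the player at delivery rate, with no
    wait on header data; response headers feed the cache later."""
    BODY_STREAM, HEADERS_STREAM = 0x54, 0x55   # illustrative values
    if stream_type == BODY_STREAM:
        player.append(payload)
    elif stream_type == HEADERS_STREAM:
        cache.append(payload)
```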

MikeBishop commented 6 years ago

I'm marking this v2 for now, because I think it's out of scope for generic HTTP mapping. However, I think Lucas is correct that the better path forward would be as an extension that replaces/augments server push with body-only streams for these specialized scenarios.

grmocg commented 6 years ago

@LPardue Yup.

I think/hope that we can come up with a cheap and standard way of representing the end of the initial headers to enable these use cases. An example was to have a new frame type (at the 'http' layer) which allows us to state the offset of the metadata (for a few RTTs / until we know a packet containing it has been received).

I'm not worried about in-line metadata, since that is arguably 'payload' in a world where we can do stream-groups and multiple response streams (which would be arguably more flexible than a single response with multiple metadata segments for header-size-ish metadata).

@MikeBishop

In my estimation, this isn't just a 'server push' issue, as one would really want to do the same thing on ingestion (uploading), not just playback (downloading). It becomes an HTTP mapping issue (whether for quicv1 or quicv2) since it impacts not just push, but also GET/PUT, and should fit into whatever prioritization stuff is happening as well.

Consider that one of the bigger issues with video playback (and rebuffers), especially on browsers (but also on Android/iOS apps), is "application" self-contention. A page/application/domain makes requests for HTTP objects but, due to the lack of coordination (today) between the video uploading/downloading stuff (most often not HTTP today for low-latency stuff) and the non-video HTTP stuff, the bandwidth estimate becomes incorrect, the buffer drains prematurely, and the video transfer (which expects at least a certain bitrate over short time periods) rebuffers, much to the user's chagrin.

Anyway that is part of the thought process!