
Need a way to tell if server supports priority #1274

Closed guoye-zhang closed 3 years ago

guoye-zhang commented 4 years ago

The extensible priority scheme is an extension that a server might choose not to implement. However, Low-Latency HTTP Live Streaming currently depends on HTTP/2 dependencies and weights, and if it is to switch to the new scheme, the client needs to be able to tell whether the server supports priorities. If priorities aren't supported, the client would disable low-latency features and fall back to regular HLS.

LPardue commented 4 years ago

I believe this is already covered under section 2.1 https://tools.ietf.org/html/draft-ietf-httpbis-priority-01#section-2. An HTTP/2 server that implements the extension MUST send the setting.
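For illustration, a client could watch for that setting when the connection's settings arrive. A sketch in Node.js/TypeScript; it assumes a recent Node release where the http2 module surfaces non-standard settings via remoteCustomSettings, and uses 0x9, the identifier assigned to SETTINGS_NO_RFC7540_PRIORITIES:

```ts
import * as http2 from "node:http2";

// Identifier assigned to SETTINGS_NO_RFC7540_PRIORITIES (0x9).
const SETTINGS_NO_RFC7540_PRIORITIES = 0x9;

const session = http2.connect("https://example.com", {
  // Ask Node to surface this non-standard setting if the peer sends it
  // (remoteCustomSettings is only available in newer Node releases).
  remoteCustomSettings: [SETTINGS_NO_RFC7540_PRIORITIES],
});

session.on("remoteSettings", (settings) => {
  // A value of 1 means the server is abandoning the RFC 7540 tree scheme,
  // which under the draft implies it implements extensible priorities.
  const custom = (settings as http2.Settings & {
    customSettings?: Record<number, number>;
  }).customSettings;
  console.log(
    "extensible priorities advertised:",
    custom?.[SETTINGS_NO_RFC7540_PRIORITIES] === 1,
  );
  session.close();
});
```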

guoye-zhang commented 4 years ago

@LPardue This is not exactly about detecting the old or new priority scheme in HTTP/2. We also want to know if an HTTP/3 server supports priorities. And when the HTTP/2 dependency scheme is removed in the next version of the RFC, there could be HTTP/2 servers that don't support either scheme.

LPardue commented 4 years ago

OK, it sounds like a slightly different requirement.

An explicit "I support extensible priorities" signal would likely want to future-proof itself for some other scheme or signal. This is a slippery slope of complexity. I have an unpushed branch for draft-lassey-priority-setting that explored advertising multiple priority schemes and agreeing on one. This got gnarly pretty quickly, and seemed to me way more complicated than needed for the problem the design team set out to solve.

Looking at LLHLS, it says

Efficient delivery requires HTTP/2 priority control (dependencies and weights)

This requirement is IMO at odds with the premise of priorities, even the H2 scheme we have today. RFC 7540 says:

An endpoint cannot force a peer to process concurrent streams in a particular order using priority.

Given that LLHLS seems to be a specialisation of HTTP/2, I wonder if it would be better served by an ability for a server to advertise if it supports LLHLS or not, rather than building up the picture from protocol foundations or extensions.

LPardue commented 4 years ago

And while exploring this tangent, I wonder how JavaScript-based HAS players using Fetch or XHR would approach connection-based HTTP feature detection. Maybe @wilaw could share some insight about how this can or can't be done.

roger-on-github commented 4 years ago

@LPardue the Apple LL-HLS client will fall back to regular latency if the server does not support a required set of functionality, because the algorithms will behave improperly otherwise. This includes request multiplexing within a single connection, the ability to strictly order pipelined requests for a single flavor of media, and the ability to prioritize simultaneous requests for certain flavors (bit rates, metadata) above others.

This does not require negotiation of prioritization schemes, which is the direction you seem to have gone in your "slippery slope of complexity" tangent. It only requires that the service positively indicate the current generation of scheme that it supports.

guoye-zhang commented 4 years ago

One of the goals of HLS is that it can be served by any HTTP server and takes advantage of existing CDN infrastructure, so we want to avoid a dedicated signal for LLHLS if possible.

LPardue commented 4 years ago

The use case is quite focused. The general problem I see is that if you have both endpoints stating they support multiple future schemes, how do the endpoints agree on which scheme to use? If you don't agree on one, then you'll end up having to send them all, and they might not even be compatible. That's not a very nice situation.

One problem today is that all HTTP/2 servers support the tree-based scheme, but they might not respect the prioritization requirements that you have. If we had a means to state extensible priority support, I wouldn't be surprised if some servers also didn't live up to the requirements. That's why I question the value of such a signal.

guoye-zhang commented 4 years ago

The hope is that this priority scheme is general enough for the LLHLS use case. Inventing another prioritization scheme that achieves largely the same thing would not be ideal. Nor would it be ideal if the server were forced to interpret this priority scheme in a more specific way to support LLHLS.

I wonder if it's possible to add feature detection for HTTP extensions in general. It would also be useful for, for example, resumable upload support.

LPardue commented 4 years ago

The hope is that this priority scheme is general enough for the LLHLS use case. Inventing another prioritization scheme that achieves largely the same thing would not be ideal. Nor would it be ideal if the server were forced to interpret this priority scheme in a more specific way to support LLHLS.

Then I'm confused. The issue appears to state that a client needs to know if the server provides the ability to "strictly order pipelined requests for a single flavor of media, and the ability to prioritize simultaneous requests for certain flavors (bit rates, metadata) above others." Since extensible priorities does not mandate server behaviour, advertising that it is being used isn't a strong enough contract for you.

kazuho commented 4 years ago

I can see the desire to use different prioritization signals based on what the server claims to do. But I think I agree with @LPardue that, in general, it is dangerous to assume that two properties are equivalent: whether a server implements a prioritization scheme, and whether a client can exploit certain capabilities of that scheme.

To give an example, consider the case where an H2 terminator sits in front of an H1 server, the number of concurrent requests from the H2 terminator to the H1 server is limited to 5 per H2 connection, and the stream-level concurrency of the H2 connection is 100. This is a reasonable configuration: it provides enough concurrency for the H2 hop to hide the latency, and avoids excessive use of the application server (speaking H1). But a client cannot utilize the capability of receiving 100 incremental responses concurrently.

roger-on-github commented 4 years ago

I'm confused about your apparent desire to not signal prioritization capability. How is a client to use any sort of prioritization scheme if the supported scheme(s) are not signaled?

LPardue commented 4 years ago

Sending the signals causes no interoperability problems if the target server does not support the prioritization scheme. This is part of the strength of extensible priorities, that it can be extended without the requirement of upfront coordination. Adding a signal for supporting extensible priorities could soon degrade to needing to enumerate all supported priority parameters.
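To make that concrete: a client can attach the signal to every request unconditionally, and a server that has never heard of the scheme just ignores the unknown header field. A minimal sketch using the fetch API (URL illustrative):

```ts
// Send the extensible priority signal on every request. No prior negotiation
// is needed: a server that does not implement the scheme ignores the header.
const response = await fetch("https://example.com/segment1.mp4", {
  headers: { Priority: "u=2, i" }, // urgency 2, incremental delivery
});
```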

wilaw commented 4 years ago

Quoting from this source and repeating a point made by Lucas earlier about RFC 7540 "Stream dependencies and weights express a transport preference, not a requirement, and as such do not guarantee a particular processing or transmission order. That is, the client cannot force the server to process the stream in a particular order using stream prioritization. While this may seem counterintuitive, it is in fact the desired behavior. We do not want to block the server from making progress on a lower priority resource if a higher priority resource is blocked.".

So it would seem that any LL-HLS client cannot depend upon prioritization taking place in a H2 conformant server. The client may ask for prioritization but it will not receive an explicit signal that this has been implemented by the server. It will receive an implicit signal however, which is the turn-around time and throughput it receives on its playlist and part requests. Its algorithm should be robust at interpreting and handling these signals. The performance a client receives over a poor last mile network talking to a server implementing prioritization may well be indistinguishable from the performance it receives from a good last mile network talking to a non-prioritizing server. It needs to work under either scenario. It may need to back off on its target latency based on these signals for example.

roger-on-github commented 4 years ago

@LPardue so your recommendation to clients is that they just start playing whack-a-mole with the server, sending it a variety of prioritization signals, until they find one that it responds to?

LPardue commented 4 years ago

Since there is no way to validate that a server behaves in any particular way that a client wants, clients are already playing whack-a-mole. HTTP/2 prioritization signal parsing is mandatory to implement, but acting on the signals is not. That leads to varying quality of implementation; see https://github.com/andydavies/http2-prioritization-issues where a significant portion of tested services fail a certain type of test.

Furthermore, servers can take any signals into account when making prioritization decisions. That could be signals outside this scheme, which may interfere with your prioritization requirements.

Would you really be happy with a setting that says "I support parsing the Priority header"? We don't tend to have such signals for other HTTP headers; it seems like an anti-pattern.

If an LL-HLS client wants a server to assert that it supports LL-HLS, it seems more natural to have a setting that advertises just that. Then you can apply more stringent prioritization/multiplexing requirements and performance goals to that server.

roger-on-github commented 4 years ago

To use an optional feature you need a signal that says "I support a particular set of semantics." You have to consider it from the POV of a client who wants to use the feature:

CLIENT: Hey server, there's this feature I want to use. Do you support it?
SERVER: Well, I support its syntax.
CLIENT: So you support it?
SERVER: Maybe! Your feature uses it and was defined in 2020. A couple of other features use the same syntax, one was defined in 2021 and another in 2023.
CLIENT: Hey, the 2023 feature uses the same bit to do something different that would kill my performance!
SERVER: Oooh, you're right.
CLIENT: So can I burn a few RTTs with a test to see how you treat that bit?
SERVER: Possibly. But your feature is best-effort so you'll need to do it a bunch of times to get confidence.
CLIENT: I guess I can't use this feature after all. Who designed this, anyway?

Let's look at IP protocols with optional features that have been well-adopted. TCP and TLS both do optional features right: they provide positive signals of support. They have achieved broad adoption with a minimum of drama.

Then there are counter-examples of features that are unsignaled and implicit. HTTP 1.1 pipelining and Range requests come to mind. They are both disastrous (for their intended use) when used against servers that don't support them, and their adoption has suffered as a result. It was bad protocol design then and it's bad protocol design now.

LPardue commented 4 years ago

I think this is a strong case of you ain't gonna need it. I don't anticipate someone defining another prioritization scheme. But if they do, and we don't know what it is, then designing a negotiation mechanism to help select different schemes now has a risk of failure.

As I said earlier, extensible priorities is an extension that itself is extensible. If you want absolute cast-iron guarantees from endpoints about what they do, then you have to run the task to completion.

I'm not familiar with other HTTP extensions that have this upfront server-advertisement requirement. If there are, I'd appreciate references to them. Otherwise, their absence suggests that HTTP feature detection is a problem that is not trivially solved.

Likewise, if you have proposals that would satisfy your requirements then please write them up and share them as a PR, because I suspect people are finding it hard to talk around a solution when little is written down.

reschke commented 4 years ago

@roger-on-github

Then there are counter-examples of features that are unsignaled and implicit. HTTP 1.1 pipelining and Range requests come to mind.

Pipelining is not an optional feature. Yes, there's breakage, but it's not an example of extensibility gone wrong.

Range requests/responses IMHO have all the signalling that is needed, so it would be nice if you could be more specific.

roger-on-github commented 4 years ago

@reschke a client that issues a Range request for the middle 140 bytes of a 3GB file to a server that does not handle Range requests will receive a 200 and 3GB of data. For a client that was trying to use Range requests to limit itself to an operationally required amount of data, getting 3GB instead is a terrible outcome.
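A client can at least detect that outcome after the fact from the status code. A sketch using fetch (URL and byte range illustrative):

```ts
// Ask for 140 bytes from the middle of a large file, then verify the server
// honoured the Range header: 206 means it did; 200 means the full body is
// coming and we should cancel before downloading 3GB.
const res = await fetch("https://example.com/big.bin", {
  headers: { Range: "bytes=1500000000-1500000139" },
});

if (res.status !== 206) {
  await res.body?.cancel(); // abandon the response body early
  throw new Error("server ignored the Range header");
}
```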

roger-on-github commented 4 years ago

It's not a strong case of you ain't gonna need it, it's more like a strong case of "nobody's going to use it because the outcome is undefined and might actually be worse than using it." Not anticipating someone defining another prioritization scheme is a poor excuse not to plan for it.

At this point I've moved beyond the specifics of LL-HLS, to criticizing the overall plan to provide optional, extensible, unsignaled prioritization capability in H3 (or H2), on behalf of any client that might want to use it for anything.

At the very least, to make it adoptable and future-proof you would need to do the following:

  1. Restrict the request syntax in a way that is uniquely bound to a particular capability set so that a client making the request is assured that it cannot be interpreted as any other semantic. Maybe by reserving a frame type for it, maybe with an IANA-managed capability selector and a version number, or something else.

  2. Provide a positive indicator if the responding server could not support the requested capability. In the case of a prioritization request to a server that does not support that prioritization method, for example, this might take the form of a mandatory "Prioritization-Exception" HTTP response header that indicated that the server does not support that method. That would allow the client to react appropriately, such as by changing its request pattern or by trying a different prioritization method.

LPardue commented 4 years ago

Not anticipating someone defining another prioritization scheme is a poor excuse not to plan for it.

I did the thought experiment and wrote a candidate design. I didn't like the complexity this added to the protocol. Please present a design that you believe will address your requirements and the WG can assess it.

Restrict the request syntax in a way that is uniquely bound to a particular capability set so that a client making the request is assured that it cannot be interpreted as any other semantic. Maybe by reserving a frame type for it, maybe with an IANA-managed capability selector and a version number, or something else.

The syntax is a Structured Headers dictionary, carried in a header or a frame. We've had the headers-vs-frames debate in the WG; the outcome was to provide both. Revisiting this would need WG chair intervention.

Provide a positive indicator if the responding server could not support the requested capability. In the case of a prioritization request to a server that does not support that prioritization method, for example, this might take the form of a mandatory "Prioritization-Exception" HTTP response header that indicated that the server does not support that method. That would allow the client to react appropriately, such as by changing its request pattern or by trying a different prioritization method.

This cannot be done with a Structured Headers dictionary; see https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-17#section-3.2

Typically, a field specification will define the semantics of Dictionaries by specifying the allowed type(s) for individual member names, as well as whether their presence is required or optional. Recipients MUST ignore names that are undefined or unknown, unless the field's specification specifically disallows them.

The proposal here requires design changes to the protocol. There'd need to be WG support for making such changes.
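To illustrate the quoted rule: a compliant recipient drops unknown dictionary members before the application ever sees them, so a new mandatory member cannot be relied on. A toy sketch (a real implementation would use a proper Structured Fields parser; the prioritization-exception member is the hypothetical one proposed above):

```ts
// Toy illustration of the Structured Headers rule quoted above: dictionary
// members the recipient does not recognise are silently discarded.
const KNOWN_MEMBERS = new Set(["u", "i"]); // urgency and incremental only

function parsePriority(headerValue: string): Map<string, string | boolean> {
  const result = new Map<string, string | boolean>();
  for (const member of headerValue.split(",")) {
    const [name, value] = member.trim().split("=");
    if (!KNOWN_MEMBERS.has(name)) continue; // unknown names MUST be ignored
    result.set(name, value ?? true); // a bare key is boolean true
  }
  return result;
}

// The hypothetical mandatory member never survives parsing:
parsePriority("u=3, i, prioritization-exception=1"); // Map { "u" => "3", "i" => true }
```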

reschke commented 4 years ago

@roger-on-github

...it could do a HEAD request first and check for Accept-Ranges.
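For example, a sketch using fetch:

```ts
// Probe range support up front with a HEAD request, as suggested above.
const head = await fetch("https://example.com/big.bin", { method: "HEAD" });

// "bytes" advertises byte-range support; "none" or an absent header means a
// later Range request may be answered with a full 200 response instead.
const rangesSupported = head.headers.get("accept-ranges") === "bytes";
```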

kazuho commented 4 years ago

@LPardue

Not anticipating someone defining another prioritization scheme is a poor excuse not to plan for it.

I did the thought experiment and wrote a candidate design. I didn't like the complexity this added to the protocol.

I think that there is a distinction between having a signal and having a plan.

While it is true that we do not have a signal to indicate support for extensible priorities, that does not mean the new scheme cannot be extended or replaced. If the necessity arises, people can define new parameters to extend the scheme, or define a new set of header fields / frames that supersedes extensible priorities.

Regarding whether it would be a good idea to define a signal indicating support for extensible priorities now, under the premise that it would be used by the client to change its behavior, I would reiterate what others (including me) have pointed out: don't do that, because recognizing the prioritization signal does not mean that the server would (or could) obey the signals sent by the client.

The hard lesson we learned from HTTP/2 is that many servers do not (correctly) implement the prioritization scheme, and we have seen poor performance because of that. That's why we are introducing a new scheme that is conservative, one that does not lead to unnecessarily bad performance even when there are issues within the endpoints.

mnot commented 3 years ago

@roger-on-github have you considered putting a signal inside the manifest?

roger-on-github commented 3 years ago

Responding collectively:

I agree with @LPardue that it's difficult to have this discussion without a concrete proposal. So I've created a PR with a proposed design here: https://github.com/httpwg/http-extensions/pull/1283

To address the concern raised by @kazuho , my proposal does not require a promise of priority support, only an acknowledgment that a requested prioritization was applied to a particular response.

@mnot : asking intermediaries to change their delivery behavior based on deep inspection of the response is not something I considered, no. Is that a recommended pattern for HTTP delivery these days? Do you know of any examples that have achieved wide adoption (i.e. by a broad range of CDNs)? It seems like a layer violation to me.

LPardue commented 3 years ago

Thanks for the proposal @roger-on-github .

In a nutshell it defines a new parameter ap that is mandatory and attests to the server having applied something.

The problem I have with that design is that, as an implementer of an HTTP/3 stack, I depend on the QUIC transport to manage bandwidth allocation. My server is dependent on the behaviour of the local stack, the client's flow control (stream and connection), and retransmission logic. All of those things affect how the client could perceive bandwidth usage over some time period, and they happen somewhere that is not exposed to my serving application at the time I emit the response header.

It is unfair to say that this server has not applied the priority signal; it will most likely do what the client asked for. And if the client is going to measure bandwidth allocation and make informed decisions for future requests anyway, I'm not sure what the value of this parameter is.

This new parameter also introduces a downside. Currently the response header is defined such that a server uses it to inform the client that the priority was overwritten. So in the conventional case where a server follows the client signal, we would not need to generate a response header at all. The new parameter would require the header to be sent whenever the priority was applied.

roger-on-github commented 3 years ago

The applied parameter does not attest anything more than the application of the requested prioritization to the scheduling logic at the HTTP level. It is understood that in exceptional cases the vagaries of the underlying transport can change the effective outcome. That does not negate the value of the server scheduling, nor the client's decision to ask for it.

I agree that always adding the response header adds a minor cost. If the consensus here is that this cost is significant, my proposal could be modified to add a second (request) parameter that requests this acknowledgment (or a SETTING that requests that behavior).

LPardue commented 3 years ago

I disagree that this is an exceptional case. I delegate stream scheduling to the QUIC transport layer implementation, which abstracts how it works. I will feed the transport as fast as I can. I don't know what claim my HTTP layer can make.

roger-on-github commented 3 years ago

How does your implementation behave when two files are requested, both are immediately available, and a priority of incremental=0 is requested for both?

LPardue commented 3 years ago

It's worth highlighting that H3 differs from H2 in that stream-based payload is subject to flow control and retransmission. And so to keep the transport simple, all stream contents are equally prioritised. H2 typically focused on DATA frames, which correlate to response payload only.

So take two requests with urgency=3 and incremental=0. The transport offers a larger urgency space for non-response streams; extensible priority urgency fits in the middle of that space.

For the given example, the transport will:

  1. Always prioritise trying to send on H3 control streams.

  2. Then attempt to send response data on streams in the order they were generated.

The H3 layer can only write as much data to a stream as there is flow-control credit. So if, for example, the client stopped reading the response on stream 0, the stream would become application-blocked. The transport would then move on to stream 4 in order to keep the pipe occupied.

Even if the client isn't behaving weirdly, it is possible that packets carrying window updates from client to server get lost. So the server is quite dependent on its view of the window size at any instant.
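A minimal sketch of the write loop described above (all names invented for illustration; a real transport tracks far more state):

```ts
// Hypothetical shape of the server-side scheduling decision: control streams
// always win, then response streams in creation order, skipping any stream
// that is out of flow-control credit so the pipe stays occupied.
interface SendStream {
  id: number;
  isControl: boolean;
  flowCredit: number;          // bytes the peer currently allows us to send
  pending: Uint8Array | null;  // data queued by the HTTP/3 layer
}

function nextStreamToSend(streams: SendStream[]): SendStream | null {
  const sendable = streams.filter((s) => s.pending && s.flowCredit > 0);
  const control = sendable.find((s) => s.isControl);
  if (control) return control;
  // Lowest stream ID first, i.e. the order the streams were generated.
  return sendable.sort((a, b) => a.id - b.id)[0] ?? null;
}
```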

LPardue commented 3 years ago

The proposed ap parameter can also only apply to the mandatory parameters (because unknown extensions are ignored). So its usefulness is quite restricted.

But I'm also concerned about how it would interplay with extensions. For instance, I have a candidate, non-published, response extension parameter that would allow refining how a server sends chunks. Do I tell the client I am not applying things precisely as asked and risk it taking the wrong action? It seems I'd have an incentive to mask the details.

roger-on-github commented 3 years ago

Regarding H3: flow control is managed (or at least influenced) by the client. So it is possible that a sufficiently advanced client with a knowledge of the expected (approximate) size of the payloads could arrange things so that substantially all of the segment1 bytes were put onto the wire before any segment2 bytes, and (statistically) the client will receive all the segment1 bytes first.

(And in fact if that is not possible, then it is likely that H3 is not a suitable protocol for this kind of application, or any application where performance depends on strict delivery ordering.)

Regarding extensibility: would you prefer a design where the client simply requests explicit acknowledgement of the applied priority, whatever it ends up being? In other words, a request parameter that triggers a mandatory attachment of a priority header to the response.

LPardue commented 3 years ago

An H2 client is also in control of stream receive windows, so it could employ such a strategy too; think of H2 as a meta-transport in this respect. You'd probably be best served by conservative initial stream windows (which are common across all streams in H2 and H3) and then aggressive window growth for the stream of utmost importance. Knowledge of the resource size could help but might not even be necessary. The tradeoff is that you block the other streams and potentially starve the pipe if the first request is not available for some reason. But I totally get that video streaming has different demands for resource loading compared to the web. HoL-avoidance features might mess you up, while they really help other HTTP-powered workloads.

In my experience, one problem we have is that H2 and H3 settings are made quite generically. A browser that tried to adopt the above strategies might suffer regressions. However, clients can adopt different strategies for managing the pipe; for example, being clever with request ordering and batching (e.g. not making all requests concurrently).

kazuho commented 3 years ago

@roger-on-github Thank you for the concrete proposal.

@mnot : asking intermediaries to change their delivery behavior based on deep inspection of the response is not something I considered, no. Is that a recommended pattern for HTTP delivery these days? Do you know of any examples that have achieved wide adoption (i.e. by a broad range of CDNs)? It seems like a layer violation to me.

Would that concern evaporate if we used a dedicated URI (i.e. something like /.well-known/llhls; cf. [RFC 5785](https://tools.ietf.org/html/rfc5785)) to communicate the server capabilities?

I am not fully sure if using an end-to-end signal (an HTTP response) is the best solution, as each hop might have different capabilities, but I think that using a dedicated resource is no worse than using an HTTP response header as proposed in #1283. In fact, as prioritization is about choosing what to send when multiple responses compete on one connection, having one signal per connection is better IMO.

LPardue commented 3 years ago

And/or you could put it in the DNS

roger-on-github commented 3 years ago

I don't think that using a well-known URI scales very well. Content providers often offer the same content to both old and new clients, via both HTTP 1.1 and H2 (and maybe some day H3) using the same directory structure but different protocol implementations. It's probably not reasonable to ask a CDN to support extensible priorities on all three, or to perform directory magic to insert a dummy path element against protocols that support a feature. And if you add more than one such feature the complexity scales geometrically. That sounds much worse than an HTTP response header to me, from the point of view of someone trying to implement the whole system.

There are a few problems with "one signal per connection." First, it's more complex, because that signal has to encapsulate the entire priority tree for all active requests. Second, it would require new terminology not currently defined. And third, it would need to be refreshed each time a new request became active or an old request finished to remain accurate.

I will write up an alternate design based on the earlier feedback from @LPardue and share it soon.

LPardue commented 3 years ago

I think there might be some cross talk between concepts of

  1. "does the server support the priority scheme and some optional extension parameters, and is willing to be subjected to some additional scrutiny by clients"

  2. Is the server doing exactly what the client asked of it?

Concept 1) suits a .well-known or HTTPSVC-style service, because it decorates the HTTP connection.

Concept 2) requires a server to give up information about itself, with unclear ramifications. Active probing like this has problems. And a design that depends on HTTP request/response seems to be incompatible with endpoints that wish to only implement frame-based prioritisation, so it causes a problem for this I-D, which needs to support both.

kazuho commented 3 years ago

I think I like how @LPardue describes the difference between the two concepts.

And if the client is going to issue requests differently based on how the server behaves, I think it is better to provide server administrators a knob for opting in/out, because then the server administrators can use that knob to choose the client policy that works best.

As @LPardue points out, it is hard to make a good decision based on whether the server recognizes the priority request header field, because how servers actually prioritize responses depends on many, many factors.

@roger-on-github I think how well a well-known URI would scale depends on how you would design that metadata. To give an example, if we think that the server administrator might opt in to using LLHLS only for HTTP/3, then the well-known URI could be defined to contain a JSON array of ALPNs for which LLHLS should be used.
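A sketch of what the client-side check could look like (the /.well-known/llhls path and JSON shape are hypothetical, per the suggestion above):

```ts
// Fetch the hypothetical /.well-known/llhls document: a JSON array of ALPN
// identifiers (e.g. ["h3"]) for which the origin opts in to LLHLS behaviour.
async function llhlsEnabledFor(origin: string, alpn: string): Promise<boolean> {
  const res = await fetch(`${origin}/.well-known/llhls`);
  if (!res.ok) return false; // no document: treat the origin as opted out
  const alpns: string[] = await res.json();
  return alpns.includes(alpn);
}

// await llhlsEnabledFor("https://cdn.example", "h3");
```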

roger-on-github commented 3 years ago

@kazuho the server administrators generally do not have enough knowledge of the client behavior (which itself tends to change over time and between clients) to choose a "best" policy.

And to be clear: I don't anticipate defining a specific signal for LL-HLS. That would be a classic error of defining something in terms of what it's for instead of how it must work. At most, if I were to add prioritization rules to the LL-HLS spec I expect that I would add a provision that "servers using H3 MUST support extensible priorities with the following (possibly non-standard) extensions..." Extensible priorities are apparently what we've got for H3, so my intention is to build on that.

roger-on-github commented 3 years ago

Okay, I've written up an alternative proposal (which we can call A2): https://github.com/httpwg/http-extensions/compare/master...roger-on-github:add-ack

Essentially it replaces the applied response parameter with an 'acknowledge' request parameter, which mandates a response header containing whatever prioritization was applied (or none).
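A sketch of the intended exchange (the acknowledge parameter name comes from the A2 branch and may change):

```ts
// Hypothetical A2 exchange:
//
//   GET /segment1.mp4
//   Priority: u=1, acknowledge
//
//   HTTP/1.1 200 OK
//   Priority: u=1          <- server reports the prioritization it applied
//
const res = await fetch("https://example.com/segment1.mp4", {
  headers: { Priority: "u=1, acknowledge" },
});
const applied = res.headers.get("priority"); // null or empty => none applied
```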

It addresses the earlier concerns: the response header is only generated when the client asks for it, and the acknowledgment covers whatever parameters were actually applied, extensions included.

kazuho commented 3 years ago

@roger-on-github

the server administrators generally do not have enough knowledge of the client behavior (which itself tends to change over time and between clients) to choose a "best" policy.

I understand the sentiment, but I tend to think that it might be a more reliable signal for changing the request policy than using a signal indicating acknowledgement of a signal.

PS. Thank you for clarifying the intended use-case.

LPardue commented 3 years ago

I think that writing the requirement "MUST support priorities" is part of the problem. It clashes with the design principles of prioritisation and the nature of HTTP (which we're stuck with, good and bad).

To elaborate, adding a MUST requirement for a feature that is, by definition, optional for a server to implement is odd. It's made worse by the client having very limited ability to detect whether this condition holds. A JavaScript client in a browser is not going to be aware of how requests map to connections, so any bandwidth allocation measurements are going to be tricky.

A different way to write this would be to say something like (forgive my lack of details): an llhls server MUST provide the A,B,C multiplexing and bandwidth allocation characteristics because X,Y,Z. A client that detects these characteristics are unmet MUST/MAY leave/fallback etc. It is RECOMMENDED that the Extensible priority scheme is used together with parameters Foo, Bar. H2 connections MAY use the tree-based scheme. An HTTP server that is aware it is llhls enabled but unable to satisfy these conditions for any reason can respond with a header field "llhls: sorry".

I think that makes it clear for a server operator what the expectations are, and they can make the judgement call about whether their stack can be molded to fit them.

roger-on-github commented 3 years ago

@LPardue I get the limitations around MUST. I really do. It might be a good idea to define some kind of means test, with a hypothetical server model and perhaps an open-source test suite that can be run when qualifying or debugging the server/CDN, so that failure to perform is readily identified and the server can be flagged. Maybe that means test makes it into the spec.

But to return to the subject of this thread: at some level of optimization, a client needs to know whether a tool is available in order to employ that tool. Let's take the example from https://github.com/httpwg/http-extensions/pull/1283: response ordering. In a normally multiplexed environment like H3, there are optimizations that a client can do if it can trigger pipelined responses to be delivered serially, at least to an extent that is generally effective. But if it cannot then it has to avoid that request pattern, because the performance cost of the result would be actually worse than the prospective gain of the optimization.

LPardue commented 3 years ago

In a normally multiplexed environment like H3, there are optimizations that a client can do if it can trigger pipelined responses to be delivered serially

To be pedantic, QUIC offers no ordering guarantees across streams. A server cannot guarantee that you'll receive stream data in the order it was sent. Even if it were all delivered in the exact order, it's quite likely that the QUIC transport layer would not present that information to the application.

If you really need strict serialization or synchronization across streams then you'll have to build that capability. For instance, QPACK is stateful and requires synchronization across control streams, request streams and response streams. That is some heavy lifting, but the gains for compression have been deemed worth the complexity.

Alternatively you can run the client very conservatively, running requests serially or using flow control. But if you can't control how connections are used across purposes, then it is hard to tailor each to its needs. A purpose-built client can do that, a browser-based application cannot.

But the approach of the client advertising a "wide-open connection" and demanding that a general-purpose HTTP/2 or HTTP/3 server act a particular way is not realistically achievable. Servers have a duty of care to their operators, who have a different set of requirements from clients. There's a power balance, and a client with too much power can easily cause serious problems. This isn't theoretical: Jonathan Looney at Netflix found a bunch of problems last year and the community worked hard to get these fixed. I'll quote them here because they relate to this discussion:

  • CVE-2019-9511 “Data Dribble”: The attacker requests a large amount of data from a specified resource over multiple streams. They manipulate window size and stream priority to force the server to queue the data in 1-byte chunks. Depending on how efficiently this data is queued, this can consume excess CPU, memory, or both, potentially leading to a denial of service.
  • CVE-2019-9512 “Ping Flood”: The attacker sends continual pings to an HTTP/2 peer, causing the peer to build an internal queue of responses. Depending on how efficiently this data is queued, this can consume excess CPU, memory, or both, potentially leading to a denial of service.
  • CVE-2019-9513 “Resource Loop”: The attacker creates multiple request streams and continually shuffles the priority of the streams in a way that causes substantial churn to the priority tree. This can consume excess CPU, potentially leading to a denial of service.
  • CVE-2019-9514 “Reset Flood”: The attacker opens a number of streams and sends an invalid request over each stream that should solicit a stream of RST_STREAM frames from the peer. Depending on how the peer queues the RST_STREAM frames, this can consume excess memory, CPU, or both, potentially leading to a denial of service.
  • CVE-2019-9515 “Settings Flood”: The attacker sends a stream of SETTINGS frames to the peer. Since the RFC requires that the peer reply with one acknowledgement per SETTINGS frame, an empty SETTINGS frame is almost equivalent in behavior to a ping. Depending on how efficiently this data is queued, this can consume excess CPU, memory, or both, potentially leading to a denial of service.
  • CVE-2019-9516 “0-Length Headers Leak”: The attacker sends a stream of headers with a 0-length header name and 0-length header value, optionally Huffman encoded into 1-byte or greater headers. Some implementations allocate memory for these headers and keep the allocation alive until the session dies. This can consume excess memory, potentially leading to a denial of service.
  • CVE-2019-9517 “Internal Data Buffering”: The attacker opens the HTTP/2 window so the peer can send without constraint; however, they leave the TCP window closed so the peer cannot actually write (many of) the bytes on the wire. The attacker then sends a stream of requests for a large response object. Depending on how the servers queue the responses, this can consume excess memory, CPU, or both, potentially leading to a denial of service.
  • CVE-2019-9518 “Empty Frames Flood”: The attacker sends a stream of frames with an empty payload and without the end-of-stream flag. These frames can be DATA, HEADERS, CONTINUATION and/or PUSH_PROMISE. The peer spends time processing each frame disproportionate to attack bandwidth. This can consume excess CPU, potentially leading to a denial of service. (Discovered by Piotr Sikora of Google)

I can't speak for all HTTP/2 server implementations, but the ones I am familiar with go to great efforts to protect their operators, often without their knowledge. Asking an L7 web server application to promise something it cannot deliver is defective.

LPardue commented 3 years ago

I'd also like to present another parallel situation in relation to video.

https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement/canPlayType

canPlayType() reports how likely it is that the current browser will be able to play media of a given MIME type.

Returns DOMString

probably
    Media of the type indicated by the mediaType parameter is probably playable on this device.
maybe
    Not enough information is available to determine for sure whether or not the media will play until playback is actually attempted.
"" (empty string)
    Media of the given type definitely can't be played on the current device. 

If the local system can't even assure itself of capability, then I'm pessimistic about how remote endpoints would treat signals about priority.

wilaw commented 3 years ago

@LPardue - canPlayType is actually a counter-example that underlines the importance of trusted signals. The vagueness of its replies means that modern MSE development is moving in the direction of utilizing https://developer.mozilla.org/en-US/docs/Web/API/Media_Capabilities_API as a replacement. This API returns explicit boolean responses as to whether a given MIME type can be decoded, whether it can be done smoothly, and whether it can be done in a power-efficient manner. These clear signals help players make the correct decision about content selection.
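For example, a sketch of a Media Capabilities query (codec string and numbers illustrative):

```ts
// Ask the Media Capabilities API for an explicit decode verdict, in contrast
// to canPlayType's "probably"/"maybe" answers.
const info = await navigator.mediaCapabilities.decodingInfo({
  type: "media-source",
  video: {
    contentType: 'video/mp4; codecs="avc1.640028"',
    width: 1920,
    height: 1080,
    bitrate: 6_000_000, // bits per second
    framerate: 30,
  },
});

console.log(info.supported, info.smooth, info.powerEfficient); // booleans
```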

LPardue commented 3 years ago

That's a good point Will, thanks. I'm not disputing that a certain kind of client would find a trusted signal useful. I do think that it is practically very difficult for an HTTP server to say anything but probably or maybe; it can't even decisively say no for some types of priority signal.

@wilaw do you think there is interest and a path to something like a "Transport_Capabilities_API" that might rely on something like .well-known mentioned above?

wilaw commented 3 years ago

@LPardue - a hypothetical Transport_Capabilities_API is moving the problem to the browser, which is currently in no better position to deterministically confirm whether a server will support a prioritization request. The HTTP message from a client to a server is syntactically a 'request' and not a 'command'. Retro-fitting conformance via /.well-known/ or header schemes seems awkward.

I love the fact that I can serve HLS to millions of people using the same H1, H2, H3 protocols that deliver other binary blobs for other applications. At the same time I am very sympathetic to what Roger is asking for on this thread. I think the solution may lie in the near-term next-gen protocols, of which WebTransport (QuicTransport and Http3Transport) is the most exciting. Since both are actively being developed at IETF, there is perhaps still time to add to the core design the concept of prioritization confirmation or prioritization enforcement?

LPardue commented 3 years ago

Thanks for the perspective.

I too am sympathetic.

But QUIC itself has been submitted to the IESG, and says very little about prioritisation. That ship has sailed. HTTP/3 has delegated its prioritisation to this draft, which after healthy discussion has pulled up anchor and is getting ready to depart.

What's being asked is for a late stage reconsideration of an ethos that builds on years of experience with H2; that signals are a hint and a client can't expect the server to act in any particularly strict way. For the ~2 years we've been discussing binning the H2 tree, the one thing that people seemed happy to keep is that ethos.

As an implementer of a QUIC stack and H3 extensible priorities, I don't have confidence that I can state with 100% truth that I will follow the client's instruction. So I'm best off saying "yes I did" and letting the client deal with the outcome. Which to me seems no better than saying nothing and letting the client deal with the outcome.

LPardue commented 3 years ago

I can't speak for what WebTransport will do. I suspect they may have similar challenges when it comes to signalling priorities and scheduling, but perhaps the green field lets them design how they wish. The downside of course is that it's not conventional HTTP. We've seen some of this with using WebSockets to decorate or assist HAS: it's possible, but doesn't benefit from the huge scaling afforded by HTTP.