ietf-wg-ohai / draft-ohai-chunked-ohttp


incremental forwarding of HTTP messages is not a guaranteed property of HTTP semantics #19

Open kazuho opened 5 months ago

kazuho commented 5 months ago

Chunked Oblivious HTTP Messages relies on incremental delivery of HTTP requests: as the client sends chunks of an HTTP POST request body, the server is expected to respond to them by sending chunks of the HTTP response body.

This approach contradicts what HTTP semantics states and has interoperability issues with proxies that buffer the entire request before starting to forward it to the backend servers.

Specifically, RFC 9110 section 7.6 states:

An HTTP message can be parsed as a stream for incremental processing or forwarding downstream. However, senders and recipients cannot rely on incremental delivery of partial messages, since some implementations will buffer or delay message forwarding for the sake of network efficiency, security checks, or content transformations.

When these buffering intermediaries are involved, Chunked Oblivious HTTP Messages will not work, and clients will see timeouts.

Considering that what we are developing is an application of HTTP, it would make sense to design the new protocol in a way that does not add restrictions to HTTP semantics.

One way of moving forward would be to use extended CONNECT instead of HTTP POST for opening the bi-directional stream on which we exchange Chunked OHTTP Messages.
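For illustration, such a request could look roughly like the sketch below, using the extended CONNECT mechanism of RFC 8441 (the `:protocol` token shown is hypothetical, not a registered value):

```
HEADERS frame (client-initiated stream)
  :method    = CONNECT
  :protocol  = ohttp-chunked      (hypothetical token)
  :scheme    = https
  :authority = relay.example
  :path      = /ohttp

(followed by DATA frames carrying Chunked OHTTP messages in both
directions, as with WebSockets over H2/H3 or connect-udp)
```

Because intermediaries already treat extended CONNECT streams as long-lived bidirectional tunnels, no per-content-type configuration would be needed.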

martinthomson commented 5 months ago

Another option is to acknowledge that intermediaries with these properties that are involved in a particular deployment can be identified and fixed. I understand that Tommy discovered a few like this after implementing and deploying this.

The other thing is that this doesn't necessarily invalidate the design. The goal is to enable incremental processing, which can be achieved with buffering. What is not possible is things like 100-continue (which remains a misfeature for the reasons you describe, at least in my mind).

kazuho commented 5 months ago

@martinthomson

Another option is to acknowledge that intermediaries with these properties that are involved in a particular deployment can be identified and fixed. I understand that Tommy discovered a few like this after implementing and deploying this.

I'm not sure if such an approach would be possible or beneficial for the ecosystem.

HTTP/2 and HTTP/3 allow clients to issue a large number of requests at once (e.g., 100). But servers often do not process all requests in parallel. It is often the case that the number of requests processed concurrently is rather small (e.g., 10), while the other requests are "buffered." By doing so, the backend applications are protected from DoS attacks, while the bandwidth of the connection is fully utilized.

To paraphrase, the ability to buffer requests is a key pillar of HTTP/2 and HTTP/3. It is not a misfeature.

CONNECT and extended CONNECT are exceptions. That is fine, because every intermediary would recognize that they are exceptions. When receiving too many requests at once, they have the option to respond with 429 Too Many Requests.

But POST does not work like that. Proxies are designed to be transparent to the content-type being used.

The problem with the current design is that it requires proxies to have a list of MIME types that have to be processed like CONNECT. I do not see a reason to choose such an approach when we can use extended CONNECT, which provides the necessary semantics.

martinthomson commented 5 months ago

I'm saying 100-continue is the misfeature, not the buffering.

An arrangement like you describe is beneficial, but it might not be always the right answer, depending on the backend capabilities and the application. OHTTP relays do provide some DoS mitigation, but maybe limiting request volume is not the axis along which the relay needs to operate its protections.

I'm not suggesting that this is a per-content-type thing, but a per-resource or really per-service thing. If anything at all. The design works well enough with buffering at an intermediary (modulo 100-continue and other similar arrangements). I'd go further and say that there are privacy benefits to buffering.

kazuho commented 5 months ago

I'm not suggesting that this is a per-content-type thing, but a per-resource or really per-service thing. If anything at all. The design works well enough with buffering at an intermediary (modulo 100-continue and other similar arrangements).

Firstly, the design does not work well with buffering, at least with intermediaries that buffer the whole request, or a non-negligible number of bytes, before starting to forward the request to the backend. As stated, such behavior is common among intermediaries.

Secondly, this is the first time I've heard HTTP clients complaining about the buffering behavior. That makes me believe that Chunked OHTTP is asking for an exception, rather than turning on a per-resource or per-service config.

kazuho commented 5 months ago

If I may ask, what is the benefit of choosing HTTP POST over extended CONNECT or websockets, when use of HTTP POST adds restrictions to HTTP semantics and has actual interoperability issues? In https://github.com/ietf-wg-ohai/draft-ohai-chunked-ohttp/issues/19#issuecomment-2024396101 you state you've already seen them.

martinthomson commented 5 months ago

In talking with others (at your employer, even), POST is considerably less complex to deploy than CONNECT. At the point you engage CONNECT, then MASQUE starts looking a whole lot better. It's much more expensive at the ends, but still.

Secondly, this is the first time I've heard HTTP clients complaining about the buffering behavior.

That's not coming from me. Buffering at intermediaries is mostly not a problem from my perspective.

kazuho commented 5 months ago

@martinthomson

In talking with others (at your employer, even), POST is considerably less complex to deploy than CONNECT. At the point you engage CONNECT, then MASQUE starts looking a whole lot better. It's much more expensive at the ends, but still.

That sounds like we made the wrong choice with MASQUE (and with other extensions being developed on top of extended CONNECT; e.g., connect-tcp), because we could equally argue that MASQUE could have been developed on top of HTTP POST, with a per-resource or per-service config in the intermediaries to not buffer the bytes for too long.

Are you actually suggesting that, or is there a reason to believe the situation is different with Chunked OHTTP?

PS. Re my employer, I think our stance is that POST is easier in the short term, but the long term consequences could be different (though the answer might depend on who you asked as well as when).

kazuho commented 5 months ago

FWIW, RFC 9205 section 3.1 states:

This split between generic and application-specific semantics allows an HTTP message to be handled by common software (e.g., HTTP servers, intermediaries, client implementations, and caches) without requiring those implementations to understand the application in use. It also allows people to leverage their knowledge of HTTP semantics without needing specialised knowledge of a particular application.

Therefore, applications that use HTTP MUST NOT redefine, refine, or overlay the semantics of generic protocol elements such as methods, status codes, or existing header fields. Instead, they should focus their specifications on protocol elements that are specific to that application -- namely, their HTTP resources.

I wonder if the current state of Chunked OHTTP, requiring intermediaries to start forwarding chunks of requests before receiving the entire request, violates this MUST (cc @mnot).

tfpauly commented 5 months ago

My initial inclination is to interpret this how @martinthomson does — essentially, that while POST clients "cannot rely on incremental delivery" in general, specific services can indeed provide incremental delivery where it is beneficial and desirable. An OHTTP relay that wants to support chunked OHTTP thus has an incentive to allow incremental delivery. It isn't violating HTTP semantics by buffering, but it isn't being very helpful to its clients. Since OHTTP relays are generally set up for specific services and relationships, this seems like a tractable problem.

From my reading, the text about incremental delivery isn't normative; deployments can choose to do incremental delivery or not, depending on their situation.

I don't think I'd say that chunked OHTTP as a protocol requires intermediaries to do incremental forwarding; instead, it's the clients of relays that will complain when they aren't getting their desired behavior.

mnot commented 5 months ago

Note that buffering is mostly done by intermediaries doing things like virus scanning, often on-box.

kazuho commented 5 months ago

@mnot

Note that buffering is mostly done by intermediaries doing things like virus scanning, often on-box.

Or intermediaries that try to reduce concurrency to nearby back-end servers, which I think is fairly common.

Looking back at the history, one of the reasons people have deployed Nginx in front of preforking application servers (incl. Apache HTTP Server + mod_php) is to use Nginx as a buffer for reducing concurrency. Preforking servers cannot handle as many connections as an event-driven server can. Therefore, it makes (or made) sense to let Nginx buffer highly concurrent but slowly arriving requests until each request is received in its entirety. Once a request is received completely, Nginx opens a connection to the backend and starts forwarding it.

Proxies do provide knobs for changing this behavior. In the case of Nginx, the knob is proxy_request_buffering, but the default is on, meaning that Nginx will try to buffer the entire request.
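As a concrete sketch of that knob, a minimal Nginx configuration fragment might look like this (the location path and upstream name are made up for illustration):

```nginx
location /ohttp-relay {
    # Default is "proxy_request_buffering on": Nginx reads the entire
    # request body before opening a connection to the backend.
    # Turning it off makes Nginx forward body bytes as they arrive,
    # which is what Chunked OHTTP would need.
    proxy_request_buffering off;
    proxy_http_version 1.1;
    proxy_pass http://backend.internal.example;
}
```

So incremental forwarding is achievable, but only via explicit, per-deployment configuration, not as default behavior.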

@tfpauly

My initial inclination is to interpret this how @martinthomson does — essentially, that while POST clients "cannot rely on incremental delivery" in general, specific services can indeed provide incremental delivery where it is beneficial and desirable.

I'm not sure I agree with that interpretation, considering that the MUST NOT in RFC 9205 section 3.1 follows this sentence: "This split between generic and application-specific semantics allows an HTTP message to be handled by common software (e.g., HTTP servers, intermediaries, client implementations, and caches) without requiring those implementations to understand the application in use" (emphasis mine).

But even if we ignore RFC 9205, I'm still stuck with this question.

To me, all the arguments for using HTTP POST for Chunked OHTTP seem to be equally applicable to WebSockets over H2/H3, MASQUE, or connect-tcp. The arguments can be (or could have been) used to say that MASQUE or connect-tcp should be built on top of HTTP POST, because the servers / intermediaries could be configured to start forwarding bytes without buffering on a per-resource basis.

But we chose to use extended CONNECT. Is there a reason to believe that POST is a better choice for Chunked OHTTP while extended CONNECT is (was) a better choice for WebSockets over H2/H3, MASQUE, and connect-udp?