Closed Maaarcocr closed 2 years ago
How should we handle HTTP/2 -> HTTP/1 proxy. Should this header be added?
This may be wrong, but given that you have to wait all trailers before being able to decide what to put in the Trailers
header, you may as well just add all trailers in the headers, which may be easier, as you do not have to think about chunk encoding and so on.
That would break streaming?
I looked at other proxies, like envoy grpc proxy, and they don't really support streaming all together (I guess partially because of trailers).
I'm fine with something like your proposed change provided it doesn't go against any RFCs, I agree we would like to support gRPC. I need to review this in detail to make sure this is the right behaviour change. I think it's probably acceptable.
This is great @Maaarcocr, and should fix some issues with GRPC. Might be worthwhile adding a test to document the behavior.
Also, I can't really tell what's up with the CI failures but it seems like other branches are failing as well, so it might not be related to this change in particular.
I'm trying to find a basis or reason for allowing this.
https://datatracker.ietf.org/doc/html/rfc7540 page 58 shows an example of an HTTP/1 request upgraded to an HTTP/2 request which includes a direct 1:1 usage of trailer
header as per HTTP/1.
HTTP/1.1 100 Continue HEADERS
Extension-Field: bar ==> - END_STREAM
+ END_HEADERS
:status = 100
extension-field = bar
HTTP/1.1 200 OK HEADERS
Content-Type: image/jpeg ==> - END_STREAM
Transfer-Encoding: chunked + END_HEADERS
Trailer: Foo :status = 200
content-length = 123
123 content-type = image/jpeg
{binary data} trailer = Foo
0
Foo: bar DATA
- END_STREAM
{binary data}
HEADERS
+ END_STREAM
+ END_HEADERS
foo = bar
This seems to imply at least it's compatible, but not that it's required.
On page 51, it explicitly states:
4. optionally, one HEADERS frame, followed by zero or more
CONTINUATION frames containing the trailer-part, if present (see
[[RFC7230], Section 4.1.2](https://datatracker.ietf.org/doc/html/rfc7230#section-4.1.2)).
The only advice I can find proximal to that section is:
[4.4](https://datatracker.ietf.org/doc/html/rfc7230#section-4.4). Trailer
When a message includes a message body encoded with the chunked
transfer coding and the sender desires to send metadata in the form
of trailer fields at the end of the message, the sender SHOULD
generate a Trailer header field before the message body to indicate
which fields will be present in the trailers. This allows the
recipient to prepare for receipt of that metadata before it starts
processing the body, which is useful if the message is being streamed
and the recipient wishes to confirm an integrity check on the fly.
Trailer = 1#field-name
I could not find anywhere that states that trailer
in the header MUST be present, only SHOULD.
Does this mean semantically we should accept the same logic in HTTP/1?
It seems like it's a SHOULD which is what I've also seen around. I was unsure about HTTP1.1 but I guess even if chunk encoded you need the trailer header only to do some integrity checks, not really for any other reason.
A better solution, for both HTTP2 and HTTP1 would be to let the user decide what to do when the trailer header is present and the trailers don't match it (also maybe what to do if the header is missing all together).
I do think that not requiring it is the right thing to do. We could approach it this way:
1) first, just don't require it 2) possibly add levers and whistles that the user of the library can use to perform integrity checks if the header is present 3) possibly allow users to make it such the library keeps the current behaviour if the trailer header is missing
Trailers as a semantic are quite tricky to fit within the current model for HTTP request-response and need explicit support in the headers semantics. So I need to make sure it's going to work as expected.
Ok, that makes sense. Do you reckon we could merge the http2 changes sooner than the whole overall changes?
Mostly asking because, internally, we do rely on async-http and this change would unlock us. I know it's not really something that concerns you, but it would help us plan a bit if we knew when we could expect the http2 changes to get merged.
(The work we're doing internally may also get open sourced and it's related to gRPC, but all of that is TBD)
Does gRPC optionally use trailers as a header or not at all?
afaik not at all. Maybe some implementations may, but the ones that us the shared C implementation (Ruby, Python, C#) don't. I'm pretty sure the go one doesn't.
Does gRPC use any of the same semantics for headers? like, accept
, content-type
etc. Do they have the same parsing rules as HTTP?
I will try to get this merged by the weekend. There are a couple of other changes that need to be made to Protocol::HTTP::Headers
.
yes grpc is built on top of HTTP2, so when it comes to it, in order to talk to it you need a fairly basic HTTP2 client that sends the right headers and packs the body message the right way.
So yes, it uses fairly standard http headers. It also has its own trailers (like grpc-status), which are still just normal headers but only have their own semantics within grpc connections.
While HTTP requires:
Unless the request includes a [TE](https://httpwg.org/specs/rfc7230.html#header.te) header field indicating "trailers" is acceptable, as described in [Section 4.3](https://httpwg.org/specs/rfc7230.html#header.te), a server SHOULD NOT generate trailer fields that it believes are necessary for the user agent to receive. Without a TE containing "trailers", the server ought to assume that the trailer fields might be silently discarded along the path to the user agent. This requirement allows intermediaries to forward a de-chunked message to an HTTP/1.0 recipient without buffering the entire response.
It seems like gRPC does not require any such negotiations. The semantics of headers and trailers are very similar but different enough that it's tricky to support both semantics in one package.
HTTP semantics "SHOULD":
While the lower level protocols support this, it's hard to know what the design trade offs are, i.e. sacrifice strict HTTP semantics to allow gRPC to live in the same code base. async-http
is for HTTP semantics.
My understanding is gRPC uses the same protocol as HTTP/2 but it's not sharing the same semantics. Because it has trailers with none of the semantics required for HTTP to work correctly and this includes proxies and the ability to correctly proxy requests with trailers, from HTTP/2 to HTTP/1.
I'm hesitant to adopt a change here which violates "SHOULD" recommendations in the RFCs. Of course, async-http
is not perfect and doesn't follow te:trailers
but we should add that, and it will be another problem for gRPC.
As such, I think I'll cut a new gem, async-grpc
which implements the grpc client and server semantics. Unless you have some other idea.
Basically it boils down to, is gRPC really compatible with HTTP semantics? Or is it just using the same protocol/wire format?
I've asked this question here: https://groups.google.com/g/grpc-io/c/Mw-HeAV_ZZc
Another angle is, do we relax HTTP/2 protocol checks because it's acceptable in the RFCs? I need to know if HTTP/2 actually relaxes the semantic requirements around trailer
header. From my initial read of RFC7540, it's not really clear.
@mnot if you have any input on this it would be most appreciated.
https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
This document serves as a detailed description for an implementation of gRPC carried over HTTP2 framing. It assumes familiarity with the HTTP2 specification.
"over HTTP2 framing"
Trailer handling is one of the things that's changed in the newest HTTP specs (about to be published Any Day Now) -- see here. If you have any questions, glad to try to help.
I don't fully agree that grpc is changing HTTP2 semantics, as grpc is pretty much built on top of it and it just sees it as a transport layer.
Some gRPC server implementations error back to you if you do not send the TE=trailers header. I've learnt this the hard way while trying to reimplement a grpc client, precisely the one that share the C implementation, so they are compliant with the HTTP spec. I think the go implementation actually allows you to either set or not set TE=trailers and it will do the right thing.
There was a medium article (that now got deleted for some reason) where the author described how to send grpc requests using cURL and its http2 client.
All in all, I think grpc doesn't violate the HTTP2 spec and it's actually one of the few cases where trailers are used.
Okay so the expectation is to use te: trailers
but not the trailer header itself which is “SHOULD” in the relevant RFCs?
Does:
A sender that intends to generate one or more trailer fields in a message should generate a Trailer header field in the header section of that message to indicate which fields might be present in the trailers.
Apply to gRPC or not?
A well formed HTTP response "SHOULD" generate a "trailer" header according to RFCs. I guess it won't hurt gRPC implementations and makes it compatible with HTTP semantics.
The question is:
How should the client behave when receiving an HTTP/2 response which includes trailers, but does not include a trailer
header?
Currently, an HTTP/2 response which:
trailer
header, andBy this PR only, a HTTP/2 response which:
trailer
header, andasync-http
because it streams responses and was not aware that there would be trailers at the time the stream started, and thus may not have made the correct choice about how to proxy, e.g. HTTP/1.1 might use fixed content-length but with trailers it should use chunked encoding. In other words, trailers are treated like headers, and won't be sent because by the time they are received, the headers are already sent.The solution would be to assume that HTTP/2 can always generate trailers, but there is some extra overhead involved in this. It will force HTTP/2 -> HTTP/1 to always use chunked encoding even if it was otherwise unnecessary.
Basically, to be completely compatible with HTTP semantics, gRPC should specify:
trailer: grpc-status, grpc-message
Looking at https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#responses I wonder if that was implied by
Trailers → Status [Status-Message] *Custom-Metadata
But I'm assuming servers don't implement this?
@ejona86 do you have any feedback/insight into this spec/design?
This was an interesting interpretation of the relevant docs: https://github.com/httpwg/http-core/issues/18#issuecomment-453593658
I understand your concerns about grpc possibly skewing away from proper HTTP semantics and just adhering to its framing.
At the same time, would it make sense to maintain 2 http implementations whose only difference is that one errors out when the "Trailer" header is not sent and the other doesn't?
I have a working demo of using async-http as a grpc client and the only change I had to make for it to work was to allow trailers without the "Trailer" header.
After sleeping on this, and reflecting on the relevant RFCs, I believe that the te: trailers
and trailer
header is completely useless in practice given the state of how things are being implemented. The word "SHOULD" is clearly not strong enough and means that any logic depending on this header cannot be relied on in general for the correct functioning of HTTP client/server.
It's hard for HTTP/1 since there are clear trade offs (fixed content-length
body vs chunked encoding).
Therefore, I think we can accept this PR and we just have to accept the semantic complexities that come from it.
Yes, this PR is obviously the right thing to do. A requirement on a sender (especially a SHOULD) does not place any onus on a recipient to check that the sender is doing the right thing -- it's encouraging good behaviour (hence, SHOULD). If it were necessary for the recipient to check it, that would have been said explicitly in the spec.
@mnot thanks for your feedback.
this PR is obviously the right thing to do
Sorry to nitpick, but it really wasn't obvious to me hahah.
How does a client/server in this case meaningfully do anything w.r.t. a SHOULD requirement? Emit warnings? How about gRPC in this case where it's essentially ignoring the SHOULD on a systematic basis. If standard X says SHOULD and standard Y building on X ignores that, it feels like a mistake to me.
How does a client/server in this case meaningfully do anything w.r.t. a SHOULD requirement? Emit warnings?
A recipient doesn't need to do anything. This requirement is there to encourage senders to make the information available, nothing more.
How about gRPC in this case where it's essentially ignoring the SHOULD on a systematic basis. If standard X says SHOULD and standard Y building on X ignores that, it feels like a mistake to me.
gRPC isn't a standard, and doesn't use HTTP particularly well. That's on them.
Okay, I made updated releases of all the relevant gems with this new behaviour.
Fixes https://github.com/socketry/async-http/issues/91