tempesta-tech / tempesta

All-in-one solution for high performance web content delivery and advanced protection against DDoS and web attacks
https://tempesta-tech.com/
GNU General Public License v2.0
613 stars 103 forks source link

garbage after response with trailer and chunked body. #1902

Open RomanBelozerov opened 1 year ago

RomanBelozerov commented 1 year ago

Scope

For response:

HTTP/1.1 200 OK
Content-type: text/html
Last-Modified: Wed, 14 Jun 2023 10:18:29 GMT
Date: Wed, 14 Jun 2023 10:18:29 GMT
Server: Deproxy Server
Transfer-Encoding: chunked
Trailer: X-Token

10
abcdefghijklmnop
10
qrstuvwxyzABCDEF
10
GHIJKLMNOPQRSTUV
4
WXYZ
0
X-Token: value

Tempesta forwarded h2 response:

:status: 200
content-type: text/html
last-modified: Wed, 14 Jun 2023 10:19:11 GMT
date: Wed, 14 Jun 2023 10:19:11 GMT
trailer: X-Token
x-token: value
via: 2.0 tempesta_fw (Tempesta FW pre-0.7.0)
content-length: 52
server: Tempesta FW/pre-0.7.0

abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

and garbage as separate TCP frame:

b'Token: value\r\n\r\n'

Testing

encoding.test_encoding.TestH2ChunkedWithTrailer

krizhanovsky commented 11 months ago

We won't be able to move a trailing header in front of the HTTP message in streaming mode (#498). Probably we'll need just to place a HEADERS frame around the HTTP/1 trailer header instead of moving it to the first (main) HEADERS frame.

Depends on #498

kingluo commented 5 months ago

Not related to this issue directly, but why do we merge the trailer headers into the first HEADER frame? According to the HTTP/2 RFC, the HTTP/1.1 trailer headers should be converted into a sperate HEADER frame (the last HEADER frame): https://datatracker.ietf.org/doc/html/rfc9113#section-8.8.5

krizhanovsky commented 5 months ago

As discussed we should place the trailer header in the trailing HTTP headers frame after the body. The same functionality in streaming (#498) and current operation.

kingluo commented 5 months ago

The issue source was located. When we contruct frames for HTTP/2 response and reuse the same skbs from the backend, we forget to adjust the skb size according to the trailer headers (if exist) size, then it come with extra partial HTTP/1 response to the client. Then, the client will send a GOAWAY to temepesta due to wrong frame size, and tempesta sends back the TCP RESET. (Strangely, sometimes the packets are swalloned at somewhere so that the tcpdump cannot capture them.)

https://github.com/tempesta-tech/tempesta/blob/7dc50e1da851ccf6606cbfdfd5cca4abf481e116/fw/http_msg.c#L1045C1-L1050C39

kingluo commented 5 months ago

As discussed we should place the trailer header in the trailing HTTP headers frame after the body. The same functionality in streaming (#498) and current operation.

Yes, maybe we should create another PR in future to fix this compatibility issue? After all, very little HTTP client tools care about the trailer header transformation, even in grpc case, the backend side is mostly HTTP/2, which is not that cases in tempesta.

krizhanovsky commented 5 months ago

As discussed we should place the trailer header in the trailing HTTP headers frame after the body. The same functionality in streaming (#498) and current operation.

Yes, maybe we should create another PR in future to fix this compatibility issue? After all, very little HTTP client tools care about the trailer header transformation, even in grpc case, the backend side is mostly HTTP/2, which is not that cases in tempesta.

I think this is nothing wrong to always put trailer headers to a separate HEADERS frame after response body, so probably we don't need to address this even in future. Meantime we have have a tasks for upstream HTTP/2 https://github.com/tempesta-tech/tempesta/issues/1125 , which depends on upstream TLS though.

kingluo commented 5 months ago

I think this is nothing wrong to always put trailer headers to a separate HEADERS frame after response body, so probably we don't need to address this even in future.

Yes, but now they're mixed in the first HEADER frame, so should I put them into the last seperate HEADER frame? In the same PR? I mean, this PR is just a bugfix, and these two problems are orthogonal.

krizhanovsky commented 5 months ago

I think we still need this for #498, which is also in this milestone, so probably it can be done in two PRs, but it doesn't make sense to postpone one of the PRs. I.e. let's do everything now, but in 2 PRs.

kingluo commented 5 months ago

I think we still need this for #498, which is also in this milestone, so probably it can be done in two PRs, but it doesn't make sense to postpone one of the PRs. I.e. let's do everything now, but in 2 PRs.

okay. I'll handle it soon later.

krizhanovsky commented 4 months ago

Minutes from the today call, @kingluo @const-t @EvgeniiMekhanik please comment :

  1. 2095 combines 2 fixes: the problem with skb boundaries producing the garbage in this bug report and the trailer headers placement in the cache. #2095 should be reworked to fix only the first problem with skb.

  2. trailer headers must remain at the end of a forwarded HTTP response, for both HTTP/1 and HTTP/2, to simplify further work on #498 and to solve the problem with not removed trailer: X-Token
  3. HTTP/1 requires chunked transfer encoding for trailer headers, so we need to wrap the whole response body into one chunk of full body size (this should be updated with #498 - need to update the task by acknowledging this design). Make sure we do not set Content-Length for such responses!
  4. It seems we should store trailers separately in cache, also after body, maybe just adding a references from the main descriptor of a cache entry. Probably this is the most topic to be discussed.
  5. With streaming mode in #498 it's unclear how headers mangling could work with trailers, so let's just update the wiki, that the directives do not affect trailers.

Points (2)-(4) should be implemented in a new PR, which will actually close the issues. Please also update appropriate tests having that we'll have Transfers-Encoding: chunked instead of Content-Length for some responses.

kingluo commented 4 months ago

We should reserve the fact of fixed-length or chunked for the response from the backend, no matter whether the cache is enabled (if the cache is enabled, then this fact, with the trailer is saved as part of the response of course). So, when it turns to forward the response (no matter from the backend or the cache), it replays the format, no matter whether the downstream/client is HTTP/1 or HTTP/2 (of course, HTTP/2 is chunked only, so for HTTP/2, both formats are converted to one format naturally).

In Nginx, the rules above are true and even more restricted, the proxy_pass (corresponding to our backend support) does not support trailers forwarding, it will strip the trailers, and even, the trailer: xxx is not removed too. Instead, it only supports this feature in grpc_pass, because in reality, only grpc programs return status via trailer headers. Hence, it requires the proxy to be capable of forwarding them. But for HTTP/1, almost all modern browsers do not parse the trailers, i.e. no practical HTTP/1-based apps use trailers (no wonder it's ignored by Nginx).

krizhanovsky commented 3 months ago

The skb part was done in #2095, but there is still part about leaving the trailer header as trailer and encode it in HEADERS frame after DATA frames. Probably we still need to delete Trailer: X-Token from headers.