envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.13k stars 4.82k forks source link

http: RFC 2616 (from 1999) allows transfer-encoding:identity with content-length but Envoy rejects for http 1.1 #26652

Closed jmarantz closed 1 year ago

jmarantz commented 1 year ago

Specifying Transfer-Encoding:identity with Content-Length appears to be valid in HTTP RFC.

Specifying Transfer-Encoding:chunked with Content-Length is explicitly invalid in HTTP RFC, but proper handling of it is actually specified in the RFC anyway. I think https://www.rfc-editor.org/rfc/rfc2616#section-4.4 is authoritative, at the end of the section (just before 4.5).

Messages MUST NOT include both a Content-Length header field and a non-identity transfer-coding. If the message does include a non- identity transfer-coding, the Content-Length MUST be ignored.

Originally posted by @jmarantz in https://github.com/envoyproxy/envoy/issues/14004#issuecomment-1502486583

jmarantz commented 1 year ago

@alyssawilk @birenroy @RyanTheOptimist @mattklein123

Any other protocol experts I should bring in to weigh on whether my interpretation is correct and what should be done in Envoy?

I haven't done a unit-test repro yet; this arose from a product issue and I wanted to capture it having just looked it up.

jmarantz commented 1 year ago

@yanavlasov

geojaz commented 1 year ago

thanks @jmarantz this was our issue... following along!

jmarantz commented 1 year ago

Bug appears to be here: https://github.com/envoyproxy/envoy/blob/0dbb3dcb335fc08ecf5544840602c9e2465f6d0b/source/common/http/http1/codec_impl.cc#L897

It may be in http2 also; maybe http3?

RyanTheOptimist commented 1 year ago

Curious edge case. It. looks like RFC 7230 updated the language.

A sender MUST NOT send a Content-Length header field in any message that contains a Transfer-Encoding header field.

https://www.rfc-editor.org/rfc/rfc7230#section-3.3.2

That would suggest to me that as of 7230, at least, Transfer-Encoding: identity + Content-length is prohibited, and Envoy can rightly reject it.

As for HTTP/2 (and HTTP/3, I believe) the spec prohibits Transfer-Encoding:

An endpoint MUST NOT generate an HTTP/2 message containing connection-specific header fields. This includes the Connection header field and those listed as having connection-specific semantics in Section 7.6.1 of [HTTP] (that is, Proxy-Connection, Keep-Alive, Transfer-Encoding, and Upgrade). Any message containing connection-specific header fields MUST be treated as malformed (Section 8.1.1).

https://datatracker.ietf.org/doc/html/rfc9113#section-8.2.2

What do you think?

jmarantz commented 1 year ago

Thanks Ryan! In particular that one says:

The "identity" transfer coding token has been removed. (Sections 3.3 and 4)

The spec I was quoting was from 1999 and that one is from 2014. I found the 1999 one by searching for "http rfc" and the first one that came up was the old one.

I guess there are still some servers that have not been updated to the latest. As a proxy, what should we do? My inclination I guess is to allow the 1999 syntax if it doesn't interfere with the 2014 syntax. But I'm sure there will be strong opinions.

RyanTheOptimist commented 1 year ago

Thanks Ryan! In particular that one says:

The "identity" transfer coding token has been removed. (Sections 3.3 and 4)

The spec I was quoting was from 1999 and that one is from 2014. I found the 1999 one by searching for "http rfc" and the first one that came up was the old one.

Yeah, makes sense. 2616 has this rather amusing header at the top:

Obsoleted by: 7230, 7231, 7232, 7233, 7234, 7235

So it's obsolete at this point.

I guess there are still some servers that have not been updated to the latest. As a proxy, what should we do? My inclination I guess is to allow the 1999 syntax if it doesn't interfere with the 2014 syntax. But I'm sure there will be strong opinions.

My instinct is that Envoy should follow 7230, at least by default. If someone has a use case for a non-7203-compliant use case, we can consider adding a knob for that, I suppose. (I think this is roughly the plan for unified header validation in general. Follow the RFC by default, but add knobs to support non-compliant legacy behavior)

jmarantz commented 1 year ago

That makes sense Ryan. Would it also make sense to have an 1999-compatibility mode that was lenient, and have Envoy strict by default? Then proxies are used in front of backends that might be old could elect to utilize compatibility mode. I am not sure if UHV is all that's needed here or whether we need a specific 1999-compatible mode.

RyanTheOptimist commented 1 year ago

Yeah, I think it's reasonable to have a more lenient mode for dealing with endpoints that support RFC 2616 (from 1999) but not RFC 7230 (from 2014). That being said, there are a lot of changes since RFC 2616 (https://www.rfc-editor.org/rfc/rfc7230#appendix-A.2) and I'm not sure we'd necessarily want to loosen all of them with the same mode, but maybe? But specifically for transfer-encoding: identity and content-length, this seems reasonable to me.

alyssawilk commented 1 year ago

But specifically for transfer-encoding: identity and content-length, this seems reasonable to me.

yeah this one seems fine. allowing generic mixing and matching seems dubious from a security perspective so I'd definitely prefer to err on the side of caution and add exemptions as needed.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.