envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.74k stars 4.75k forks source link

Unify header validation across all codecs (H1, H2 and H3) #10646

Open yanavlasov opened 4 years ago

yanavlasov commented 4 years ago

Presently H1 and H2 codecs use header validation which is a mix of the codec specific and some checks from nghttp2 library on top of it. This leads to inconsistencies in header validation across codecs and makes header validation hard to audit.

For more information see design specifications.

This change will include:

  1. Adding header validation according to the HTTP spec.
  2. Converting codecs to use unified header validation facility. This will only be applied for Balsa and oghttp2 codecs.

Deployment plan:

  1. Add opt-in for universal header validation for Balsa and oghttp2 in compatibility mode (fully compatible with the http-parser and nghttp2)
  2. After bake-in interval (i.e. one full release cycle) change the feature to opt-out
  3. Turn-off compatibility features to bring HTTP validation to RFC compliance one by one.
mattklein123 commented 4 years ago

cc @danzh2010 for HTTP/3

danzh2010 commented 4 years ago

QUICHE has its header validation done before handing it to codec. Is it suppose to be following the standard? If so QUICHE should already does so.

mattklein123 commented 4 years ago

@danzh2010 the main thing we have to figure out here is how to share the various checks between all 3 codecs (QUICHE being one). Right now we have consistency issues just between 2, so when we add HTTP/3 the problem is even bigger. Specifically, we will have to implement header underscore ignore/drop in HTTP/3 to match HTTP/1 and HTTP/2 so this will be a forcing function to figure this out.

danzh2010 commented 4 years ago

We can add extra validation steps after QUICHE delivers request header to codec, so it's something doable. But might not worth to check what the standard requires for a 2nd time.

danzh2010 commented 4 years ago

Specifically, we will have to implement header underscore ignore/drop in HTTP/3 to match HTTP/1 and HTTP/2 so this will be a forcing function to figure this out.

As to underscore in header field, why wasn't it an http filter instead?

mattklein123 commented 4 years ago

As to underscore in header field, why wasn't it an http filter instead?

I think mostly for efficiency reasons as this is a potential attack vector, though I suppose in hindsight we could have done this at the HCM layer during decode headers. @yanavlasov @alyssawilk any thoughts here? We can change this later without any config implications though we would probably want to move the stats around.

alyssawilk commented 4 years ago

I think it was added as a commonly needed security fix - I think at some point we could refactor much of the HCM transformation into a shared filter, but in general most security work (making sure there's no smuggling, making sure there's no weird characters) go directly in the base classes. Moving one out would be weird, and moving them all out would be dangerous :-P

danzh2010 commented 4 years ago

What is the security concern for underscore in headers?

mattklein123 commented 4 years ago

What is the security concern for underscore in headers?

Some applications treat _ and - as interchangeable (holdover from CGI days) and this opens up a fairly trivial security bypass vector in some cases. NGINX ignores headers with _ in them by default for this reason.

but in general most security work (making sure there's no smuggling, making sure there's no weird characters) go directly in the base classes.

+1, I don't think this stuff belongs in a filter, but I feel quite strongly that it belongs in common code, probably in the HCM in decodeHeaders. We need to make sure all of our HTTP level checks are done consistently across all codecs, and optimally not done twice (as nghttp2 does a bunch of them by default).

antoniovicente commented 4 years ago

This may be a topic for a separate feature request, but it would be good to run header validation relatively often in debug modes so we can detect cases where filters make modifications that break header invariants. Re-running header checks after each filter that can modify headers and/or before serialization may help us detect issues early (e.i. in integration tests)

alyssawilk commented 3 years ago

I think this now goes beyond codec ingress into "untrusted data ingress"

We should probably be doing the trusted header checks after ext_proc and wasm to make sure 1) no x-envoy headers added by untrusted users 2) no spec violating or standard envoy guards (e.g. no Transfer-Encoding and Content-Length headers coexisting 3) no invalid input (bad characters, whitespace etc)

cc @PiotrSikora @mathetake @gbrail @htuch

htuch commented 3 years ago

Just ext_proc/wasm or all HTTP filters?

antoniovicente commented 3 years ago

For efficiency reasons it may make sense to have options to skip some of these checks after specific filters that are trusted enough to not introduce badness. But effectively yes, it would be good to perform this sort of consistency check after each HTTP filter that can perform request or response header transformations.

PiotrSikora commented 3 years ago

Note that we would want to perform those checks inside the Wasm HTTP filter at the time when a change to headers happens, so that we could return an error back to the caller, not afterwards.

yanavlasov commented 2 years ago

Documenting known differences between header validation for H/1 and H/2 codecs (H/3 later).

Comment checks among the protocol are listed in the table below:

Validation nghttp2 http-parser
Method is one of the hardcoded choices no yes
Authority/URI syntax is according to RFC 3986 section 3.2 no, charset only yes (twice)
Header name character set is validated according to the RFC 7230 section 3.2 yes yes (twice)
Header value character set is validated according to the RFC 7230 section 3.2 yes, reject request if this is pseudo header, drop header otherwise yes, reject request
Content-Length validated to be a number yes yes

In addition each codec performs protocol version specific checks:

H/2 (nghttp2) request and response protocol specific validation:

  1. Check that header name characters are lowercase (reject request if any header name has uppercase character).
  2. :scheme header value character set is validated according to RFC 3986 section 3.1
  3. Validate charset of pseudo headers, disallowed connection-specific headers, "TE: trailers"

H/1 (http-parser) protocol specific validation:

  1. Version in the request line.
  2. Response line syntax is verified.
gbrail commented 2 years ago

It would be great to clarify this. The situation right now with ext_proc is that I think that certain header manipulations will result in an assertion failure that crashes Envoy, which really worries me! It'd be great if we could instead have a standard library that returns a status as to whether a particular header manipulation is valid, and configuration in ext_proc and ext_authz that says what to do with an invalid manipulation.

For example, when Envoy gets a response back from an ext_proc processing server making an invalid request to change a header, we have the choice of ignoring the request to change the header (and logging or incrementing a counter), or ignoring the ext_proc processing server for the rest of the HTTP transaction (we do that if the server sends back response messages in the wrong order). We also have the choice of letting the HTTP transaction keep processing, or failing it with a 500 error, or bombing out Envoy I suppose. All that should be configurable with a default that does not cause Envoy to crash.

Finally, I think that we also need controls on what headers may be changed, so that the proxy has some control over how much it wants to trust the ext_proc processing server. https://github.com/envoyproxy/envoy/issues/14789 addresses these controls, and there's a PR in flight to try and implement them, since the default that we have now is too restrictive for some use cases.

On Wed, Dec 15, 2021 at 7:13 AM yanavlasov @.***> wrote:

Documenting known differences between header validation for H/1 and H/2 codecs (H/3 later).

H/2 (nghttp2) request and response header validation:

  1. header name character set is validated according to the RFC 7230 section 3.2 https://www.rfc-editor.org/rfc/rfc7230#section-3.2
  2. Check that header name characters are lowercase (reject request if any header name has uppercase character).
  3. :authority and host header value character set is validated according to RFC 3986 section 3.2 https://datatracker.ietf.org/doc/html/rfc3986#section-3.2 **Only character set is validated, not the syntax)
  4. :scheme header value character set is validated according to RFC 3986 section 3.1 https://datatracker.ietf.org/doc/html/rfc3986#section-3.1
  5. all other header values character set is validated according to the RFC 7230 section 3.2 https://www.rfc-editor.org/rfc/rfc7230#section-3.2
  6. NOTE: failed checks in the pseudo headers generate an error and failed checks for all other headers caused the header to be dropped (ignored).

H/1 (http-parser) header validation:

  1. Request line is validated:
    • method is one one the hardcoded choices
    • Syntax and character set of the URI is validated (this is double verified by the nghttp2 authority checking function after all headers were parsed, which just checks character set).
    • protocol version syntax is validated
  2. For responses the response line syntax is verified.
  3. header name character set is validated according to the RFC 7230 section 3.2 https://www.rfc-editor.org/rfc/rfc7230#section-3.2
  4. header values character set is validated according to the RFC 7230 section 3.2 https://www.rfc-editor.org/rfc/rfc7230#section-3.2 (this is checked again by H/1 codec using method that does the same check from the nghttp2)
  5. The syntax of the Content-Length is validated (length is a number). Additionally Transfer-Encoding, Upgrade, Connection and Proxy-Connection values are parsed and specific values are recorded in the parser state.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/envoyproxy/envoy/issues/10646#issuecomment-994887866, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAD7I27M7NYD5V4DIE4BZYTURCWB7ANCNFSM4L4RQAKA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

danzh2010 commented 2 years ago

cc @birenroy for oghttp2

htuch commented 2 years ago

@yanavlasov this is a nice summary - it might be easier to read as a table in which rows were equivalent-ish things, e.g. header keys, header values, host/:authority, and then protocol specific.

yanavlasov commented 2 years ago

@gbrail I think this is outside of the scope of this change. Envoy checks that required request or response headers are present and responds with an error status if the check fails. It should not ASSERT. If this is not the case please open a separate issue.

I think a lot of what you have listed is ext_proc specific. However for clarification we could:

  1. Document the required headers in the request or response header maps.
  2. Add API for checking if a header is required for request or response to be proxied and the filter can use this API when deleting headers from the map.

If this sounds good to you, please open a separate issue to track this work.

yanavlasov commented 2 years ago

The unified header validation (UHF) component will do the following:

  1. Validate header map for standard compliance. The validator will be protocol aware and do both validations common to all protocols and protocol specific validations.
  2. Perform header map transformations before request is processed through the filter chain. These include:
    • Path normalization
    • TBD

The UHF component will be implemented as an extension, such that the default behavior can be overridden. The business case for it is transitioning operators of stacks with non-compliant behavior to Envoy based infrastructure.

Development Plan:

  1. Capture header validation behavior in unit tests.
  2. Define API (proto config and internal C++) for the UHF component.
  3. Implement per-request flags that disable header validation for http-parser and nghttp2.
    • nghttp2 will be patched (the patch should be small and exist only until transition to QUICHE H/2 codec)
    • http-parser will be copied into Envoy’s repo as it is no longer maintained in its SoT repo.
  4. Implement flag protected name and value validation common to all protocols and allow it some burn time. Protocol specific validation will still stay in codecs. Remove the runtime flag and start using unified header validation common to all protocols.
  5. Implement flag protected path normalization in the UHF, burn time, flag removal.
  6. Implement flag protected protocol specific header validation in the UHF, burn time, flag removal. These include:
    • H/1, H/0.9 request and response line validation
    • H/2 lowercase header names
    • H/2 pseudo headers, disallowed headers, TE: trailers
    • H/1 TE + CL checks
    • All other H/1 header checks codec_impl.cc

NOTE: steps 4 and 5 can be done in parallel.

danzh2010 commented 2 years ago
  1. Implement per-request flags that disable header validation for http-parser and nghttp2.

We are adding to QUICHE its own header validation logics. Would these verification preferred to be no-op in Envoy?

Do we have an estimated timeline for UHF? QUIC team is pushing hard to add its own header validation because of the existing gap between Envoy Http2 and Http3 validation. But if we are adding UHF soon, QUIC team can adjust the priority.

yanavlasov commented 2 years ago

@danzh2010 The plan is to start working on it early in Q1 22, as it also has impact on H/1 codec transition.

ameily commented 2 years ago

I'll be creating multiple issues to track each task for the larger goal of a unified header validation (uhv) component, based on the tech spec.

ameily commented 2 years ago

I've created the issues. Here is the rough order that I think we will work on them:

  1. https://github.com/envoyproxy/envoy/issues/19749
  2. https://github.com/envoyproxy/envoy/issues/19750
  3. https://github.com/envoyproxy/envoy/issues/19751
  4. https://github.com/envoyproxy/envoy/issues/19752
  5. https://github.com/envoyproxy/envoy/issues/19753
  6. https://github.com/envoyproxy/envoy/issues/19754
  7. https://github.com/envoyproxy/envoy/issues/19755
  8. https://github.com/envoyproxy/envoy/issues/19757
yanavlasov commented 1 year ago

The work on this issue is still in progress and is pending on completion of #29388

huanghuangzym commented 2 months ago

if http header only have key, but no value . can 400 response? like: client (x-forward-protoc: "") -> envoy (response 404) add validate ?