kubernetes-sigs / gateway-api

Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
https://gateway-api.sigs.k8s.io
Apache License 2.0
1.71k stars 449 forks source link

Add redirect loop protection to HTTPRequestRedirect #1185

Closed rainest closed 2 weeks ago

rainest commented 2 years ago

What would you like to be added: I want the HTTPRequestRedirect to include language indicating how implementations handle redirects that result in loops, where the request URL received is equal to the request URL returned by the filter.

I propose that the spec be amended to include something like

If the result of applying an HTTPRequestRedirect would be a response whose Location header value matches the original request URL, implementations MUST NOT return the redirect request, and MUST instead proxy the request upstream.

Is this behavior desirable, and can implementations honor it easily?

Why this is needed: https://gateway-api.sigs.k8s.io/v1alpha2/references/spec/#gateway.networking.k8s.io/v1alpha2.HTTPRequestRedirectFilter defines an HTTPRoute Filter for issuing HTTP redirects in response that match a set of criteria. My strictest interpretation of the spec is that implementations must honor these absolutely: there are no matching criteria other than the HTTPRoute itself, and if a request matches that HTTPRoute, the implementation must issue the redirect.

Because the redirect filter applies to any matching request, you can easily configure HTTPRoutes that will result in a redirect loop: clients will send a request with some URL, receive a redirect response directing them to the same URL, send a new request to that URL, and receive another redirect to the same URL ad infinitum. For example, this HTTPRoute performs an HTTP->HTTPS redirect for any request to the redirect.example domain:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: http-filter-redirect
spec:
  hostnames:
    - redirect.example
  rules:
    - filters:
      - type: RequestRedirect
        requestRedirect:
          scheme: https
          statusCode: 301

With the naive implementation above, attaching this HTTPRoute to a Gateway with both HTTP and HTTPS Listeners would result in a loop: http://example.redirect/ will be redirected to https://example.redirect and https://example.redirect will also be redirected, back to itself, indefinitely.

AFAIK to avoid this you'd need to create separate HTTPRoutes, one with the redirect and one without, and either bind them to separate Gateways or use a combination of allowedRoutes configuration and namespaced backendRefs to bind them to only one Listener on a single Gateway. I may have overlooked something here, but I don't see another way to ensure they don't collide. I am not aware of any reason users would want to configure a redirect loop, since HTTP-compliant clients will never receive a useful response from a looping URL, so I believe the strict interpretation of the current spec only needlessly complicates configuration.

While HTTP->HTTPS redirects are probably the most common case where sensible configuration will result in a loop, this problem is generic to any redirect configuration: a regex path match that redirects /foo/* to /foo/example will also loop indefinitely, since /foo/example matches the expression. The proposed change avoids loops regardless of the redirect type.

rainest commented 2 years ago

From our (Kong) perspective, we are in favor of avoiding loops, and iffy on the implementation. We support HTTP->HTTPS redirects as a standard part of our routing configuration, and configuring these redirects on dual-stack routes does indeed proxy HTTPS requests upstream.

We do not have standard support for generic redirects, however, and the simplest means we have of implementing non-protocol redirects would be functionality that is not easily able to insert dynamic data from the request (to reuse parts of the original request not changed by the filter). I do not think implementing a proper generic and dynamic redirect functionality will be difficult for us however, it's just not something that's immediately available--we'd need to implement something new even without this change to the spec.

If I'm right that there's no way to avoid this other than separate Gateways or fun allowedRoutes tricks, I that we'd probably actually need an updated HTTPRequestRedirect API to make excludes explicit, something like:

requestRedirect:
  scheme: https
  statusCode: 301
  exclude:
    scheme: https
youngnick commented 2 years ago

I think that the "if the request is the same, the redirect does nothing" rule is a pretty good guideline, that can easily be extended if we add extra functionality in the future.

Additionally, in the case that we're making the redirect exclusions explicit, aren't they always going to be specifying a thing to change about the request (in that example, the scheme), and then a thing to exclude (the scheme again)? Why would you ever want to redirect to a new path, and then exclude the scheme, or redirect the scheme and exclude a path? I agree that in general we should be avoiding implicit config and preferring explicit, but I'm really struggling to see cases where the "if the request is the same, the redirect does nothing" rule doesn't do what we want, and isn't easier to write.

In the community call today, @bowei mentioned the use case of having a redirect also add a header (often used with a CDN). I'd contend that there are two ways we could address that use case using the "same request = no action" rule:

pleshakov commented 2 years ago

I wonder if the HTTPS redirect use case is a suitable example for the redirect loop problem.

For example, what if an HTTPRoute has many rules. This means the user will need to configure the redirect filter for every rule, right?:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: foo-route
spec:
  parentRefs:
  - name: example-gateway # binds both to an HTTP and HTTPS listeners
  hostnames:
  - "foo.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /path1
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
    backendRefs:
    - name: svc-1
      port: 8080
  - matches:
    - path:
        type: PathPrefix
        value: /path2
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
    backendRefs:
    - name: svc-2
      port: 8080
  - matches:
    - path:
        type: PathPrefix
        value: /path3
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
    backendRefs:
    - name: svc-3
      port: 8080

This looks more verbose than having two HTTPRoutes (one that explicitly references an HTTP listener of example-gateway with a redirect for path / and the other that explicitly references an HTTPS listener of example-gateway with all the paths defined), especially when an HTTPRoute has many many paths.

AFAIK to avoid this you'd need to create separate HTTPRoutes, one with the redirect and one without, and either bind them to separate Gateways or use a combination of allowedRoutes configuration and namespaced backendRefs to bind them to only one Listener on a single Gateway.

I wonder why configuring allowedRoutes and namespaced backendRefs is necessary. Could we just choose the right listener in parentRef of the HTTPRoute?

rainest commented 2 years ago

Could we just choose the right listener in parentRef of the HTTPRoute?

Yes--reviewing again while less tired, I glazed over sectionName because it wasn't named listener, but that will allow selecting the desired Listener directly.

This means the user will need to configure the redirect filter for every rule, right?

Yes, that seems inherent to having the filters specified at the rule level. In the case where you have many paths you probably would want to bind the path-differentiated HTTPRoute to your HTTPS Listener and then have a separate HTTP listener HTTPRoute without the paths, only the hostname and a blanket any path rule (or a shared prefix) with the redirect filter.

We do still allow binding to both HTTP and HTTPS Listeners, so the question is whether you have to create that split configuration to avoid loops or if you have the option to create a single HTTPRoute if you don't have many matches and/or aren't aware of the need to split.

we're making the redirect exclusions explicit, aren't they always going to be specifying a thing to change about the request (in that example, the scheme), and then a thing to exclude (the scheme again)?

My expectation is yes: if we have explicit exemption configurations, almost all would use the proposed implicit config. It'd technically allow you to do something silly like exempt only some subset of requests from redirects, but I can't imagine why you would. With that in mind, if we have explicit configuration for this, it makes more sense to add a general avoidLoops: true/false.

We promise that we'll never add anything other than things that can fit in a URI (scheme, hostname, port, path). This would make sense and kind of fit, given that the 3xx series only allow the Location field

This is my expectation as well (and the current behavior): the redirect filter only lets you control the contents of Location since that's all that https://datatracker.ietf.org/doc/html/rfc7231#section-6.4.2 specifies. @bowei do you have examples of what you're referring to? Offhand I don't know of any de facto standard where a 3XX response indicates the original URL in Location but also includes some header instructing the client to change something other than the URL in subsequent requests. If that exists, but is uncommon, would it possibly make sense to use ExtensionRef pointing to something very similar to HTTPRequestRedirectFilter, but with configuration for the header injection and implementation logic to allow loops for that custom filter while avoiding them in the standard HTTPRequestRedirectFilter?

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

mikemorris commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

robscott commented 1 year ago

/remove-lifecycle stale

Xunzhuo commented 1 year ago

/assign

nak3 commented 11 months ago

Hello 👋 Is there any news about this?

HummingMind commented 8 months ago

What a mess. I just spent a full day tryign to figure out how to redirect http://.../something to https://.../something. There seems to be no way to do this with HTTPRoute without getting stuck in a redirect loop. Why would a request that is already HTTPS readirect again to HTTPS (if the host and the path are the same, but only the scheme needs to change)?

Makes no sense. 🤦🏻

robscott commented 8 months ago

@HummingMind that definitely sounds frustrating. Redirect protection is significantly more complicated than it sounds, but our docs should make it more clear that you currently need a separate route for the redirect logic (https://gateway-api.sigs.k8s.io/guides/http-redirect-rewrite/#redirects).

HummingMind commented 8 months ago

@robscott I've read that section in the documentation and it did not clarify much. I found an answer here: https://github.com/cilium/cilium/issues/28796 (will be testing it shortly), where they talk about specifying a sectionName in the parentRefs, to attach to a specific listener for a Gateway, instead of the all the listeners. This, in turn, will require a separate HTTPRoute.

dprotaso commented 5 months ago

Curious what folks think about potentially matching on the scheme so that you wouldn't need two HTTPRoutes for redirects eg.

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: http-filter-redirect
spec:
  hostnames:
    - redirect.example
  rules:
    - matches:
      - scheme: http 
      filters:
      - type: RequestRedirect
        requestRedirect:
          scheme: https
          statusCode: 301

Here the new bit is

  - matches:
    - scheme: http 

Ideally we wouldn't need this but 🤷 - this seems explicity and not to difficult in my mind

dprotaso commented 5 months ago

Seems like there's an issue already for matching scheme via pseudo headers - https://github.com/kubernetes-sigs/gateway-api/issues/2455 - though that might only work for HTTP/2 ?

One thing I'm curious about though is it possible to spoof a scheme?

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 weeks ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 2 weeks ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/gateway-api/issues/1185#issuecomment-2244095426): >The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. > >This bot triages issues according to the following rules: >- After 90d of inactivity, `lifecycle/stale` is applied >- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied >- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed > >You can: >- Reopen this issue with `/reopen` >- Mark this issue as fresh with `/remove-lifecycle rotten` >- Offer to help out with [Issue Triage][1] > >Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community). > >/close not-planned > >[1]: https://www.kubernetes.dev/docs/guide/issue-triage/ Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.