ory / oathkeeper

A cloud native Identity & Access Proxy / API (IAP) and Access Control Decision API that authenticates, authorizes, and mutates incoming HTTP(s) requests. Inspired by the BeyondCorp / Zero Trust white paper. Written in Go.
https://www.ory.sh/?utm_source=github&utm_medium=banner&utm_campaign=hydra
Apache License 2.0
3.24k stars 357 forks source link

`strip_path` strips the prefix from the final upstream request, not the initial request #1085

Open syserr0r opened 1 year ago

syserr0r commented 1 year ago

Preflight checklist

Describe the bug

We have 1 backend serving 2 applications we wish to protect via different Oathkeeper rules.

Our rules look like:

- id: app1-api
  upstream:
    url: http://backend/api/app1/
    preserve_host: true
    strip_path: '/api/app1'
  match:
    url: https://app1.api.company.com/<.*>

- id: app2-api
  upstream:
    url: http://backend/api/app2/
    preserve_host: true
    strip_path: '/api/app2'
  match:
    url: https://app2.api.company.com/<.*>

The idea being a typical request for https://app1.api.company.com/foo would go to http://backend/api/app1/foo (which works fine without the strip_path: '/api/app1).

However, if the backend generates an absolute url in its response it would be https://app1.api.company.com/api/app1/foo. This means some reqeusts would repeat the /api/app1 prefix ending up with a backend request to http://backend/api/app1/api/app1/foo (which is what happens without the strip_path: '/api/app1).

I had expected the strip_path would save me here allowing /api/app1 to be stripped before forwarding the request to the backend. It appears it does, but it breaks normal usage without the extra prefix.

A request of https://app1.api.company.com/api/app1/foo does get to the backend at http://backend/api/app1/foo, however requests to https://api1.api.company/foo (without the extra prefix) incorrectly have the upstream prefix stripped.

It appears the strip_path is applying to the final request URL and not the original request.

I am not sure what is intended, the docs say:

If set, replaces the provided path prefix when forwarding the requested URL to the upstream URL

Which I read as the incoming request having the replacement, not the final request to the backend.

Reproducing the bug

  1. Run config below
  2. Make request to https://app1.api.company.com/api/app1/foo, this correctly hits http://backend/api/app1/foo
  3. Make request to https://app1.api.company.com/api/app1/foo, this incorrectly hits http://backend/foo

Relevant log output

time=2023-03-30T00:46:55Z level=info msg=started handling request http_request=map[headers:map[accept:*/* accept-encoding:gzip, deflate, br accept-language:en-GB,en;q=0.5 authorization:Value is sensitive and has been redacted. To see the value set config key "log.leak_sensitive_values = true" or environment variable "LOG_LEAK_SENSITIVE_VALUES=true". cookie:Value is sensitive and has been redacted. To see the value set config key "log.leak_sensitive_values = true" or environment variable "LOG_LEAK_SENSITIVE_VALUES=true". dnt:1 origin:https://testing.app1.dc.company.com referer:https://testing.app1.dc.company.com/ sec-fetch-dest:empty sec-fetch-mode:cors sec-fetch-site:same-site sec-gpc:1 user-agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0 x-forwarded-for:x.x.x.x x-forwarded-host:app1-testing.api.company.com x-forwarded-port:443 x-forwarded-proto:https x-forwarded-scheme:https x-real-ip:x.x.x.x x-request-id:1dc05f2f53b2ea36337fb76076f7824f x-scheme:https] host:app1-testing.api.company.com method:GET path:/v0.1/app2-users/update_details/ query:<nil> remote:10.42.121.254:49782 scheme:http]
time=2023-03-30T00:46:55Z level=info msg=Access request granted audience=application granted=true http_host=app1-testing.api.company.com http_method=GET http_url=http://app2-testing-service.app2.svc.cluster.local/v0.1/app2-users/update_details/ http_user_agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0 service_name=ORY Oathkeeper service_version=v0.40.1 subject=d11999a8-6d23-4dda-abd1-517b2a055cc6
time=2023-03-30T00:46:55Z level=info msg=completed handling request http_request=map[headers:map[accept:*/* accept-encoding:gzip, deflate, br accept-language:en-GB,en;q=0.5 authorization:Value is sensitive and has been redacted. To see the value set config key "log.leak_sensitive_values = true" or environment variable "LOG_LEAK_SENSITIVE_VALUES=true". cookie:Value is sensitive and has been redacted. To see the value set config key "log.leak_sensitive_values = true" or environment variable "LOG_LEAK_SENSITIVE_VALUES=true". dnt:1 origin:https://testing.app1.dc.company.com referer:https://testing.app1.dc.company.com/ sec-fetch-dest:empty sec-fetch-mode:cors sec-fetch-site:same-site sec-gpc:1 user-agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0 x-forwarded-for:x.x.x.x x-forwarded-host:app1-testing.api.company.com x-forwarded-port:443 x-forwarded-proto:https x-forwarded-scheme:https x-real-ip:x.x.x.x x-request-id:1dc05f2f53b2ea36337fb76076f7824f x-scheme:https] host:app1-testing.api.company.com method:GET path:/v0.1/app2-users/update_details/ query:<nil> remote:10.42.121.254:49782 scheme:http] http_response=map[headers:map[access-control-allow-credentials:true access-control-allow-origin:https://testing.app1.dc.company.com content-length:3570 content-type:text/html date:Thu, 30 Mar 2023 00:46:55 GMT referrer-policy:same-origin server:uvicorn vary:Origin x-content-type-options:nosniff x-frame-options:DENY] size:3570 status:404 text_status:Not Found took:87.204709ms]

Relevant configuration

- id: app1-api
  upstream:
    url: http://backend/api/app1/
    preserve_host: true
    strip_path: '/api/app1'
  match:
    url: https://app1.api.company.com/<.*>
  authenticators:
    - handler: noop
  authorizer:
    handler: allow
  mutators:
    - handler: noop
  errors:
    - handler: json

- id: app2-api
  upstream:
    url: http://backend/api/app2/
    preserve_host: true
    strip_path: '/api/app2'
  match:
    url: https://app2.api.company.com/<.*>
  authenticators:
    - handler: noop
  authorizer:
    handler: allow
  mutators:
    - handler: noop
  errors:
    - handler: json

Version

helm 0.29 deploying oathkeeper 0.41.0

On which operating system are you observing this issue?

Linux

In which environment are you deploying?

Kubernetes with Helm

Additional Context

No response

syserr0r commented 1 year ago

I believe this patch would fix the problem (by applying the strip_path to just the incoming URL before merging with the upstream URL).

I am not familiar enough with golang to generate the tests otherwise I would submit a PR, although I can try if you would like?

diff --git a/proxy/proxy.go b/proxy/proxy.go
--- proxy/proxy.go
+++ proxy/proxy.go
@@ -176,14 +176,15 @@

    forwardURL := r.URL
    forwardURL.Scheme = backendScheme
    forwardURL.Host = backendHost
-   forwardURL.Path = "/" + strings.TrimLeft("/"+strings.Trim(backendPath, "/")+"/"+strings.TrimLeft(proxyPath, "/"), "/")

    if rl.Upstream.StripPath != "" {
-       forwardURL.Path = strings.Replace(forwardURL.Path, "/"+strings.Trim(rl.Upstream.StripPath, "/"), "", 1)
+       proxyPath = strings.Replace(proxyPath, "/"+strings.Trim(rl.Upstream.StripPath, "/"), "", 1)
    }

+   forwardURL.Path = "/" + strings.TrimLeft("/"+strings.Trim(backendPath, "/")+"/"+strings.TrimLeft(proxyPath, "/"), "/")
+
    r.Host = backendHost
    if rl.Upstream.PreserveHost {
        r.Host = proxyHost
    }
fenech commented 7 months ago

I also found this confusing - I wrote a test case to prove the fix. Should a PR be opened, or is the current behaviour intentional (the strip_path is applied to the upstream path)?