DeepLcom / deepl-api-issues

Issue tracking repository for the DeepL API.
MIT License

API performs internal redirect instead of proper processing when receiving HOST header #20

Closed jblossey closed 1 month ago

jblossey commented 1 month ago

Summary:

Your API reacts to a client-supplied Host header by apparently redirecting the request somewhere internally and then returning an HTML page instead of processing it. I think this is a potential security risk, because it looks like undefined behavior.

Use-Case:

While developing an app that uses your API, I had to set up a request proxy: a Traefik router that forwards requests from our client to your API. The vanilla setup looks like this:

# static traefik config "traefik.yml"
entryPoints:
  web:
    address: :8080
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: :4443

log:
  level: DEBUG

accessLog:
  filePath: "/var/log/traefik/access-log.json"
  format: json
  bufferingSize: 100  # Optional: Adjust buffering if necessary
  fields:
    defaultMode: "keep"
    names:
      StartUTC: "keep"
      RequestMethod: "keep"
      RequestPath: "keep"
      RequestProtocol: "keep"
      DownstreamStatus: "keep"
      DownstreamStatusLine: "keep"
      Duration: "keep"
      OriginDuration: "keep"
      Overhead: "keep"
      RequestHeaders: "keep"
      ResponseHeaders: "keep"
      ServiceURL: "keep"
      ClientAddr: "keep"
      RequestAddr: "keep"
      OriginStatusLine: "keep"
      TLSVersion: "keep"
      TLSCipher: "keep"
      TLSClientSubject: "keep"
    headers:
      defaultMode: "keep"
      names:
        User-Agent: "keep"
        Authorization: "redact"  # Redact sensitive headers
        Content-Type: "keep"

providers:
  file:
    filename: /etc/traefik/dynamic.yml
#    watch: true
  docker:
    endpoint: "unix:///var/run/docker.sock"

serversTransport:
  insecureSkipVerify: true

# dynamic vanilla config "dynamic.yml"
tls:
  certificates:
    - certFile: /etc/ssl/cert.pem
      keyFile: /etc/ssl/key.pem
  options:
    default:
      minVersion: VersionTLS12
      cipherSuites:
        - "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
        - "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
        - "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
      sniStrict: true

http:
  routers:
    to-deepl:
      rule: Host(`localhost`)
      service: deepl
      middlewares:
      - add-deepl-headers
      entryPoints:
      - websecure
      tls: {}

  middlewares:
    add-deepl-headers:
      headers:
        customRequestHeaders:
          Authorization: DeepL-Auth-Key <REDACTED>
          Content-Type: "application/json"
          Accept: "application/json"

  services:
    deepl:
      loadBalancer:
        servers:
        - url: "https://api.deepl.com/v2/translate"

You can start this config locally with:

# create self-signed certificates, then start the traefik router
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes -subj '/CN=localhost' && \
docker run --network host \
  -v "$(pwd)/traefik.yml":/etc/traefik/traefik.yml \
  -v "$(pwd)/dynamic.yml":/etc/traefik/dynamic.yml \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$(pwd)/logs":/var/log/traefik \
  -v "$(pwd)/cert.pem":/etc/ssl/cert.pem \
  -v "$(pwd)/key.pem":/etc/ssl/key.pem \
  traefik:v3.0.0

And request using cURL like this:

curl -X POST --location -v -k 'localhost:8080' \
--header 'Content-Type: application/json' \
--data '{
    "text": [
        "Hello, world!"
    ],
    "target_lang": "DE"
}'

Expected Behaviour:

The deepl api should not care about a host header when it receives it from a client and properly perform the requested translation.

Actual behaviour:

Because Traefik by default forwards the client's original Host header (here localhost) to the backend, the DeepL API returns a 503 with the following HTML:

<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"/><title>Page Load Error</title><style>
.button{background-color:#0f2b46;border:none;color:#fff;padding:15px;text-align:center;text-decoration:none;display:inline-block;font-size:16px;margin:4px 2px;cursor:pointer;border-radius:8px;max-width:100px}div.content{margin:center;width:500px}button,h1,object,p{text-align:center;font-family:-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,Helvetica,Arial,sans-serif!important}.centering{display:flex;justify-content:center;align-items: center;flex-direction: column}#robot{max-width:200px}#reload-button{padding:2em}
</style><link rel="icon" type="image/png" href="https://static.deepl.com/img/favicon/favicon_16.png" sizes="16x16" /><link rel="icon" type="image/png" href="https://static.deepl.com/img/favicon/favicon_32.png" sizes="32x32" /><link rel="icon" type="image/png" href="https://static.deepl.com/img/favicon/favicon_96.png" sizes="96x96" /></head>
<body><main><div class=centering><img width="100px" src="https://static.deepl.com/img/logo/DeepL_Logo_darkBlue_v2.svg" />

<h1>We're sorry!</h1>
<div class=centering><img id="robot" src="https://static.deepl.com/img/404/robot.svg"/></div>
<p>Something went wrong. We're working on it 👷</p>
<p>Try reloading this page or come back later.</p>
<div class=centering id=reload-button><button onClick="location.reload();" class="button">Reload</button></div>
</main></body></html>
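
To make the cause concrete, here is a rough Python model of how a reverse proxy chooses the Host header it sends upstream. It is only a sketch of the behavior controlled by Traefik's passHostHeader option (enabled by default), not Traefik's actual code; the function name upstream_host is made up for illustration.

```python
from urllib.parse import urlsplit


def upstream_host(client_host: str, server_url: str, pass_host_header: bool = True) -> str:
    """Return the Host header value a reverse proxy would send upstream.

    Simplified, hypothetical model of a passHostHeader-style switch.
    """
    if pass_host_header:
        # The client's Host header is forwarded unchanged.
        return client_host
    # Otherwise the Host header is derived from the backend URL.
    return urlsplit(server_url).netloc


if __name__ == "__main__":
    url = "https://api.deepl.com/v2/translate"
    print(upstream_host("localhost", url))         # localhost
    print(upstream_host("localhost", url, False))  # api.deepl.com
```

With the default setting, the backend sees Host: localhost, which is what triggers the error page above.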

Suggested Fix (server-side/deepl-api-side):

Don't use the client-supplied Host header to drive anything internal in your API; I would rate that as a potential security risk. Record the client headers when the request first arrives and set your own headers for any internal hops. Otherwise one of your servers apparently gets confused.

Suggested Fix (client side):

For anyone stumbling across the same problem, just tell Traefik not to pass the Host header with a simple config change:

  services:
    deepl:
      loadBalancer:
        passHostHeader: false # <-- incredibly important: the DeepL API runs into undefined behavior if the client's host header is passed through to it.
        servers:
        - url: "https://api.deepl.com/v2/translate"

JanEbbing commented 1 month ago

Hi, thanks for your report, but I will close this issue.

The deepl api should not care about a host header when it receives it from a client and properly perform the requested translation.

  1. We cannot ignore the Host header. See for example here or here:

The "Host: hostname" header value distinguishes between various DNS names sharing a single IP address, allowing name-based virtual hosting. While optional in HTTP/1.0, it is mandatory in HTTP/1.1.

You can check other websites/APIs (e.g. Google); they will not ignore the Host field either.

  2. The error you observe is intentional and not "redirecting somewhere internally", and hence not a security risk. As we take security seriously, I specifically checked this with the responsible team.
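
To illustrate the first point, name-based virtual hosting can be sketched in a few lines of Python. This is a toy model, not DeepL's actual setup: the host table, the handler, and the route function are hypothetical, and the 503 status simply mirrors what the issue reports for an unknown host.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]


def translate_api(path: str) -> Response:
    # Stand-in for the real API backend (hypothetical).
    return 200, f"translation for {path}"


# Hypothetical virtual-host table: one IP address, several host names.
VHOSTS: Dict[str, Callable[[str], Response]] = {
    "api.deepl.com": translate_api,
}


def route(host_header: str, path: str) -> Response:
    """Pick a backend based on the Host header (simplified, no IPv6 literals)."""
    host = host_header.split(":", 1)[0].lower()  # drop an optional port
    handler = VHOSTS.get(host)
    if handler is None:
        # No site is configured for this name, so an error page is returned.
        return 503, "<html>Page Load Error</html>"
    return handler(path)


if __name__ == "__main__":
    print(route("api.deepl.com", "/v2/translate"))  # (200, 'translation for /v2/translate')
    print(route("localhost", "/v2/translate"))      # (503, '<html>Page Load Error</html>')
```

A request that arrives with Host: localhost matches no configured name, so the server cannot know which site or API was meant.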