louketo / louketo-proxy

An OpenID Connect / proxy service
Apache License 2.0

Gatekeeper prevents streaming output #645

Open ghost opened 4 years ago

ghost commented 4 years ago

Gatekeeper prevents streaming output

Summary

I have a PHP-based website with a feature that streams partial output to the requester while the script is still running. Below is a simple PHP example of this logic.

<?php
echo "starting script ... you should see output every 1 second<br>\n";

for ($i=1; $i<=50; $i++) {
    echo "index: $i; ob_level: " . ob_get_level() . "; ob_length: ". ob_get_length() . "<br>\n";

    if ($i % 10 === 0) {
        flush();    // push output to the client
        ob_flush(); // also flush PHP's output buffer, if one is active
        sleep(1);
    }
}

When Gatekeeper sits in front of this website, the requester only receives the complete output at the end, rather than the partial output as it is produced.

Environment

Expected Results

I would like a way to configure Gatekeeper to forward partial output to the requester as it is produced, instead of buffering the complete output.

Actual Results

see summary

Steps to reproduce

Additional Information

During my research I found a similar issue with Nginx, where Nginx's internal buffer prevents this kind of logic from working. Perhaps the cause is similar here as well.

Beanow commented 3 years ago

Had a similar, related issue with "streaming" requests.

Use case

Using MJPEG-streamer to expose a webcam.

The way this "stream" is implemented (http handler here) is by sending a Content-Type: multipart/x-mixed-replace;boundary=... multipart response body, then sending an unbounded number of Content-Type: image/jpeg parts as frames for as long as the connection is open. The upstream server flushes response body data whenever it has buffered a frame.

This particular software exposes such a stream on GET /?action=stream.

Expected result

Louketo is able to proxy this request as an unbuffered, infinite response body.

Actual result

While buffering is not an obvious issue here, the stream is closed after 10 seconds.

A workaround for this would be to set --server-write-timeout=0s, to disable the timeout.

Reproducing

Requirements

Start up this compose file with docker-compose up.

# docker-compose.yml
version: '3.7'
services:
  oidc-gate:
    image: quay.io/louketo/louketo-proxy
    command: >-
      --server-write-timeout=0s
      --upstream-url=http://webcam:80
      --listen=:3000
      --enable-default-deny=true
      --discovery-url=https://example-keycloak/auth/realms/local-testing
      --client-id=local-app
      --client-secret=12345678-1234-1234-1234-123456789012
      --encryption-key=AgXa7xRcoClDEU0ZDSH4X0XhL5Qy2Z2j
    ports:
      - 3000:3000

  webcam:
    image: sixsq/mjpg-streamer
    devices:
      # Streams from a V4L2 camera. Like a laptop/usb webcam.
      - /dev/video0
    ports:
      # Runs on :80 internally,
      # we expose 8080 for testing without auth.
      - 8080:80

1. GET http://localhost:8080/?action=stream should serve the upstream server's MJPEG stream correctly.
2. GET http://localhost:3000/?action=stream should first require authentication, then stream indefinitely.
3. Change the --server-write-timeout=0s option to --server-write-timeout=3s and docker-compose up the changes.
4. GET http://localhost:3000/?action=stream will end the stream after 3 seconds.

Additional Information

Other reverse proxy setups do not require changing such a timeout in the first place. For example, I've also proxied this through Caddy, where the reverse_proxy directive needs the option flush_interval -1.

The proxy buffers responses by default for wire efficiency:

  • flush_interval is a duration value that defines how often Caddy should flush the buffered response body to the client. Set to -1 to disable buffering.

This is closer to the problem @pahrens is observing, because Caddy will disable buffering of the response this way. Setting this to -1 also avoids any timeout issues. Caddy will just happily send an infinite response body.

Which makes a lot of sense: no buffers means no latency, no congestion (for the proxy), no memory hogging. It becomes the upstream's and the client's problem to set sane timeouts.
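For reference, the Caddy (v2) configuration for that setup would look roughly like this (hostnames are placeholders, not my actual config):

```
example.com {
    reverse_proxy webcam:80 {
        # -1 disables response buffering entirely,
        # so the infinite MJPEG body streams through.
        flush_interval -1
    }
}
```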

Beanow commented 3 years ago

I've poked a little bit at the provided PHP example.

Using a similar repro as I shared for my use-case:

# docker-compose.yml
version: '3.7'
services:
  oidc-gate:
    image: quay.io/louketo/louketo-proxy
    command: >-
      --server-write-timeout=0s
      --upstream-url=http://php:80
      --listen=:3000
      --enable-default-deny=true
      --discovery-url=https://example-keycloak/auth/realms/local-testing
      --client-id=local-app
      --client-secret=12345678-1234-1234-1234-123456789012
      --encryption-key=AgXa7xRcoClDEU0ZDSH4X0XhL5Qy2Z2j
    ports:
      - 3000:3000

  php:
    image: php:apache
    volumes:
      - ./stream-example.php:/var/www/html/index.php
    ports:
      - 8080:80

Used tech

The way PHP streams the response here is using Transfer-Encoding: chunked. PHP will handle the encoding of this for you through the flush functions.

Additionally, my PHP response included Content-Encoding: gzip and was reported as 596 B, compressed.

Observations

The proxied response also reports Transfer-Encoding: chunked, but no gzip. The reported size is 2.20 KB. So it would appear the proxy has decompressed for us.

Using a curl --raw request with the needed auth cookies shows it also doesn't have the original chunks. Instead I got 3 chunks: 800 bytes, 4 bytes, 0 bytes.

800
starting script ... you should see output every 1 second<br>
index: 1; ob_level: 0; ob_length: <br>
index: 2; ob_level: 0; ob_length: <br>
index: 3; ob_level: 0; ob_length: <br>
index: 4; ob_level: 0; ob_length: <br>
index: 5; ob_level: 0; ob_length: <br>
index: 6; ob_level: 0; ob_length: <br>
index: 7; ob_level: 0; ob_length: <br>
index: 8; ob_level: 0; ob_length: <br>
index: 9; ob_level: 0; ob_length: <br>
index: 10; ob_level: 0; ob_length: <br>
index: 11; ob_level: 0; ob_length: <br>
index: 12; ob_level: 0; ob_length: <br>
index: 13; ob_level: 0; ob_length: <br>
index: 14; ob_level: 0; ob_length: <br>
index: 15; ob_level: 0; ob_length: <br>
index: 16; ob_level: 0; ob_length: <br>
index: 17; ob_level: 0; ob_length: <br>
index: 18; ob_level: 0; ob_length: <br>
index: 19; ob_level: 0; ob_length: <br>
index: 20; ob_level: 0; ob_length: <br>
index: 21; ob_level: 0; ob_length: <br>
index: 22; ob_level: 0; ob_length: <br>
index: 23; ob_level: 0; ob_length: <br>
index: 24; ob_level: 0; ob_length: <br>
index: 25; ob_level: 0; ob_length: <br>
index: 26; ob_level: 0; ob_length: <br>
index: 27; ob_level: 0; ob_length: <br>
index: 28; ob_level: 0; ob_length: <br>
index: 29; ob_level: 0; ob_length: <br>
index: 30; ob_level: 0; ob_length: <br>
index: 31; ob_level: 0; ob_length: <br>
index: 32; ob_level: 0; ob_length: <br>
index: 33; ob_level: 0; ob_length: <br>
index: 34; ob_level: 0; ob_length: <br>
index: 35; ob_level: 0; ob_length: <br>
index: 36; ob_level: 0; ob_length: <br>
index: 37; ob_level: 0; ob_length: <br>
index: 38; ob_level: 0; ob_length: <br>
index: 39; ob_level: 0; ob_length: <br>
index: 40; ob_level: 0; ob_length: <br>
index: 41; ob_level: 0; ob_length: <br>
index: 42; ob_level: 0; ob_length: <br>
index: 43; ob_level: 0; ob_length: <br>
index: 44; ob_level: 0; ob_length: <br>
index: 45; ob_level: 0; ob_length: <br>
index: 46; ob_level: 0; ob_length: <br>
index: 47; ob_level: 0; ob_length: <br>
index: 48; ob_level: 0; ob_length: <br>
index: 49; ob_level: 0; ob_length: <br>
index: 50; ob_level: 0; ob_length: <
4
br>

0

Now,

Transfer-Encoding is a hop-by-hop header, that is applied to a message between two nodes, not to a resource itself. Each segment of a multi-node connection can use different Transfer-Encoding values. If you want to compress data over the whole connection, use the end-to-end Content-Encoding header instead.

This suggests that removing the chunks and buffering is perfectly within spec. However, decompressing Content-Encoding: gzip and removing the header is not allowed according to the spec. Relates to: https://github.com/louketo/louketo-proxy/issues/642

Beanow commented 3 years ago

Digging into this more, I found the issue.

Cause: default flushing behaviour

The proxy dependency goproxy does not control how flushing is done, so it defaults to what the Go standard library implements for both the upstream and downstream connections.

https://github.com/elazarl/goproxy/blob/0581fc3aee2d07555835bed1a876aca196a4a511/proxy.go#L180 The io.Copy of the body here will copy the data as soon as it's available (each chunk PHP flushes), but by default Go only flushes to the downstream client once an internal buffer fills, regardless of how long that takes.

Flushing manually

To flush sooner, the http.ResponseWriter may also implement http.Flusher, in which case we can explicitly call .Flush(). See https://stackoverflow.com/a/30603654

That would flush whatever is in the buffer so far, which may produce different chunk sizes than PHP originally gave us (though that's acceptable under the HTTP spec).

Adopting flush_interval from Caddy

Caddy is also written in Go, so we can compare implementations. Rather than io.Copy, they use this: https://github.com/caddyserver/caddy/blob/e385be922569c07a0471a6798d4aeaf972facb5b/modules/caddyhttp/reverseproxy/streaming.go#L126

Which may be an interesting addition to goproxy. But I also managed to implement this as middleware when using goproxy directly: by wrapping the response writer and implementing io.ReaderFrom, io.Copy will use our implementation, which can be based on Caddy's flushing rules.

I'll have a go at porting that to louketo next.

Beanow commented 3 years ago

I got some of the way there with my WIP; feel free to check it out and use it. But given #683's sunsetting of the project, I won't put in the work to turn it into a PR.