reyk / relayd

OpenBSD relayd daemon -experimental

Relayd seems to strip away content from http payload. #12

Open noqqe opened 8 years ago

noqqe commented 8 years ago

Hi,

I just set up a small reverse-proxy-like configuration for some of my Python (werkzeug) instances:

# Macros
#
ext="0.0.0.0"

table <local1> { 127.0.0.1 }
table <local2> { 127.0.0.1 }
table <local3> { 127.0.0.1 }
table <local4> { 127.0.0.1 }
table <local5> { 127.0.0.1 }

http protocol httpPara {

  # tcp {nodelay, sack, socket buffer 65535, backlog 128}
  return error

  match request path "/sfw/**" forward to <local1>
  match request path "/nsfw/**" forward to <local2>
  match request path "/kadsen/**" forward to <local3>
  match request path "/pr0/**" forward to <local4>
  match request path "/demo/**" forward to <local5>

  pass
}

relay nichtparasoup {
  listen on $ext port 80
  protocol "httpPara"
  forward to <local1> port 5000 check http '/sfw/status' code 200
  forward to <local2> port 5002 check http '/nsfw/status' code 200
  forward to <local3> port 5003 check http '/kadsen/status' code 200
  forward to <local4> port 5004 check http '/pr0/status' code 200
  forward to <local5> port 5006 check http '/demo/status' code 200
}

The problem is that relayd seems to cut off the HTML content after a certain number of bytes. I am not sure where this comes from. In the browser I see uninterpreted HTML that ends with this:

<dt>next image</dt>
<dd class="button">k</dd><dt>next-to-last image<

Where it should print:

<dt>next image</dt>
<dd class="button">k</dd><dt>next-to-last image</dt></dl><span class="button" id="keys">hot keys</span></footer></body>
</html>

Curling the instance directly (on the backend port, bypassing relayd) shows:

curl -v --silent  http://foo.example.org:5003/kadsen/ >/dev/null
*   Trying 11.11.11.11...
* Connected to foo.example.org (11.11.11.11) port 5003 (#0)
> GET /kadsen/ HTTP/1.1
> Host: foo.example.org:5003
> User-Agent: curl/7.47.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/html; charset=utf-8
< Content-Length: 57338
< Server: Werkzeug/0.11.10 Python/2.7.11
< Date: Thu, 18 Aug 2016 11:53:48 GMT
<
{ [14338 bytes data]
* Closing connection 0

And the same curl through relayd (note the bytes transferred):

$ curl -v --silent  http://foo.example.org/kadsen/ >/dev/null
*   Trying 11.11.11.11...
* Connected to foo.example.org (11.11.11.11) port 80 (#0)
> GET /kadsen/ HTTP/1.1
> Host: foo.example.org
> User-Agent: curl/7.47.0
> Accept: */*
>
{ [40 bytes data]
* Connection #0 to host foo.example.org left intact

Anyway, I don't get a working website delivered when using relayd with the http protocol, and I am not sure what to do now. I took a look at the tcp socket buffer configuration, but that does not seem to be the right place, does it?
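The byte counts in the two curl runs can also be compared mechanically. The following is a minimal Python sketch, not something relayd or curl provides: `body_shortfall` is a hypothetical helper that takes the raw bytes of one complete response (for example, captured with a plain socket) and reports how many body bytes are missing relative to the declared Content-Length.

```python
def body_shortfall(raw: bytes) -> int:
    """Return the declared Content-Length minus the body bytes present in raw.

    Hypothetical helper for illustration only. raw is everything read from
    the connection for a single response. Raises ValueError if the header
    block is not terminated or carries no Content-Length header.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no end of header block found")
    for line in head.split(b"\r\n"):
        # A status line contains no colon, so it never matches here.
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return int(value.strip().decode("ascii")) - len(body)
    raise ValueError("no Content-Length header")
```

A result of 0 means the full body arrived; a positive number means the body was cut short by that many bytes. A ValueError means the header block itself is already damaged.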

noqqe commented 8 years ago

Okay, I found some more interesting details. If you look closely at the following two requests, the result coming from relayd is not really an HTTP response. It looks more like just the payload in plain text, including some of the headers.

What it should look like (query without relayd):

curl -v localhost:5003/kadsen/status ; echo
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5003 (#0)
> GET /kadsen/status HTTP/1.1
> Host: localhost:5003
> User-Agent: curl/7.47.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/plain; charset=utf-8
< Content-Length: 268
< Server: Werkzeug/0.11.10 Python/2.7.11
< Date: Fri, 19 Aug 2016 07:29:35 GMT
<
images cached: 56 (816 bytes) - already crawled: 75 (776 bytes)
         Reddit - aww             with factor  1.0: 22 Images |###*
         Reddit - cats            with factor  1.0: 18 Images |###
         Reddit - aww_gifs        with factor  1.0: 16 Images |###
* Closing connection 0

What relayd prints out:

$ curl -v localhost/kadsen/status ; echo
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /kadsen/status HTTP/1.1
> Host: localhost
> User-Agent: curl/7.47.0
> Accept: */*
>
Server: Werkzeug/0.11.10 Python/2.7.11
Date: Fri, 19 Aug 2016 07:29:11 GMT

images cached: 56 (816 bytes) - already crawled: 75 (776 bytes)
         Reddit - aww             with factor  1.0: 22 Images |###*
         Reddit - cats            with factor  1.0: 18

Note the missing HTTP 200 status line, and note the missing < prefix on the header lines when the response is received.
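To make that difference checkable, here is a small Python sketch that tests whether the first bytes of a response form an HTTP status line. The name `starts_with_status_line` and the regular expression are illustrative assumptions, not part of relayd, curl, or Werkzeug.

```python
import re

# Matches the start of a status line such as "HTTP/1.0 200 OK".
STATUS_LINE = re.compile(rb"HTTP/\d\.\d \d{3}(?: |\r?\n)")


def starts_with_status_line(raw: bytes) -> bool:
    """Return True if raw begins with an HTTP/1.x status line."""
    return STATUS_LINE.match(raw) is not None
```

Fed the first bytes of the two responses above, this returns True for the direct backend reply and False for the reply through relayd, which starts with the Server header instead.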

hrkfdn commented 6 years ago

I might have the same problem with monit, see here: https://marc.info/?l=openbsd-misc&m=152388670603181&w=2

Did you find any solution for this? I don't get this when relaying to httpd.

noqqe commented 6 years ago

@hrkfdn No, I did not find anything. I just dropped relayd and replaced it with nginx.