haproxy / haproxy

HAProxy Load Balancer's development branch (mirror of git.haproxy.org)
https://git.haproxy.org/
Other
4.87k stars 786 forks source link

Haproxy return empty response from server when headers are modified for GET request #2526

Closed exander77 closed 5 months ago

exander77 commented 5 months ago

Detailed Description of the Problem

I have a handler for 404 errors:

core.register_action("report_404", { "http-res" }, function(txn)

When I do:

    if method == "HEAD" then
        txn.http:res_set_header("Content-Length", "0")
    end

Everything works fine and I get:

HTTP/1.1 404 Not Found
date: Thu, 11 Apr 2024 09:04:01 GMT
server: Apache/2.4.58 (Unix) OpenSSL/3.0.11
last-modified: Wed, 08 Mar 2017 13:14:23 GMT
etag: "37b5-54a37e8368ca0"
accept-ranges: bytes
cache-control: max-age=2592000, public
content-type: image/jpeg
content-length: 0

When I do:

    if method == "GET" then
        txn.http:res_set_header("Content-Length", "0")
    end

Then the server just closes the connection:

* Empty reply from server
* Closing connection
* TLSv1.3 (OUT), TLS alert, close notify (256):
curl: (52) Empty reply from server

No error is issued in haproxy.log.

Expected Behavior

Haproxy should return a response the same way it does for HEAD request.

Steps to Reproduce the Behavior

  1. Create a handler to call a function.
  2. Modify Content-Length: 0.
  3. Do server reqeust.

Do you have any idea what may have caused this?

No response

Do you have an idea how to solve the issue?

No response

What is your configuration?

Default.

Output of haproxy -vv

HAProxy version 2.4.22-0ubuntu0.22.04.3 2023/12/04 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.22.html
Running on: Linux 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 x86_64
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -g -O2 -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1 USE_PROMEX=1
  DEBUG   = 

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL +EPOLL -EVPORTS +FUTEX +GETADDRINFO -KQUEUE +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY +LUA -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PRIVATE_CACHE -PROCCTL +PROMEX -PTHREAD_PSHARED -QUIC +RT +SLZ -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=20).
Built with OpenSSL version : OpenSSL 3.0.2 15 Mar 2022
Running on OpenSSL version : OpenSSL 3.0.2 15 Mar 2022
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.6
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.39 2021-10-29
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 11.4.0

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP       side=FE|BE     mux=H2       flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
            fcgi : mode=HTTP       side=BE        mux=FCGI     flags=HTX|HOL_RISK|NO_UPG
              h1 : mode=HTTP       side=FE|BE     mux=H1       flags=HTX|NO_UPG
       <default> : mode=HTTP       side=FE|BE     mux=H1       flags=HTX
            none : mode=TCP        side=FE|BE     mux=PASS     flags=NO_UPG
       <default> : mode=TCP        side=FE|BE     mux=PASS     flags=

Available services : prometheus-exporter
Available filters :
    [SPOE] spoe
    [CACHE] cache
    [FCGI] fcgi-app
    [COMP] compression
    [TRACE] trace

Last Outputs and Backtraces

No response

Additional Information

No response

vkssv commented 5 months ago

Hi Alexander !

Thanks for details !

This behavior seems to be valid as you send GET, which contains "content-length: 0".

According to RFC 7230 HTTP Message Syntax and Routing: "A user agent SHOULD NOT send a Content-Length header field when the request message does not contain a payload body and the method semantics do not anticipate such a body." https://datatracker.ietf.org/doc/html/rfc7230#section-3.3.2

So, since you normally doesn't send any additional data when you do a GET request, the header Content-Length should not be set at all.

If you include a message-body, when doing a GET request, it is considered as a bad practice. RFC 7231 HTTP Semantics and Content states on this: "A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request."

https://datatracker.ietf.org/doc/html/rfc7231#section-4.3.1

capflam commented 5 months ago

It would be easier with the whole script and the configuration. But it seems the response is received from the server.. In case of a response to a HEAD request, the server set the content-length but does not send the payload. It is valid and there is no issue here. But by forcing the content-length to 0, this also works because there is no payload. It is also valid. However, for a response to a GET request, the payload is sent. When you set the content-length to 0, the payload is not removed. This triggers a processing error in HAProxy when it tries to send the response to client because there is a payload while it should not.

On recent versions (>= 2.8), in this case you should have an error is logs. The termination state shoud be ID--. On older versions, the error is also reported but it is probably reported as CD--. In all cases, you cannot change the content-length by hand expecting HAproxy will remove the payload automatically.

exander77 commented 5 months ago

Hi Alexander !

Thanks for details !

This behavior seems to be valid as you send GET, which contains "content-length: 0".

According to RFC 7230 HTTP Message Syntax and Routing: "A user agent SHOULD NOT send a Content-Length header field when the request message does not contain a payload body and the method semantics do not anticipate such a body." https://datatracker.ietf.org/doc/html/rfc7230#section-3.3.2

So, since you normally doesn't send any additional data when you do a GET request, the header Content-Length should not be set at all.

I am not sure if it relates, as GET method semantics definitely anticipates body.

exander77 commented 5 months ago

It would be easier with the whole script and the configuration. But it seems the response is received from the server.. In case of a response to a HEAD request, the server set the content-length but does not send the payload. It is valid and there is no issue here. But by forcing the content-length to 0, this also works because there is no payload. It is also valid. However, for a response to a GET request, the payload is sent. When you set the content-length to 0, the payload is not removed. This triggers a processing error in HAProxy when it tries to send the response to client because there is a payload while it should not.

On recent versions (>= 2.8), in this case you should have an error is logs. The termination state shoud be ID--. On older versions, the error is also reported but it is probably reported as CD--. In all cases, you cannot change the content-length by hand expecting HAproxy will remove the payload automatically.

This sounds more like it. Maybe the issue is with not removing the payload?

What I am basically doing is that I am dropping body from GET response. And the common way to do it is to set Content-Length: 0 to do it. But maybe not in haproxy?

exander77 commented 5 months ago

Then my question really is, how to properly remove content body from response in haproxy.

exander77 commented 5 months ago

Whole code is pretty simple:

core.register_action("report_404", { "http-res" }, function(txn)
    local status = txn.sf:status()
    local method = txn.sf:method()
    local path = txn.get_var(txn, "txn.path")
    txn:Warning("404handler: " .. path)
    if method == "HEAD" and path:match("%.webp$") then
        txn:Warning("404handler body cleared: " .. path)
        txn.http:res_set_header("Content-Length", "0")
    end
    if method == "GET" and path:match("%.webp$") then
        txn:Warning("404handler body cleared: " .. path)
        txn.http:res_set_header("Content-Length", "0")
    end
end)
capflam commented 5 months ago

The issue here is indeed about the response with a payload and a content-length header updated to be 0. It is detected as an internal error by HAProxy. From a server it is a invalid response. But, here it is because of an internal processing, so it is detected as an internal error.

To remove the payload from a lua script, you should use a filter. It is not so easy but it may work. Unfortunately, lua filters are not available in 2.4. You should at least upgrade to 2.6 to do so. Another solution would be to catch the 404 error from the server and replace it by your own one via an http-response return action. It is probably easier this way. For instance:

http-response return status 404 if { status 404 }

The same may also be performed via a lua script. For instance:

function report_404(txn)
    local reply = txn:reply()
    reply:set_status(404, "Bad request")
    reply:add_header("content-length", "0")
    txn:done(reply)
end

core.register_action("report_404", { "http-res" }, report_404);
exander77 commented 5 months ago

local reply = txn:reply() reply:set_status(404, "Bad request") reply:add_header("content-length", "0") txn:done(reply)

This actually works splendidely:

core.register_action("report_404", { "http-res" }, function(txn)
    local status = txn.sf:status()
    local method = txn.sf:method()
    local path = txn.get_var(txn, "txn.path")
    txn:Warning("404handler: " .. path)
    if method == "HEAD" and path:match("%.webp$") then
        txn:Warning("404handler body cleared: " .. path)
        txn.http:res_set_header("Content-Length", "0")
    end
    if method == "GET" and path:match("%.webp$") then
        txn:Warning("404handler body cleared: " .. path)
        local reply = txn:reply()
        reply:set_status(404, "Not Found")
        txn:done(reply)
    end
end)
capflam commented 5 months ago

Ok, fine. I'm closing now.