darkweak / souin

An HTTP cache system, RFC compliant, compatible with @tyktechnologies, @traefik, @caddyserver, @go-chi, @bnkamalesh, @beego, @devfeel, @labstack, @gofiber, @go-goyave, @go-kratos, @gin-gonic, @roadrunner-server, @zalando, @zeromicro, @nginx and @apache
https://docs.souin.io
MIT License
712 stars 56 forks source link

[bug] blank page cached on context canceled #337

Closed frederichoule closed 2 months ago

frederichoule commented 1 year ago

I have a recurring error where the homepage of the website is blank.

When I check the key value in redis, it looks like this:

HTTP/0.0 200 OK
Date: Fri, 21 Apr 2023 18:44:19 GMT
X-Souin-Stored-Ttl: 24h0m0s
Content-Length: 0

What strikes to me is the key name, which is GET-https-www.website.com-/, which all the other keys have {-VARY-} appended to them. If I delete the key, and refresh the website, the new key that appears is GET-https-www.website.com-/{-VARY-}Accept-Encoding:gzip, deflate, br, and now the page isn't blank anymore.

What would cause that?

rzv-me commented 9 months ago

I have been struggling with a similar problem as @frederichoule had: panic: Header called after Handler finished

The problem is that there is not check in CustomWriter.Header() if the request was cancelled, which is allowed in HTTP2.

I worked on a way to reproduce it, and I have this python script

import threading
import socket
from h2.connection import H2Connection
from h2.events import RequestReceived, StreamReset
from h2.config import H2Configuration
import ssl
from h2.errors import ErrorCodes

def root_function(url='localhost'):
    while True:
        try:
            # Create a TCP connection
            sock = socket.create_connection((url, 4433))
            print(f"Create a TCP connection")
            # Wrap the socket for TLS
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            ctx.set_alpn_protocols(['h2'])
            sock = ctx.wrap_socket(sock, server_hostname=url)
            print(f"Wrap socket for TLS")
            # Make sure we're using HTTP/2
            assert sock.selected_alpn_protocol() == 'h2'

            # Create HTTP/2 connection
            config = H2Configuration(client_side=True)
            conn = H2Connection(config=config)
            conn.initiate_connection()
            sock.sendall(conn.data_to_send())

            # # Create a new stream
            # stream_id = conn.get_next_available_stream_id()
            # conn.send_headers(
            #     stream_id,
            #     [(':method', 'GET'), (':authority', url), (':path', '/'), (':scheme', 'https')],
            # )
            # sock.sendall(conn.data_to_send())

            # Read some data
            while True:
                stream_id = conn.get_next_available_stream_id()
                conn.send_headers(
                    stream_id,
                    [(':method', 'GET'), (':authority', url), (':path', '/mycachepath'), (':scheme', 'https')],
                )
                sock.sendall(conn.data_to_send())
                #data = sock.recv(65535)
                #if not data:
                    #break
                conn.reset_stream(stream_id, error_code=ErrorCodes.CANCEL)
                events = conn.receive_data(data)
                for event in events:
                     if isinstance(event, RequestReceived):
                        # Cancel the stream with error code for CANCEL
                         conn.reset_stream(event.stream_id, error_code=ErrorCodes.CANCEL)
                     elif isinstance(event, StreamReset):
                        print(f"Stream {event.stream_id} cancelled.")

                sock.sendall(conn.data_to_send())
        except Exception as e:
            print(f"An error occurred: {e}")

# Create 50 threads running root_function
threads = []
for i in range(1):
    thread = threading.Thread(target=root_function)
    thread.start()
    threads.append(thread)
    print(f"Created thread {i}")

# Keep the main thread alive
for thread in threads:
    thread.join()%

The caddy file is

{
    debug
    order cache before reverse_proxy
    cache {
        log_level debug
    }
}
localhost:4433 {
    tls internal
    encode gzip
    reverse_proxy /mycachepath https://[YOURBACKENDHOST] {
        header_up Host [YOURBACKENDHOST]
        header_up User-Real-IP {remote_host}
        header_up User-Real-Proto {scheme}
        header_up X-Forwarded-For {remote_host}
        header_up X-Forwarded-Port {server_port}
        header_up X-Forwarded-Proto {scheme}
        header_up X-Forwarded-Host {host}
        health_timeout 5s
        # header_down -Cache-Control
    }
    @matchCachePath path /mycachepath
    header /mycachepath >Cache-Control "must-revalidate, no-cache, no-store, private"
    cache @matchCachePath {
        ttl 3s
        stale 3s
        # log_level debug
        mode bypass
        key {
            disable_body
            disable_host
            disable_method
        }
    }
}

The git diff for the fix is this:

diff --git a/pkg/middleware/writer.go b/pkg/middleware/writer.go
index a786c8f..c108c98 100644
--- a/pkg/middleware/writer.go
+++ b/pkg/middleware/writer.go
@@ -43,6 +43,10 @@ type CustomWriter struct {
 func (r *CustomWriter) Header() http.Header {
        r.mutex.Lock()
        defer r.mutex.Unlock()
+
+       if (r.Req.Context().Err() != nil) {
+               return http.Header{}
+       }
        if r.headersSent {
                return http.Header{}
        }

The problem is that now I get another error

fatal error: concurrent map writes

goroutine 216 [running]:
github.com/caddyserver/caddy/v2.(*Replacer).Set(...)
    github.com/caddyserver/caddy/v2@v2.7.6/replacer.go:68
github.com/caddyserver/caddy/v2/modules/caddyhttp.(*HTTPErrorConfig).WithError(0x106887c28, 0x140157a9700, {0x106887c28?, 0x107bdcee0?})
    github.com/caddyserver/caddy/v2@v2.7.6/modules/caddyhttp/server.go:656 +0x1a4
github.com/caddyserver/caddy/v2/modules/caddyhttp.(*Server).ServeHTTP(0x140006d2580, {0x10689a450, 0x1400008a8c0}, 0x14015826500)
    github.com/caddyserver/caddy/v2@v2.7.6/modules/caddyhttp/server.go:368 +0x1038
net/http.serverHandler.ServeHTTP({0x0?}, {0x10689a450?, 0x1400008a8c0?}, 0x104266a40?)
    net/http/server.go:2938 +0xbc
net/http.initALPNRequest.ServeHTTP({{0x1068a0230?, 0x140157a7f80?}, 0x140157d2000?, {0x140004f64b0?}}, {0x10689a450, 0x1400008a8c0}, 0x14015826500)
    net/http/server.go:3546 +0x1b4
golang.org/x/net/http2.(*serverConn).runHandler(0x107c5b4c0?, 0x0?, 0x0?, 0x1068789e8?)
    golang.org/x/net@v0.19.0/http2/server.go:2368 +0xc0
created by golang.org/x/net/http2.(*serverConn).scheduleHandler in goroutine 232
    golang.org/x/net@v0.19.0/http2/server.go:2303 +0x208

And it is always in replacer.go:68

Hope this helps you

darkweak commented 9 months ago

@rzv-me what is your running version? Did you try with the 5a88d84c26128d43e37c614c83c3d3e53da2afd1 commit?

xcaddy build --with github.com/darkweak/souin/plugins/caddy@5a88d84c26128d43e37c614c83c3d3e53da2afd1 --with github.com/darkweak/souin@5a88d84c26128d43e37c614c83c3d3e53da2afd1
rzv-me commented 9 months ago

Hi @darkweak,

I am building with the latest version from master, which seems to be the same as the commit hash you sent

Go version is 1.21.5 and caddy v2.7.6