caddyserver / cache-handler

Distributed HTTP caching module for Caddy
Apache License 2.0
235 stars 17 forks source link

Incomplete responses from reverse_proxy (context cancelled) are being cached #73

Open dubo-dubon-duponey opened 4 months ago

dubo-dubon-duponey commented 4 months ago

Doing some tests to use caddy + cache handler to cache deb packages from snapshot.debian.org.

It looks like in some cases, the client connection is being dropped after 10sec, resulting in the upstream connection to be dropped as well, and somehow the fragmentary response is still being saved in cache as valid.

Detailed info:

caddy version
f11c3c9f5a1be082450d64369853e1dacda22dde+modified (17 Aug 23 17:34 UTC)
{
  admin off
  skip_install_trust
  order basicauth after request_header
  order replace after encode
  order cache before rewrite

  log {
    output stdout
    format json
    level debug
  }

  cache {
    log_level debug
    cache_keys {
    .* {
          disable_body
    }
      }

    key {
      disable_body
    }

    stale 31536000s
    ttl 31536000s

    nuts {
      configuration {
      Dir "/data/cache"
      EntryIdxMode 1
      RWMode 0
      SegmentSize 1024
      NodeNum 42
      SyncEnable true
      StartFileLoadingMode 1
      }
    }
  }
}

http://: {
  cache
  reverse_proxy http://snapshot.debian.org {
    header_up Host snapshot.debian.org:80
    header_up -X-Forwarded-*
    header_up -X-Real-IP
    header_down -Server
  }
}

Here is the relevant part of the logs:

{"level":"error","ts":1708923817.4212139,"logger":"http.log.error","msg":"context deadline exceeded","request":{"remote_ip":"10.0.42.13","remote_port":"55452","client_ip":"10.0.42.13","proto":"HTTP/1.1","method":"GET","host":"snapshot.debian.org","uri":"/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_arm64.deb","headers":{"User-Agent":["DuboDubonDuponey/1.0 (apt)"]}},"duration":10.004700584}
{"level":"debug","ts":1708923817.4218266,"logger":"http.handlers.reverse_proxy","msg":"upstream roundtrip","upstream":"snapshot.debian.org:80","duration":10.005120997,"request":{"remote_ip":"10.0.42.13","remote_port":"55452","client_ip":"10.0.42.13","proto":"HTTP/1.1","method":"GET","host":"snapshot.debian.org:80","uri":"/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_arm64.deb","headers":{"Date":["Mon, 26 Feb 2024 05:03:27 UTC"],"User-Agent":["DuboDubonDuponey/1.0 (apt)"]}},"error":"context canceled"}
{"level":"debug","ts":1708923840.918993,"logger":"http.handlers.cache","msg":"Incomming request &{Method:GET URL:/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_amd64.deb Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[If-Range:[Tue, 10 Jan 2023 03:53:37 GMT] Range:[bytes=21-] User-Agent:[DuboDubonDuponey/1.0 (apt)]] Body:{} GetBody:<nil> ContentLength:0 TransferEncoding:[] Close:false Host:snapshot.debian.org Form:map[] PostForm:map[] MultipartForm:<nil> Trailer:map[] RemoteAddr:10.0.42.13:55454 RequestURI:/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_amd64.deb TLS:<nil> Cancel:<nil> Response:<nil> ctx:0xc012447080}"}
{"level":"debug","ts":1708923844.5852408,"logger":"http.handlers.reverse_proxy","msg":"upstream roundtrip","upstream":"snapshot.debian.org:80","duration":3.665729569,"request":{"remote_ip":"10.0.42.13","remote_port":"55454","client_ip":"10.0.42.13","proto":"HTTP/1.1","method":"GET","host":"snapshot.debian.org:80","uri":"/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_amd64.deb","headers":{"If-Range":["Tue, 10 Jan 2023 03:53:37 GMT"],"User-Agent":["DuboDubonDuponey/1.0 (apt)"],"Date":["Mon, 26 Feb 2024 05:04:00 UTC"],"Range":["bytes=21-"]}},"headers":{"Date":["Mon, 26 Feb 2024 05:04:02 GMT"],"Content-Range":["bytes 21-19266935/19266936"],"Permissions-Policy":["interest-cohort=()"],"X-Varnish":["544520629"],"Via":["1.1 varnish (Varnish/6.5)"],"Server":["Apache"],"X-Content-Type-Options":["nosniff"],"X-Xss-Protection":["1"],"Last-Modified":["Tue, 10 Jan 2023 03:53:37 GMT"],"Etag":["\"125fd78-5f1e0d427f099\""],"Cache-Control":["max-age=31536000, public"],"Age":["0"],"X-Frame-Options":["sameorigin"],"Referrer-Policy":["no-referrer"],"Accept-Ranges":["bytes"],"X-Clacks-Overhead":["GNU Terry Pratchett"],"Content-Length":["19266915"]},"status":206}
{"level":"debug","ts":1708924170.954446,"logger":"http.handlers.cache","msg":"Incomming request &{Method:GET URL:/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_amd64.deb Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[User-Agent:[DuboDubonDuponey/1.0 (apt)]] Body:{} GetBody:<nil> ContentLength:0 TransferEncoding:[] Close:false Host:snapshot.debian.org Form:map[] PostForm:map[] MultipartForm:<nil> Trailer:map[] RemoteAddr:10.0.42.13:55486 RequestURI:/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_amd64.deb TLS:<nil> Cancel:<nil> Response:<nil> ctx:0xc000c78a50}"}
{"level":"debug","ts":1708924170.9549901,"logger":"http.handlers.cache","msg":"The stored key GET-http-snapshot.debian.org-/archive/debian/20240220T000000Z/pool/main/g/gcc-12/gcc-12_12.2.0-14_amd64.deb matched the current iteration key ETag &{Matched:true IfNoneMatchPresent:false IfMatchPresent:false IfModifiedSincePresent:false IfUnmodifiedSincePresent:false IfUnmotModifiedSincePresent:false NeedRevalidation:true NotModified:false IfModifiedSince:0001-01-01 00:00:00 +0000 UTC IfUnmodifiedSince:0001-01-01 00:00:00 +0000 UTC IfNoneMatch:[] IfMatch:[] RequestETags:[] ResponseETag:\"125fd78-5f1e0d427f099\"}"}

Hope I am not missing something obvious here.

On the client side, apt will reject the resource (since it does not match the expected hash) - that is how I noticed the issue. Clearly, the resource has been inserted in the cache, but it should have not.

Any insight here?

Thanks in advance.