varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org
Other
3.56k stars 365 forks source link

503 Invalid content-range header on partial response with start-end/* #4089

Open delthas opened 3 months ago

delthas commented 3 months ago

Context:

Expected behavior:

Current behavior:

After delving deeper, this happens in cache_http.c at http_GetContentRange: lo gets set to 10, hi to 19, cl to -1, as expected; but then if (*lo >= cl || *hi >= cl) return (-2); returns -2 and fails the request, because 10 >= -1.

https://github.com/varnishcache/varnish-cache/blob/8f2c3648b82a6e527ef86c1d6f228b94583b9e0a/bin/varnishd/cache/cache_http.c#L954

Not sure I fully understand the function, but should the condition be closer to if (*hi - *lo < cl) return (-2); ?

Here's a MWE VTC. Feel free to reuse.

varnishtest "Content-Range failure reproducer"

server s1 {
    rxreq
    txresp -status 206 -hdr "Transfer-Encoding: chunked" -hdr "content-range: bytes 10-19/*"
    chunkedlen 10
    chunkedlen 0
} -start

varnish v1 -vcl+backend {
    sub vcl_recv {
        return (pass);
    }
} -start

client c1 {
    txreq -hdr "range: bytes=10-19"
    rxresp
    expect resp.status != 503
} -run
dridi commented 3 months ago

Can you please give https://github.com/varnishcache/varnish-cache/pull/4091 a try?

delthas commented 3 months ago

Thanks for the quick fix.

Testing with #4091, the issue looks partially fixed (no more 503 Invalid content-range header); but I'm seeing a very strange behavior. Varnish is sometimes responding with 416 "Requested Range Not Satisfiable" (just after it logs Filters range). When I keep making the same request, it sometimes responds with 416 and sometimes with 206. I can't reproduce it in varnishtest (always 206).

My cURL (repeated many times): curl -v 'http://varnish/file.mp4' -H 'Range: bytes=52364-92712' -H 'Cache-Control: no-cache'

varnishlog (censored) for 206:

*   << BeReq    >> 427600
-   Begin          bereq 427599 pass
-   VCL_use        boot
-   Timestamp      Start: 1711550661.644483 0.000000 0.000000
-   BereqMethod    GET
-   BereqURL       /file.mp4
-   BereqProtocol  HTTP/1.1
-   BereqHeader    Host: varnish
-   BereqHeader    User-Agent: curl/8.6.0
-   BereqHeader    Accept: */*
-   BereqHeader    Range: bytes=52364-92712
-   BereqHeader    Cache-Control: no-cache
-   BereqHeader    X-Varnish: 427600
-   VCL_call       BACKEND_FETCH
-   BereqURL       /file.mp4
-   BereqUnset     Via: 1.1 varnish (Varnish/7.4)
-   BereqUnset     Host: varnish
-   VCL_return     fetch
-   Timestamp      Fetch: 1711550661.644500 0.000016 0.000016
-   BereqHeader    Host: source
-   Timestamp      Connected: 1711550661.644511 0.000028 0.000011
-   BackendOpen    32 source source 80 x 34718 reuse
-   Timestamp      Bereq: 1711550661.644561 0.000077 0.000049
-   BerespProtocol HTTP/1.1
-   BerespStatus   200
-   BerespReason   OK
-   BerespHeader   Date: Wed, 27 Mar 2024 14:44:21 GMT
-   BerespHeader   Accept-ranges: bytes
-   BerespHeader   Content-Range: bytes 52364-92712/*
-   BerespHeader   Transfer-Encoding: chunked
-   BerespHeader   Content-Type: video/mp4
-   Timestamp      Beresp: 1711550661.655290 0.010806 0.010728
-   VCL_call       BACKEND_RESPONSE
-   TTL            VCL 60 0 0 1711550662 uncacheable
-   VCL_return     deliver
-   Timestamp      Process: 1711550661.655341 0.010857 0.000051
-   Filters
-   Storage        malloc Transient
-   Fetch_Body     2 chunked stream
-   BackendClose   32 source recycle
-   Timestamp      BerespBody: 1711550661.655469 0.010985 0.000128
-   Length         40349
-   BereqAcct      444 0 444 310 40349 40659
-   End

*   << Request  >> 427599
-   Begin          req 427598 rxreq
-   Timestamp      Start: 1711550661.644366 0.000000 0.000000
-   Timestamp      Req: 1711550661.644366 0.000000 0.000000
-   VCL_use        boot
-   ReqStart       x 57136 http
-   ReqMethod      GET
-   ReqURL         /file.mp4
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: varnish
-   ReqHeader      User-Agent: curl/8.6.0
-   ReqHeader      Accept: */*
-   ReqHeader      Range: bytes=52364-92712
-   ReqHeader      Cache-Control: no-cache
-   VCL_call       RECV
-   ReqURL         /file.mp4
-   VCL_return     pass
-   VCL_call       HASH
-   VCL_return     lookup
-   VCL_call       PASS
-   VCL_Log        action:miss
-   VCL_return     fetch
-   Link           bereq 427600 pass
-   Timestamp      Fetch: 1711550661.655399 0.011033 0.011033
-   RespProtocol   HTTP/1.1
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Date: Wed, 27 Mar 2024 14:44:21 GMT
-   RespHeader     Accept-ranges: bytes
-   RespHeader     Content-Range: bytes 52364-92712/*
-   RespHeader     Content-Type: video/mp4
-   RespHeader     X-Varnish: 427599
-   RespHeader     Age: 0
-   VCL_call       DELIVER
-   RespUnset      X-Varnish: 427599
-   VCL_return     deliver
-   Timestamp      Process: 1711550661.655452 0.011086 0.000052
-   Filters         range
-   RespHeader     Content-Range: bytes 52364-92712/*
-   RespProtocol   HTTP/1.1
-   RespStatus     206
-   RespReason     Partial Content
-   RespHeader     Connection: keep-alive
-   RespHeader     Transfer-Encoding: chunked
-   Timestamp      Resp: 1711550661.655493 0.011126 0.000040
-   ReqAcct        246 0 246 446 0 446
-   End

varnishlog (censored) for 416:

*   << BeReq    >> 165189    
-   Begin          bereq 165188 pass
-   VCL_use        boot
-   Timestamp      Start: 1711550661.048146 0.000000 0.000000
-   BereqMethod    GET
-   BereqURL       /file.mp4
-   BereqProtocol  HTTP/1.1
-   BereqHeader    Host: varnish
-   BereqHeader    User-Agent: curl/8.6.0
-   BereqHeader    Accept: */*
-   BereqHeader    Range: bytes=52364-92712
-   BereqHeader    Cache-Control: no-cache
-   BereqHeader    X-Varnish: 165189
-   VCL_call       BACKEND_FETCH
-   BereqURL       /file.mp4
-   BereqUnset     Via: 1.1 varnish (Varnish/7.4)
-   BereqUnset     Host: varnish
-   VCL_return     fetch
-   Timestamp      Fetch: 1711550661.048198 0.000051 0.000051
-   BereqHeader    Host: source
-   Timestamp      Connected: 1711550661.048211 0.000065 0.000013
-   BackendOpen    32 source source 80 x 34718 reuse
-   Timestamp      Bereq: 1711550661.048265 0.000118 0.000053
-   BerespProtocol HTTP/1.1
-   BerespStatus   200
-   BerespReason   OK
-   BerespHeader   Date: Wed, 27 Mar 2024 14:44:21 GMT
-   BerespHeader   Accept-ranges: bytes
-   BerespHeader   Content-Range: bytes 52364-92712/*
-   BerespHeader   Transfer-Encoding: chunked
-   BerespHeader   Content-Type: video/mp4
-   Timestamp      Beresp: 1711550661.053141 0.004994 0.004876
-   VCL_call       BACKEND_RESPONSE
-   TTL            VCL 60 0 0 1711550661 uncacheable
-   VCL_return     deliver
-   Timestamp      Process: 1711550661.053237 0.005091 0.000096
-   Filters        
-   Storage        malloc Transient
-   Fetch_Body     2 chunked stream
-   BackendClose   32 source recycle
-   Timestamp      BerespBody: 1711550661.053336 0.005190 0.000098
-   Length         40349
-   BereqAcct      444 0 444 310 40349 40659
-   End            

*   << Request  >> 165188    
-   Begin          req 165187 rxreq
-   Timestamp      Start: 1711550661.047969 0.000000 0.000000
-   Timestamp      Req: 1711550661.047969 0.000000 0.000000
-   VCL_use        boot
-   ReqStart       x 57134 http
-   ReqMethod      GET
-   ReqURL         /file.mp4
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: varnish
-   ReqHeader      User-Agent: curl/8.6.0
-   ReqHeader      Accept: */*
-   ReqHeader      Range: bytes=52364-92712
-   ReqHeader      Cache-Control: no-cache
-   VCL_call       RECV
-   ReqURL         /file.mp4
-   VCL_return     pass
-   VCL_call       HASH
-   VCL_return     lookup
-   VCL_call       PASS
-   VCL_Log        action:miss
-   VCL_return     fetch
-   Link           bereq 165189 pass
-   Timestamp      Fetch: 1711550661.053407 0.005437 0.005437
-   RespProtocol   HTTP/1.1
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Date: Wed, 27 Mar 2024 14:44:21 GMT
-   RespHeader     Accept-ranges: bytes
-   RespHeader     Content-Range: bytes 52364-92712/*
-   RespHeader     Content-Type: video/mp4
-   RespHeader     X-Varnish: 165188
-   RespHeader     Age: 0
-   VCL_call       DELIVER
-   RespUnset      X-Varnish: 165188
-   VCL_return     deliver
-   Timestamp      Process: 1711550661.053432 0.005462 0.000024
-   Filters         range
-   RespHeader     Content-Range: bytes */40349
-   RespProtocol   HTTP/1.1
-   RespStatus     416
-   RespReason     Requested Range Not Satisfiable
-   RespHeader     Content-Length: 0
-   RespHeader     Connection: keep-alive
-   Timestamp      Resp: 1711550661.053485 0.005515 0.000052
-   ReqAcct        246 0 246 447 0 447
-   End            

The varnishlogs are exactly the same (except for stamps and ids), except after the range filter, where in one case Varnish responds with 206, and in the other case it replies with 416. Repeating the cURL varies between the two responses, seemingly randomly.

Edit: Managed to enable the Debug logs and got RANGE_FAIL low range beyond object on 416.

This happens in http_GetRange:

https://github.com/varnishcache/varnish-cache/blob/master/bin/varnishd/cache/cache_http.c#L1010-L1024

    if (len <= 0)
        return (NULL);          // Allow 200 response

    if (*lo < 0) {
        assert(*hi > 0);
        *lo = len - *hi;
        if (*lo < 0)
            *lo = 0;
        *hi = len - 1;
    } else if (len >= 0 && (*hi >= len || *hi < 0)) {
        *hi = len - 1;
    }

    if (*lo >= len)
        return ("low range beyond object");

In both cases, lo = 33318; hi=16769 (at the time of return). However in the 206 case, len = -1, so this returns early, and in the 416 case, len = 16770.

Looks like some race condition?

Edit: Looks like some race condition because adding a sleep in vcl_deliver makes it 406 almost 100% instead of 10%:

    sub vcl_deliver {
        C{
            usleep(100000);
        }C
        return (deliver);
    }

Sadly I still cannot reproduce this in varnishtest.

nigoroll commented 1 month ago

@dridi I would think that, with streaming enabled, we would need to add ObjWaitExtend(..., lo, ...).

dridi commented 1 month ago

This looks harder than I expected given how #4091 is structured.