slimhazard commented 5 months ago

Expected Behavior

What's New for Varnish 7.3 says of ESI includes:

... only objects with 200 and 204 status will be included and any other status code will fail the parent ESI request.

If objects with other status should be delivered, they should have their status changed to 200 in VCL, for instance in sub vcl_backend_error{}, vcl_synth{} or vcl_deliver{}.

So we expected that this snippet in vcl_deliver{} would prevent failure of the including parent in all cases:

sub vcl_deliver {
    if (req.esi_level > 0 && resp.status != 200 && resp.status != 204) {
        set resp.status = 200;
    }
}

Current Behavior

Even with the VCL snippet shown above, the parent ESI can evidently fail under certain circumstances, in fact apparently in default configuration. I'll add a VTC test case, based on Varnish's current e00035.vtc, to illustrate that.

The VTC suggests that beresp.do_stream may be related to the problem. If beresp.do_stream is set to false and the VCL snippet is included in vcl_deliver, then as expected, the parent response does not fail, and the client response body is not truncated. If the beresp.do_stream == true, which is the default, then the parent response fails and the body is truncated, even with the vcl_deliver snippet.

That would at least make sense to me, since it's familiar that when streaming is on, a client response status may be seen as 200, even when the fetch actually failed (for example when there were too few bytes for the value in Content-Length), and would have been set to 503 if streaming had been off. On that logic, vcl_deliver doesn't "see" resp.status != 200 when that happens.

But the VTC shows a bit of strangeness, because it also seems to depend on whether or not beresp.do_esi is conditionally or unconditionally set to true in VCL. The problem happens if beresp.do_esi is set conditionally, as in e00035.vtc. If VCL just says set beresp.do_esi = true, then the problem doesn't happen in the VTC.

Obviously that takes some explanation. I'll go through the expectations when I add the VTC test case.

Possible Solution

In vcl_backend_response, set beresp.do_esi = true unconditionally. It may also be necessary to set beresp.do_stream = false (but in that case we lose the advantages of streaming).

Steps to Reproduce (for bugs)

I'll add a VTC test case to the issue, and describe what we expected to see.

Context

The site makes very extensive use of ESI, and has been prepared to handle errors in ESI subrequests for quite some time before the attempted migration to Varnish 7.x.

We know of onerror="continue" and -p feature=+esi_include_onerror. The Varnish team can control the -p parameter, but using the onerror attribute in responses depends on all of the developer teams (about 50 of them) always remembering to do so, without exception for every ESI include, now and in the future. One omission and the entire response fails. So the VCL snippet in vcl_deliver is intended to safeguard against that.

Varnish Cache version

varnishd (varnish-7.4.2 revision cd1d10ab53a6f6115b2b4f3b2a1da94c1f749f80)

Operating system

Debian bookworm (12.4)

Source of binary packages used (if any)

Both packagecloud and build from github source

slimhazard commented 5 months ago

Github outrageously refuses to accept attachments with the .vtc extension. So I'll have to include it verbatim in the comment. Comments to follow.

varnishtest "Parent ESI fails despite setting ESP resp.status = 200"

server s1 {
    rxreq
    expect req.url == "/abort"
    txresp  -hdr {surrogate-control: content="ESI/1.0"} \
        -body {before <esi:include src="/fail"/> after}

    rxreq
    expect req.url == "/fail"
    txresp -hdr "content-length: 100" -nolen
    delay 0.1
} -start

varnish v1 -cliok "param.set feature +esi_disable_xml_check"

## Doesn't matter whether this is turned on.
# varnish v1 -cliok "param.set feature +esi_include_onerror"

varnish v1 -vcl+backend {
    sub vcl_backend_response {
        # Prevent cache hits in the next tests.
        set beresp.uncacheable = true;

        ## WEIRD
#       set beresp.do_esi = true;
        set beresp.do_esi = beresp.http.surrogate-control ~ "ESI/1.0";
        unset beresp.http.surrogate-control;
    }

    sub vcl_deliver {
        if (req.esi_level > 0 && resp.status != 200 &&
            resp.status != 204) {
            set resp.status = 200;
        }
    }
} -start

client c1 {
    txreq -url "/abort"
    non_fatal
    rxresp
    expect resp.body == "before "
} -run

server s1 -wait
server s1 -start

varnish v1 -vcl+backend {
    sub vcl_backend_response {
        # Prevent cache hits in the next tests.
        set beresp.uncacheable = true;

        set beresp.do_esi = true;
        set beresp.do_stream = false;
    }

    sub vcl_deliver {
        if (req.esi_level > 0 && resp.status != 200 &&
            resp.status != 204) {
            set resp.status = 200;
        }
    }
}

client c1 {
    txreq -url "/abort"
    non_fatal
    rxresp
    expect resp.body ~ "^before "
    # Guru Meditation from the include in between
    expect resp.body ~ " after$"
} -run

server s1 -wait
server s1 -start

varnish v1 -vcl+backend {
    sub vcl_backend_response {
        ## WEIRDNESS
        set beresp.do_esi = true;
    }

    sub vcl_deliver {
        if (req.esi_level > 0 && resp.status != 200 &&
            resp.status != 204) {
            set resp.status = 200;
        }
    }
}

client c1 -run

slimhazard commented 5 months ago

The VTC test passes, and the first client -run illustrates the problem.

The server response does not have onerror="continue" (so for this test, it doesn't matter whether the esi_include_onerror is turned on). In all three client -runs, the snippet in vcl_deliver that was expected to prevent failure of the parent response is included in VCL.

As in e00035.vtc, the server response is before <esi:include src="/fail"/> after, and the response for /fail fails because it sets Content-Length > 0 but sends an empty response.

In the first client -run, the response body seen by the client is before -- the ESI failed, and the rest of the body with after is left off. This is the unexpected behavior, because we thought the vcl_deliver snippet should have prevented that.

In the last two client -runs, the client sees before <Guru meditation from the ESI> after in resp.body, as we expected. For the second client -run (the first of the two with the expected result), this is a part of vcl_backend_response:

set beresp.do_esi = true;
set beresp.do_stream = false;

Which made me think that the problem is related to streaming.

Now the weird part. In the first client -run, we have this in vcl_backend_response, exactly as in e00035.vtc:

set beresp.do_esi = beresp.http.surrogate-control ~ "ESI/1.0";
unset beresp.http.surrogate-control;

The server sets the Surrogate-Control header with matching contents, so in effect beresp.do_esi is set conditionally, but turns out to be true. This was the case for which the client response is truncated, which was the unexpected result.

In the third client -run, we have just this in vcl_backend_response:

sub vcl_backend_response {
    set beresp.do_esi = true;
}

That is, beresp.do_esi is set unconditionally to true, and beresp.do_stream is left to the default, which is true.

In that case, we don't get the unexpected result. With the snippet in vcl_deliver that sets resp.status = 200 when the ESI level > 0, the client sees before <Guru> after in resp.body. Hence the comments in the VTC about weirdness.

nigoroll commented 5 months ago

What I think is happening here:

For the first client run, do_esi is true only for the request to /abort and, consequently, do_stream is false also only for the the request to /abort. That is, do_stream is true for /fail. VCL sees the status 200 and includes the streaming response of the request to /fail, but then streaming fails and fails also the top request.

For the other two client runs, do_stream is false in both cases, explicitly in the first and explicitly in the second, because of set beresp.do_esi = true; applying to both requests.

Possible solutions:

So, for now, I think the only workaround is to always set do_stream = false for any potentially ESI included content which is not itself ESI.

From a Varnish-Cache perspective, two options come to my mind:

ignore streaming errors for ESI includes
add some option (parameter + VCL variable?) with the same effect as onerror="continue"

dridi commented 5 months ago

Could this be caused by 582ded6a2d6ae1a4467b1eb500f2725b42888016? Can you please try this commit and its parent? (pun intended)

nigoroll commented 5 months ago

Without having tried it, I think this is originally caused by this change from 26097fb091496f79ef9a38b22e3639735fb39280

                req->doclose = SC_REM_CLOSE;

        req->acct.resp_bodybytes += VDP_Close(req->vdc);
+
+       if (i && ecx->abrt) {
+               req->top->topreq->vdc->retval = -1;
+               req->top->topreq->doclose = req->doclose;
+       }
 }

this is where we originally started aborting the parent when an include's delivery fails.

slimhazard commented 5 months ago

@Dridi cherry-picking that commit and its parent onto 7.4.2 results in conflicts. Instead of fiddling through that, I just ran the VTC in this issue against current master, and it passed.

Meaning that master also exhibits the unexpected result. Does that answer what you were getting at?

dridi commented 5 months ago

So... I must admit I didn't read the whole text initially.

I introduced onerror support guarded by a feature flag in https://github.com/varnishcache/varnish-cache/commit/26097fb091496f79ef9a38b22e3639735fb39280 and the default behavior changed in https://github.com/varnishcache/varnish-cache/commit/582ded6a2d6ae1a4467b1eb500f2725b42888016.

I still have not read everything, only the ticket description. I'd like to mention that ESI processing has to disable streaming, because we need to process the beresp body before we can pass the object to clients. So there are no latency savings on an ESI fetch.

I'll read the third comment (after the VTC) later, but from the look of it we lack VCL control over ESI onerror support. We could copy the feature flag to a req flag exposed in VCL upon entering vcl_deliver, and only use this flag to control the ESI delivery.

slimhazard commented 5 months ago

So, for now, I think the only workaround is to always set do_stream = false for any potentially ESI included content which is not itself ESI.

@nigoroll sounds like it might work, but the problem is that in vcl_backend_response, there's no way to know "this beresp is an ESI leaf" (and the site at which the problem was noticed doesn't have a way to determine that, such as in the path or a header).

Come to think of it, that's what resp.can_esi and obj.can_esi do -- they're true if the OA_ESIDATA attribute exists. But I can't think of a way to get from there to something useful in v_b_r -- there's no way to know about ESIDATA until you've parsed the body.

nigoroll commented 5 months ago

@slimhazard what about something like this

sub vcl_miss {
  if (req.esi_level > 0) {
    set req.http.no-stream = "1";
  } else {
    unset req.http.no-stream;
  }
}

sub vcl_backend_response {
  if (bereq.http.no-stream) {
    set beresp.do_stream = false;
  }
}

slimhazard commented 5 months ago

@nigoroll it looks like that works, this test passes against master:

varnishtest "VCL workaround for #4053"

server s1 {
    rxreq
    expect req.url == "/abort"
    txresp -body {before <esi:include src="/fail"/> after}

    rxreq
    expect req.url == "/fail"
    txresp -hdr "content-length: 100" -nolen
    delay 0.1
} -start

varnish v1 -cliok "param.set feature +esi_disable_xml_check"

varnish v1 -vcl+backend {
    sub vcl_miss {
        if (req.esi_level > 0) {
            set req.http.no-stream = "nope";
        } else {
            unset req.http.no-stream;
        }
    }

    sub vcl_backend_response {
        set beresp.do_esi = true;
        if (bereq.http.no-stream) {
            set beresp.do_stream = false;
        }
    }

    sub vcl_deliver {
        if (req.esi_level > 0 && resp.status != 200 &&
            resp.status != 204) {
            set resp.status = 200;
        }
    }
} -start

client c1 {
    txreq -url "/abort"
    non_fatal
    rxresp
    expect resp.body ~ "^before "
    # Guru Meditation from the include in between
    expect resp.body ~ " after$"
} -run

$ bin/varnishtest/varnishtest -i bin/varnishtest/tests/e00101.vtc
#    top  TEST bin/varnishtest/tests/e00101.vtc passed (1.014)

Notice that there's no ## WEIRD part here about setting beresp.do_esi = true.

I'll try the same test against 7.4.2.

slimhazard commented 5 months ago

Can confirm that the VCL workaround works in 7.4.2:

$ ./bin/varnishd/varnishd -V
varnishd (varnish-7.4.2 revision cd1d10ab53a6f6115b2b4f3b2a1da94c1f749f80)
Copyright (c) 2006 Verdens Gang AS
Copyright (c) 2006-2023 Varnish Software
Copyright 2010-2023 UPLEX - Nils Goroll Systemoptimierung

$ bin/varnishtest/varnishtest -i bin/varnishtest/tests/e00101.vtc
#    top  TEST bin/varnishtest/tests/e00101.vtc passed (1.015)

So we have a solution for now at the site, and presumably this helps to understand what's going on in Varnish.

dridi commented 4 months ago

Now I understand the problem better. You have an ESI top request, but sub-requests may be streamed when they aren't ESI requests themselves. And a streaming-but-failing 200 response will still be caught by esi_include_onerror.

@nigoroll it looks like that works, this test passes against master:

Maybe, but this statement makes no difference in the test case:

set beresp.do_stream = false;

You can remove it or comment it out and the test case will still pass. The whole construct only makes sense if do_esi is conditional.

You could do this instead:

varnishtest "VCL workaround for #4053, take 3"

server s1 {
    rxreq
    expect req.url == "/abort"
    txresp -hdr {surrogate-control: content="ESI/1.0"} \
        -body {before <esi:include src="/fail"/> after}

    rxreq
    expect req.url == "/fail"
    txresp -hdr "content-length: 100"
    delay 0.1
} -start

varnish v1 -cliok "param.set feature +esi_disable_xml_check"

varnish v1 -vcl+backend {
    sub vcl_recv {
        set req.http.esi-level = req.esi_level;
    }

    sub vcl_backend_response {
        set beresp.do_esi = beresp.http.surrogate-control ~ "ESI/1.0";
        set beresp.do_stream = bereq.http.esi-level == "0";
    }

    sub vcl_deliver {
        if (req.esi_level > 0 && resp.status != 200 &&
            resp.status != 204) {
            set resp.status = 200;
        }
    }
} -start

client c1 {
    txreq -url "/abort"
    non_fatal
    rxresp
    expect resp.body ~ "^before "
    # Guru Meditation from the include in between
    expect resp.body ~ " after$"
} -run

dridi commented 4 months ago

I'll read the third comment (after the VTC) later, but from the look of it we lack VCL control over ESI onerror support.

I was wrong, but it doesn't mean that it was a bad idea, see #4054.

slimhazard commented 4 months ago

@Dridi thanks for the PR. I put on a thumbs up, so that will help it get merged, of course.

If the PR is merged: it appears to me that, to also have ESI contents included even if onerror="continue" is missing, then in addition to setting resp.esi_include_onerror, they'll also need the part about setting resp.status to 200. Is that correct?

dridi commented 4 months ago

The tricky part about #4054 is that resp.esi_include_onerror is set for the parent request, whereas the resp.status check is performed on direct sub-requests.

So, you could need both fine-grained control of the flag and a status workaround, or just disable the flag when you know the backend doesn't specify onerror="continue". I'm not so sure actually, I should probably add coverage for that.

One thing I'm also wondering is whether to always initialize this from the parameter or do that only for req_top (technically, resp_top except that it does not exist) and then inherit the default value from req_top in sub-requests.

nigoroll commented 4 months ago

During bugwash, @Dridi and myself had a discussion which was partly confusing to me. This comment is an attempt to clarify. Please correct me if you think that I am wrong:

The case we are looking at goes back to 26097fb091496f79ef9a38b22e3639735fb39280, where we would fail the top level esi request if delivery of an ESI sub-request's object resulted in an error and esi_include_onerror = true and the onerror=continue attribute was not present.

In 582ded6a2d6ae1a4467b1eb500f2725b42888016 this was changed, we would now fail the top level esi request when esi_include_onerror = false or the onerror=continue attribute was not present.

The idea was that the new behavior would be closer to the standard and could still be avoided by overwriting the status, as mentioned in the changes.rst entry.

What we overlooked at the time is that, with streaming, we do not have the final status, it could be 200 as inspected by VCL, but be 503 as seen by the ESI VDP.

While we never stream ESI objects, ESIs may include ordinary objects, for which steaming is supported. This is @slimhazard's initial test case.

So the workaround is to disable streaming for non-esi includes of esis.

4054 proposes a sensible thing, to give vcl control over `esi_include_onerror`, but that does not help with this issue. We would need a flag for "always assume onerror=continue".

(edit: bold)

dridi commented 4 months ago

Pretty good summary, thanks.

So the workaround is to disable streaming for non-esi includes of esis.

The VTC I submitted does just that, hopefully disambiguating the current behavior.

4054 proposes a sensible thing, to give vcl control over esi_include_onerror, but that does not help with this issue.

As it states in the description, it was "inspired by" this ticket and never claimed to solve the problem.

We would need a flag for "always assume onerror=continue".

And I believe that would amount to a partial revert of https://github.com/varnishcache/varnish-cache/commit/582ded6a2d6ae1a4467b1eb500f2725b42888016. I think it's a good idea to consider non 200/204 status codes as errors, but that should have been the extent of the change.

With https://github.com/varnishcache/varnish-cache/commit/26097fb091496f79ef9a38b22e3639735fb39280 we already had this behavior of assuming onerror=continue unless the flag is raised. The notion of a failure should have been extended from return(fail)-or-equivalent-only to return(fail)-or-unfit-status, and nothing more.

nigoroll commented 4 months ago

How about: esi_include_onerror = attribute | abort | continue ?

attribute: continue only if onerror="continue" is present abort: stop ESI processing on any error, irrespective of the attribute continue continue ESI processing as before 26097fb091496f79ef9a38b22e3639735fb39280

(and same for the new vcl variable)

nigoroll commented 4 months ago

As we are at it: The ESI TR specifies:

If an ESI Processor can fetch neither the src nor the alt, it returns a HTTP status code greater than 400 with an error message, unless the onerror attribute is present. If it is, and onerror="continue" is specified, ESI Processors will delete the include element silently.

The first aspect to return a >400 would imply buffering the ESI response until all subrequests are done, which we certainly won't do, so I think we should decide to deliberately not follow the TR.

But regarding the second aspect: This would, IMHO, imply to disable streaming and replace included objects with empty strings. So maybe we should document that, even with something like esi_include_onerror = attribute, one would need special VCL handling to be ESI compliant to ensure that an error include's body is discarded (e.g. by going to synth()).

dridi commented 4 months ago

I wrote a shorter test case:

varnishtest "esi kerfluffle"

server s1 {
        rxreq
        expect req.http.esi-level == 0
        txresp -body {before <esi:include src="/fail"/> after}

        rxreq
        expect req.http.esi-level == 1
        txresp -status 500 -hdr "transfer-encoding: chunked"
        delay 0.1
        chunked 500
} -start

varnish v1 -cliok "param.set feature +esi_disable_xml_check"

varnish v1 -vcl+backend {
        sub vcl_recv {
                set req.http.esi-level = req.esi_level;
        }
        sub vcl_backend_response {
                set beresp.do_esi = bereq.http.esi-level == "0";
        }
        sub vcl_deliver {
                if (req.esi_level > 0 && resp.status != 200) {
                        set resp.status = 200;
                }
        }
} -start

client c1 {
        txreq
        rxresp
        expect resp.body == "before 500 after"
} -run

It fails since https://github.com/varnishcache/varnish-cache/commit/582ded6a2d6ae1a4467b1eb500f2725b42888016 that really changed the behavior of onerrror (or lack thereof) interpretation.

It should have only added "undesirable status codes" to the list of things considered an error and let the onerror handling as-is.

If there is one thing to fix, it's that.

nigoroll commented 4 months ago

FYI:

We (@slimhazard, myself and a Varnish-Cache user) have now established that the VCL code to restore the Varnish-Cache 7.2 behavior is a little more involved than originally assumed.

The required workaround is something along these lines (no news in here, just a summary):

in vcl_miss {} and vcl_pass {}

if (req.esi_level > 0) {
  set req.http.no-stream = "nope";
} else {
  unset req.http.no-stream;
}

in vcl_backend_response {}

if (bereq.http.no-stream) {
  set beresp.do_stream = false;
}

in vcl_synth {} and vcl_deliver {} before every return (deliver):

if (req.esi_level > 0 && resp.status != 200 && resp.status != 204) {
  set resp.status = 200;
}

We are now working on a simplified solution for pESI, which we would like to give up once we have settled on a solution for Varnish-Cache.

In my mind, this could be to

either replace the esi_include_onerror feature flag with a tri-state esi_include_onerror = attribute | abort | continue
or add another flag esi_include_onerror_override
or, as Dridi suggested, a partial revert of a partial revert of https://github.com/varnishcache/varnish-cache/commit/582ded6a2d6ae1a4467b1eb500f2725b42888016, but we would need to discuss how this would look exactly.

In addition, all solutions should come with VCL control.

bsdphk commented 4 months ago

I agree that this needs to always be under VCL control. On the other hand, I'm not keen on this becoming more complicated than it already is :-/

dridi commented 4 months ago

I submitted my suggestion: https://github.com/varnishcache/varnish-cache/pull/4065.

varnishcache / varnish-cache

Parent ESI fails even when VCL sets resp.status = 200 for the included ESI #4053

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

Context

Varnish Cache version

Operating system

Source of binary packages used (if any)

4054 proposes a sensible thing, to give vcl control over `esi_include_onerror`, but that does not help with this issue. We would need a flag for "always assume onerror=continue".

4054 proposes a sensible thing, to give vcl control over `esi_include_onerror`, but that does not help with this issue.

varnishcache / varnish-cache

Parent ESI fails even when VCL sets resp.status = 200 for the included ESI #4053

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

Context

Varnish Cache version

Operating system

Source of binary packages used (if any)

4054 proposes a sensible thing, to give vcl control over esi_include_onerror, but that does not help with this issue. We would need a flag for "always assume onerror=continue".

4054 proposes a sensible thing, to give vcl control over esi_include_onerror, but that does not help with this issue.

4054 proposes a sensible thing, to give vcl control over `esi_include_onerror`, but that does not help with this issue. We would need a flag for "always assume onerror=continue".

4054 proposes a sensible thing, to give vcl control over `esi_include_onerror`, but that does not help with this issue.