Closed gquintard closed 4 months ago
Should we also restrict std.rollback
to subroutines where return (re{start,try})
are permitted ?
Also, note that CI is failing due to the space before the closing **
thanks @walid-git , fixed
I'd like to understand this better. IIRC, we want to recommend a retry/restart soon after the rollback, but I would think that changes should still be possible.
Note from bugwash: please $Restrict
std.rollback
to vcl subroutines where return (restart/retry)
are valid
I agree we should recommend a restart/retry shortly after the rollback, but not require it and I would consider this issue an actual bug in Varnish-Cache.
What happens here is that VCL assigns VRT_priv_{task,top}()
to pointers in the preamble of each VGC function for a sub, so when rollback happens within the sub, these pointers become stale:
vcl:
sub t {
var.set("foo", "bar");
std.rollback(req);
var.set("foo", "bar");
}
vgc:
VGC_function_t(VRT_CTX, enum vcl_func_call_e call,
enum vcl_func_fail_e *failp)
{
assert(call == VSUB_STATIC);
assert(failp == NULL);
assert(ctx->method & (VCL_MET_DELIVER|VCL_MET_HASH|VCL_MET_HIT|VCL_MET_MISS|VCL_MET_PASS|VCL_MET_PURGE|VCL_MET_RECV|VCL_MET_SYNTH));
struct vmod_priv *ARG_priv_task_var = VRT_priv_task(ctx, &VGC_vmod_var);
if (ARG_priv_task_var == NULL) {
VRT_fail(ctx, "failed to get task priv for vmod var");
return;
}
/* ... from ('/tmp/t.vcl' Line 5 Pos 7) */
{
{
VPI_count(ctx, VGC_NREFS, 1);
if (UNLIKELY(ctx->vpi->trace)) VPI_trace(ctx, 1);
END_;
Vmod_vmod_var_Func.f_set(ctx,
ARG_priv_task_var,
"foo",
"bar"
);
END_;
Vmod_vmod_std_Func.f_rollback(ctx,
VRT_r_req(ctx)
);
END_;
Vmod_vmod_var_Func.f_set(ctx,
ARG_priv_task_var,
"foo",
"bar"
);
END_;
}
}
}
For the first Vmod_vmod_var_Func.f_set()
call, ARG_priv_task_var
is valid, but for the second, it is not, because after the call to Vmod_vmod_std_Func.f_rollback()
, the workspace on which the priv_task state resides, is reset.
Note that this is only an issue for VMODs using the PRIV_TASK
and PRIV_TOP
arguments in the vcc file. VMODs using VRT_priv_{task,top}()
explicitly get a fresh value for each call, and thus do not have this issue.
I see the following options to avoid this:
VRT_priv_{task,top}()
call from the preamble to before each function callstd.rollback
to invalidate
- make the preamble an explicit data structure (probably a bound array), which gets passed to
std.rollback
to invalidate
Probably the other way around. A data structure that std.rollback()
would find through the VPI.
addition: we could also retire the PRIV_TASK
and PRIV_TOP
vmod function/method arguments and simply require vmods to call the respective VRT functions.
Re @Dridi
Probably the other way around. A data structure that std.rollback() would find through the VPI.
Yes, and I would like to add that this data structure does already exist and that std.rollback()
already finds it, so this new data structure would basically be a "per sub" view on it. Not sure if it's worth it...
updated, thanks @walid-git for the rst
fix
Following up on https://github.com/varnishcache/varnish-cache/pull/4060#issuecomment-1964640980
I agree we should recommend a restart/retry shortly after the rollback, but not require it and I would consider this issue an actual bug in Varnish-Cache. ...
I came up with an idea for a variation of the first idea, which now appears to me as most reasonable. So in total, I see these options now:
VRT_priv_{task,top}()
call with error checking from the preamble to before each function callstd.rollback
to invalidate the "per subroutine pointers" somehowmaybe, just maybe, it's ok now
We have merged #4066, so the doc change is no longer adequate.
Also, shouldn't the $Restrict
ion be dropped now?
just a docfix to save users from hitting vcl-triggered panics, like here: https://github.com/varnish/varnish-modules/issues/222