delthas closed this issue 7 months ago
I can also reproduce this against Varnish master + libvmod-dynamic master.
I investigated further and found the cause: r->dir is NULL because it is not set in the first place.
The precondition is that we are in a cooling state when processing the lookup results (for example, we just loaded and immediately discarded a VCL with a director initialized in its vcl_init).
Then:
Later, when actually discarding the VCL, we go through the refs and expect every ref->dir to be non-NULL; that assertion fails and we panic. But the root issue is actually earlier, when the ref is added.
I would expect that when VRT_new_backend returns NULL, we instead skip adding the ref, or do something similar.
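To make the suggestion concrete, here is a minimal, self-contained sketch of the "skip the ref" idea. It is a toy model, not libvmod-dynamic code: every name in it (toy_domain, toy_ref, toy_new_backend, toy_add_ref, toy_discard) is hypothetical, and toy_new_backend only imitates the behavior described above, where backend creation returns NULL while the VCL is not warm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins only; none of these are Varnish or libvmod-dynamic types. */
enum toy_temp { TOY_WARM, TOY_COOLING };

struct toy_dir { int id; };

struct toy_ref {
	struct toy_dir *dir;
	struct toy_ref *next;
};

struct toy_domain {
	enum toy_temp temp;
	struct toy_ref *refs;
};

/*
 * Hypothetical stand-in for backend creation. Assumption: like the
 * VRT_new_backend behavior described in this report, it returns NULL
 * when the VCL is not warm.
 */
static struct toy_dir *
toy_new_backend(const struct toy_domain *dom, int id)
{
	struct toy_dir *d;

	if (dom->temp != TOY_WARM)
		return (NULL);
	d = malloc(sizeof *d);
	if (d != NULL)
		d->id = id;
	return (d);
}

/* Add a ref only if a backend was created. Returns 1 if added, 0 if skipped. */
static int
toy_add_ref(struct toy_domain *dom, int id)
{
	struct toy_dir *d = toy_new_backend(dom, id);
	struct toy_ref *r;

	if (d == NULL)
		return (0);	/* skip: never store a ref with dir == NULL */
	r = malloc(sizeof *r);
	if (r == NULL) {
		free(d);
		return (0);
	}
	r->dir = d;
	r->next = dom->refs;
	dom->refs = r;
	return (1);
}

/* Release all refs; the invariant checked at discard time now always holds. */
static int
toy_discard(struct toy_domain *dom)
{
	int n = 0;

	while (dom->refs != NULL) {
		struct toy_ref *r = dom->refs;

		assert(r->dir != NULL);
		dom->refs = r->next;
		free(r->dir);
		free(r);
		n++;
	}
	return (n);
}
```

In this model, a call to toy_add_ref() made while the domain is TOY_COOLING returns 0 and leaves the list untouched, so toy_discard() never encounters a ref without a director. Whether skipping is the right fix for the real code is a separate question.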
It looks very similar to https://github.com/nigoroll/libvmod-dynamic/issues/108 (we try to create a director while not warm); but here we are adding while COOLING, whereas in the other ticket I think we are adding while COOL.
Thank you for your work on both tickets. I have seen them and I think the way forward is to get a reference on the vcl while the resolver thread is running. I just need to prioritize other work at the moment.
I hope to have fixed this issue and would appreciate feedback.
Note: I believe this can only work with https://github.com/varnishcache/varnish-cache/pull/4037 / https://github.com/varnishcache/varnish-cache/commit/c44bd67f25216746311b3f65ba39552a8971934b in place because we rely on the vcl reference from VRT_VCL_Prevent_Discard() to ensure that we are in VCL_COOLING and not VCL_COLD. Before the aforementioned commit, this order was wrong.
I hit an assert with a simple minimal working example.
Running against Varnish 7.4.2, against libvmod-dynamic master.
This happens when discarding a VCL.
To reproduce, here's a failing VTC (it fails with a timeout, but the process actually crashes):
You can't see the panic when using the VTC, so I'm instead doing:
crash.vcl:
reload.sh:
And with this I can systematically reproduce the panic and get the trace.