varnish / varnish-modules

Collection of Varnish Cache modules (vmods) by Varnish Software
Other
182 stars 86 forks source link

libvmod-var: Panic on var.set after std.rollback #222

Closed delthas closed 6 months ago

delthas commented 7 months ago

Hi, I get a consistent panic with the following minimal (not) working example VTC, on Varnish 7.4.2 + varnish-modules master:

varnishtest "Test that var works across a rollback"

server s1 {
       rxreq
       txresp
} -start

varnish v1 -vcl+backend {
    import std;
    import var from "${vmod_builddir}/.libs/libvmod_var.so";

    sub vcl_backend_fetch {
        var.get("foo");
        std.rollback(bereq);
        var.set("foo", "bar");
    }
} -start

client c1 {
    txreq -url "/"
    rxresp
} -run

(Feel free to use it as you wish as a regression test if needed.)

Panic:

**** v1    CLI RX|Assert error in get_vh(), vmod_var.c line 156:
**** v1    CLI RX|  Condition(((vh))->magic == (0x64F33E2F)) not true.
**** v1    CLI RX|Backtrace:
**** v1    CLI RX|  ip=0x62c6a263c69e sp=0x70fa9c8cb280 <pan_ic+0x34e>
**** v1    CLI RX|  ip=0x62c6a270c6b5 sp=0x70fa9c8cb410 <VAS_Fail+0x55>
**** v1    CLI RX|  ip=0x70fa9c1866ff sp=0x70fa9c8cb460 <get_vh+0xd3>
**** v1    CLI RX|  ip=0x70fa9c186790 sp=0x70fa9c8cb490 <vmod_set_string+0x2b>
**** v1    CLI RX|  ip=0x70fa9c186735 sp=0x70fa9c8cb4d0 <vmod_set+0x30>
**** v1    CLI RX|  ip=0x70fa9c1a3a98 sp=0x70fa9c8cb500 <VGC_function_vcl_backend_fetch+0x1c8>
**** v1    CLI RX|  ip=0x62c6a2675e3b sp=0x70fa9c8cb540 <vcl_call_method+0x7fb>
**** v1    CLI RX|  ip=0x62c6a2676180 sp=0x70fa9c8cb690 <VCL_backend_fetch_method+0x1e0>
**** v1    CLI RX|  ip=0x62c6a261e2dc sp=0x70fa9c8cb6d0 <vbf_stp_startfetch+0x29c>
**** v1    CLI RX|  ip=0x62c6a261d3b8 sp=0x70fa9c8cb730 <vbf_fetch_thread+0x7a8>
**** v1    CLI RX|  ip=0x62c6a267ca55 sp=0x70fa9c8cb820 <Pool_Work_Thread+0x895>
**** v1    CLI RX|  ip=0x62c6a267c043 sp=0x70fa9c8cb8b0 <WRK_Thread+0x3e3>
**** v1    CLI RX|  ip=0x62c6a267bc1b sp=0x70fa9c8cc430 <pool_thread+0xcb>
**** v1    CLI RX|  ip=0x70fa9cea955a sp=0x70fa9c8cc460 <pthread_condattr_setpshared+0x67a>
**** v1    CLI RX|  ip=0x70fa9cf26a3c sp=0x70fa9c8cc500 <__clone+0x20c>
delthas commented 7 months ago

Of note is that I do not get a panic if I remove the var.get call before std.rollback. (I also get a panic if I do var.set; std.rollback; var.set)

delthas commented 7 months ago

I get a different panic with the following (two var calls after the rollback, instead of one before and one after); though I suppose the root cause is likely the same:

varnishtest "Test that var works across a rollback"

server s1 {
       rxreq
       txresp
} -start

varnish v1 -vcl+backend {
    import std;
    import var from "${vmod_builddir}/.libs/libvmod_var.so";

    sub vcl_backend_fetch {
        std.rollback(bereq);
        var.set("foo", "bar");
        var.set("foo", "bar");
    }
} -start

client c1 {
    txreq -url "/"
    rxresp
} -run
**** v1    CLI RX|Wrong turn at cache/cache_main.c:327:
**** v1    CLI RX|Signal 11 (Segmentation fault) received at 0x7e4c006f6f66 si_code 1
**** v1    CLI RX|Backtrace:
**** v1    CLI RX|  ip=0x59caa62a269e sp=0x7e4c2d2912a0 <pan_ic+0x34e>
**** v1    CLI RX|  ip=0x59caa63726b5 sp=0x7e4c2d291430 <VAS_Fail+0x55>
**** v1    CLI RX|  ip=0x59caa629cf30 sp=0x7e4c2d291480 <child_signal_handler+0x320>
**** v1    CLI RX|  ip=0x7e4c2d754770 sp=0x7e4c2d2919c0 <__sigaction+0x50>
**** v1    CLI RX|  ip=0x7e4c2c5da6c8 sp=0x7e4c2cf12460 <get_vh+0x9c>
**** v1    CLI RX|  ip=0x7e4c2c5da790 sp=0x7e4c2cf12490 <vmod_set_string+0x2b>
**** v1    CLI RX|  ip=0x7e4c2c5da735 sp=0x7e4c2cf124d0 <vmod_set+0x30>
**** v1    CLI RX|  ip=0x7e4c2c5f7a9f sp=0x7e4c2cf12500 <VGC_function_vcl_backend_fetch+0x1cf>
**** v1    CLI RX|  ip=0x59caa62dbe3b sp=0x7e4c2cf12540 <vcl_call_method+0x7fb>
**** v1    CLI RX|  ip=0x59caa62dc180 sp=0x7e4c2cf12690 <VCL_backend_fetch_method+0x1e0>
**** v1    CLI RX|  ip=0x59caa62842dc sp=0x7e4c2cf126d0 <vbf_stp_startfetch+0x29c>
**** v1    CLI RX|  ip=0x59caa62833b8 sp=0x7e4c2cf12730 <vbf_fetch_thread+0x7a8>
**** v1    CLI RX|  ip=0x59caa62e2a55 sp=0x7e4c2cf12820 <Pool_Work_Thread+0x895>
**** v1    CLI RX|  ip=0x59caa62e2043 sp=0x7e4c2cf128b0 <WRK_Thread+0x3e3>
**** v1    CLI RX|  ip=0x59caa62e1c1b sp=0x7e4c2cf13430 <pool_thread+0xcb>
**** v1    CLI RX|  ip=0x7e4c2d7a355a sp=0x7e4c2cf13460 <pthread_condattr_setpshared+0x67a>
**** v1    CLI RX|  ip=0x7e4c2d820a3c sp=0x7e4c2cf13500 <__clone+0x20c>
delthas commented 7 months ago

OK, taking a step back: std.rollback also resets all PRIV_TASK objects (in which the variables are stored). In other words, std.rollback deletes request variables by design.

A possible "solution" to this would be to document that libvmod-var does not support std.rollback at all; or to detect/allow rollbacks (but we start over with no variables, because the previous ones were deleted).

gquintard commented 7 months ago

hi, sorry, was a bit busy, I talked to the team and essentially you need to immediately return(retry|restart) after std.rollback(), but it was never documented. So, I will open a ticket on the varnish github so we can fix it, and I'm keeping this open until it's merged.

gquintard commented 7 months ago

https://github.com/varnishcache/varnish-cache/pull/4060

gquintard commented 6 months ago

looks like they fixed the issue upstream, cloging