nigoroll / libvmod-dynamic

The Varnish dns/named director continued
BSD 2-Clause "Simplified" License

deadlock between lookup thread / VRT_AddDirector() and VDI_Event(..., VCL_EVENT_COLD) #110

Closed: nigoroll closed this issue 7 months ago

nigoroll commented 7 months ago

@delthas reported another issue based on a test case which I could have looked at in more detail earlier (I did not because I did not want to use example.com, but I really should have):

(I have slightly modified the test case - @delthas, do you agree to add it?)

varnishtest "Reloading after creating backend in init"

feature cmd "getent hosts example.com"

# contributed by delthas, added with minor modifications

shell {
    cat >${tmpdir}/f1 <<-EOF
    vcl 4.1;
    import ${vmod_dynamic};
    import directors;

    backend none none;

    sub vcl_init {
        new dir = directors.fallback();
        new res = dynamic.resolver();
        new dyn = dynamic.director(domain_usage_timeout = 0.1s, resolver = res.use());
        dir.add_backend(dyn.backend(host={"example.com"}, port={"80"}));
    }
    EOF
}

varnish v1 -cliok "vcl.load vcl1 ${tmpdir}/f1"
varnish v1 -cliok "vcl.use vcl1"
varnish v1 -cliok "start"

varnish v1 -cliok "vcl.load vcl2 ${tmpdir}/f1"
varnish v1 -cliok "vcl.use vcl2"
varnish v1 -cliok "vcl.discard vcl1"

The issue here is that VDI_Event(..., VCL_EVENT_COLD) waits for the lookup threads to finish while holding the vcl_mtx, which prevents the lookup threads from ... finishing. First, the CLI thread, holding vcl_mtx:

#5  0x00007fa9c72b0e76 in dom_event (dir=<optimized out>, ev=<optimized out>) at vmod_dynamic.c:1041
#6  0x000055d6f1c1e84d in VDI_Event (d=0x7fa9c6e1e7d0, ev=VCL_EVENT_COLD) at cache/cache_director.c:277
#7  0x000055d6f1c61bfc in vcl_BackendEvent (vcl=0x7fa9c6e451e0, e=VCL_EVENT_COLD) at cache/cache_vcl.c:449
#8  0x000055d6f1c61068 in vcl_set_state (vcl=0x7fa9c6e451e0, state=0x7fa9c6e28390 "0cold", msg=0x7ffc69699e08) at cache/cache_vcl.c:597
#9  0x000055d6f1c62905 in vcl_cli_state (cli=0x7fa9c6ed29d0, av=0x7fa9c6e2c580, priv=0x0) at cache/cache_vcl.c:861
#10 0x000055d6f1d11f22 in cls_dispatch (cli=0x7fa9c6ed29d0, cs=0x7fa9c6e22b30, av=0x7fa9c6e2c580, ac=3) at vcli_serve.c:273
#11 0x000055d6f1d11a0d in cls_exec (cfd=0x7fa9c6ed29a0, av=0x7fa9c6e2c580, ac=3) at vcli_serve.c:324
#12 0x000055d6f1d108a6 in cls_feed (cfd=0x7fa9c6ed29a0, p=0x7ffc69699f94 "\n2.1706557731.695251/vgc.so 1auto\no\nloneTable", 
    e=0x7ffc69699f95 "2.1706557731.695251/vgc.so 1auto\no\nloneTable") at vcli_serve.c:412
#13 0x000055d6f1d1036a in VCLS_Poll (cs=0x7fa9c6e22b30, cli=0x7fa9c6ed29d0, timeout=-1) at vcli_serve.c:617
#14 0x000055d6f1c18c12 in CLI_Run () at cache/cache_cli.c:110

And the lookup thread, blocked on vcl_mtx in VRT_AddDirector():

#4  0x000055d6f1c40ca4 in Lck__Lock (lck=0x55d6f1db57f8 <vcl_mtx>, p=0x55d6f1d45aec "VRT_AddDirector", l=212) at cache/cache_lck.c:124
#5  0x000055d6f1c7bfd3 in VRT_AddDirector (ctx=0x7fa9bbd7df90, m=0x55d6f1da7260 <vbe_methods_noprobe>, priv=0x7fa9ba821780, 
    fmt=0x55d6f1d42f7f "%s") at cache/cache_vrt_vcl.c:212
#6  0x000055d6f1c09c4d in VRT_new_backend_clustered (ctx=0x7fa9bbd7df90, vc=0x0, vrt=0x7fa9bbd7deb0, via=0x0)
    at cache/cache_backend.c:737
#7  0x000055d6f1c0a632 in VRT_new_backend (ctx=0x7fa9bbd7df90, vrt=0x7fa9bbd7deb0, via=0x0) at cache/cache_backend.c:755
#8  0x00007fa9c72b24b4 in ref_add (r=0x7fa9ba827720, ctx=0x7fa9bbd7df90) at vmod_dynamic.c:617
#9  dom_update (now=1706557732.7551613, priv=<optimized out>, res=0x7fa9c72c2ba0 <res_getdns>, dom=0x7fa9c6e44100)
    at vmod_dynamic.c:728
#10 dom_lookup_thread (priv=0x7fa9c6e44100) at vmod_dynamic.c:819
#11 0x00007fa9c77c9044 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#12 0x00007fa9c784961c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
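
Reduced to its shape, this is a classic join-while-holding-a-lock deadlock. A minimal, self-contained C sketch of that shape (hypothetical names, not the actual varnish-cache or vmod code):

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

/* stands in for the lookup thread ending up in VRT_AddDirector() */
static void *
lookup_thread(void *arg)
{
    (void)arg;
    sleep(1);                    /* let main() take the lock first */
    pthread_mutex_lock(&mtx);    /* blocks on "vcl_mtx" forever */
    pthread_mutex_unlock(&mtx);
    return (NULL);
}

int
main(void)
{
    pthread_t thr;

    pthread_create(&thr, NULL, lookup_thread, NULL);
    pthread_mutex_lock(&mtx);    /* like vcl_BackendEvent() holding vcl_mtx */
    pthread_join(thr, NULL);     /* like dom_event() waiting: deadlock */
    pthread_mutex_unlock(&mtx);  /* never reached */
    return (0);
}
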
nigoroll commented 7 months ago

As it stands, this issue is hard. I could easily add a workaround, but that "workaround" basically already exists in Varnish-Cache: a shortcut is taken in VRT_AddDirector() if the VCL is cooling, except in vtc mode:

https://github.com/varnishcache/varnish-cache/blob/2511f5364489c8345b3879a5eee21179bf71e9b2/bin/varnishd/cache/cache_vrt_vcl.c#L175-L176
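
For reference, the linked shortcut amounts to roughly the following (a paraphrase, not a verbatim quote; see the link above for the actual code):

    /* in VRT_AddDirector(): refuse new directors on a cooling VCL,
     * except when running under varnishtest (vtc mode) */
    if (vcl->temp == VCL_TEMP_COOLING && !DO_DEBUG(DBG_VTC_MODE))
        return (NULL);
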

So for all practical purposes, this issue exists primarily in VTC, but we cannot guarantee that it does not happen in production deployments.

Holding the vcl_mtx goes back to https://github.com/varnishcache/varnish-cache/issues/3094 / https://github.com/varnishcache/varnish-cache/commit/465f2f8c364cb7e5c6ae93bf46f2ba0f7c757d92, and I now wonder whether this is actually a good idea for sending VCL_EVENT_COLD...

nigoroll commented 7 months ago

More pondering: when sending the cold event, we need to ensure that the list of directors operated upon is complete, so holding the vcl_mtx makes sense; otherwise we would need a more elaborate data structure and synchronization to allow additions and deletions while iterating. The temperature can only change in the CLI thread, so it cannot change again while a cold event is being posted. Also, once the temperature goes cold, it cannot change back to anything else for as long as the lookup thread is running. Maybe this can help us...
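
For context, the iteration protected by the lock looks roughly like this (a paraphrase of vcl_BackendEvent() from cache/cache_vcl.c in the backtrace above, not a verbatim quote):

    /* the director list must not change while we walk it,
     * hence the event is posted under vcl_mtx */
    Lck_Lock(&vcl_mtx);
    VTAILQ_FOREACH(vdir, &vcl->director_list, list)
        VDI_Event(vdir->dir, e);
    Lck_Unlock(&vcl_mtx);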

edit: the actual temperature change happens outside vcl_mtx

delthas commented 7 months ago

Hi, I agree to add the test case :smile:

nigoroll commented 7 months ago

Some notes: we cannot join the resolver thread after the COLD transition has completed, because by then (during discard) the resolver context may have become invalid (for the same reason, we cannot detach it). I am pretty much out of ideas, even including changes to varnish-cache: for example, turning vcl_mtx into an rwlock would at first seem like an option (iterators only need to read), but in VRT_AddDirector() we would need to check the temperature under a read lock and, if the VCL is not cooling, upgrade to a write lock. That upgrade can deadlock (see the sketch below), and the moment we unlock instead, vcl_BackendEvent() could run again and we might never get the write lock. I am really tempted to ignore the problem, because in production code (without vtc_mode) the race window is only https://github.com/varnishcache/varnish-cache/blob/2511f5364489c8345b3879a5eee21179bf71e9b2/bin/varnishd/cache/cache_vrt_vcl.c#L178-L211
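
Why the read-then-upgrade idea cannot work, as a minimal, self-contained C sketch (hypothetical code, not varnish-cache): POSIX rwlocks have no atomic upgrade, so the "upgrade" is just a second lock acquisition.

#include <pthread.h>
#include <stdio.h>

int
main(void)
{
    pthread_rwlock_t rwl = PTHREAD_RWLOCK_INITIALIZER;

    /* check the temperature under a read lock ... */
    pthread_rwlock_rdlock(&rwl);

    /*
     * ... then try to "upgrade": the writer waits for all readers,
     * including ourselves, so this deadlocks (or fails with EDEADLK,
     * depending on the implementation). Dropping the read lock first
     * instead opens exactly the window in which vcl_BackendEvent()
     * can run and invalidate the temperature we just checked.
     */
    pthread_rwlock_wrlock(&rwl);
    printf("not reached\n");
    return (0);
}
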

nigoroll commented 7 months ago

At this point I think this can only be fixed in varnish-cache, see https://github.com/varnishcache/varnish-cache/pull/4048

nigoroll commented 7 months ago

Fixed via https://github.com/varnishcache/varnish-cache/pull/4048