With #4048 applied, the test case from https://github.com/nigoroll/libvmod-dynamic/issues/110 can easily trigger this assertion failure when a lookup thread takes longer than cli_timeout to finish (I first saw this by accident while träwelling on an ICE train):
*** v1 debug|Error: Child (501422) not responding to CLI, killed it.
*** v1 debug|Assert error in mgt_vcl_discard(), mgt/mgt_vcl.c line 701:
*** v1 debug| Condition((vp->warm) == 0) not true.
This happens when mgt_vcl_askchild() times out because the worker process needs longer than cli_timeout to process the cold event.
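To make the failure mode concrete, here is a toy model of the sequence, not the actual Varnish code. All `toy_*` names are hypothetical, and the model assumes the manager only clears its `warm` flag when the CLI round trip to the child succeeds. Where the real code asserts, the toy returns a boolean so the outcome can be inspected without aborting.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the manager's per-VCL bookkeeping. */
struct toy_vcl {
	bool warm;	/* manager's view of the VCL temperature */
};

/*
 * Pretend CLI round trip: fails (returns -1) when the child needs
 * longer than cli_timeout to answer, succeeds (returns 0) otherwise.
 */
static int
toy_askchild(double child_seconds, double cli_timeout)
{
	return (child_seconds > cli_timeout ? -1 : 0);
}

/*
 * Ask the child to cool the VCL. Assumption: the manager updates its
 * own state only if the child answered in time.
 */
static void
toy_set_cold(struct toy_vcl *vp, double child_seconds, double cli_timeout)
{
	if (toy_askchild(child_seconds, cli_timeout) == 0)
		vp->warm = false;
}

/*
 * Models the discard precondition: returns true if the VCL is cold
 * after the attempt, false where the real code would hit the
 * (vp->warm) == 0 assertion.
 */
static bool
toy_discard_ok(struct toy_vcl *vp, double child_seconds, double cli_timeout)
{
	toy_set_cold(vp, child_seconds, cli_timeout);
	return (!vp->warm);
}
```

Calling `toy_discard_ok()` on a warm VCL with `child_seconds` larger than `cli_timeout` returns false, which corresponds to the assertion failure in the log above. With a faster child it returns true.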
bugwash: just dial up cli_timeout to some infinity-ish value. good point though: phk: slink, we may want to use a smaller timeout still for the MGT's ping/pong's...
This hack avoids the issue:
How should we fix this issue? Some ideas:
mgt_vcl_discard()