Closed: jonhattan closed this issue 4 years ago
This is the relevant part of our systemd unit:
ExecStart=/usr/sbin/varnishd -j unix,user=varnish,ccgroup=varnish \
    -P /var/run/varnish.pid \
    -t 120 \
    -f /etc/varnish/puppet.vcl \
    -a 0.0.0.0:6081 -a 0.0.0.0:6083,PROXY \
    -T 127.0.0.1:6082 \
    -p thread_pool_min=50 \
    -p thread_pool_max=1000 \
    -p thread_pool_timeout=120 \
    -p workspace_backend=128k \
    -S /etc/varnish/secret \
    -s malloc,20G
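For reference (this line is not part of the unit above), the client workspace can be raised the same way as workspace_backend, with another -p argument on the ExecStart line, e.g.:

    -p workspace_client=128k \

Whether the client workspace is actually the one overflowing can be checked against the MAIN.ws_client_overflow counter in varnishstat, or inspected at runtime with varnishadm param.show workspace_client.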
Upon further investigation of the code, the assert that fails is:
773 dbits = WS_Alloc(req->ws, 8);
774 AN(dbits);
And the backtrace says:
ws = 0x7fd4d32f2170 {
OVERFLOWED id = \"Req\",
{s, f, r, e} = {0x7fd4d32f40a8, +57168, (nil), +57168},
We already have workspace_backend=128k. We'll increase its size to test again and report back.
Notice it is workspace_client that is overflowing here, not workspace_backend.
Also, the panic is fixed in 3fbdda3d7335e806dd22ebde49a6fb7ceda6e14a, so this is not relevant for master.
@nigoroll thanks! With workspace_client at 128KB we've not seen panics in three hours of stress load.
Is it planned to backport 3fbdda3 to 6.x?
@Dridi can you answer the backport question?
I'm kind of on the fence about whether this meets the threshold for backporting, which currently covers bug fixes and features. This looks like a workspace sizing issue. Do we have more data on how problematic this exact workspace allocation is?
@rezan @jonhattan can we close this issue?
yes, ok to close. I have marked this as a possible backport candidate, but no final decision has been made.
Thanks both! We've not experienced any problems since we increased workspace_client to 128KB.
It seems the same as #1953. We're experiencing panics under stress load. Running varnish:amd64/stretch 6.0.3-1~stretch (up to date). This is the trace: