varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org

Assert error in ban_lurker_getfirst(), cache/cache_ban_lurker.c line 177 #2681

Closed huguesalary closed 6 years ago

huguesalary commented 6 years ago

A few weeks ago, varnish panicked:

Panic at: Wed, 25 Apr 2018 23:56:17 GMT
Assert error in ban_lurker_getfirst(), cache/cache_ban_lurker.c line 177:
  Condition((oc->flags & OC_F_BUSY) == 0) not true.
version = varnish-6.0.0 revision a068361dff0d25a0d85cf82a6e5fdaf315e06a7d, vrt api = 7.0
ident = Linux,4.4.86+,x86_64,-junix,-smalloc,-sdefault,-hcritbit,epoll
now = 6712648.985517 (mono), 1524700559.199682 (real)
Backtrace:
  0x556ab63a0957: varnishd(+0x4a957) [0x556ab63a0957]
  0x556ab6405730: varnishd(VAS_Fail+0x40) [0x556ab6405730]
  0x556ab63844e9: varnishd(ban_lurker+0xba9) [0x556ab63844e9]
  0x556ab63bf747: varnishd(+0x69747) [0x556ab63bf747]
  0x7f2108393494: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f2108393494]
  0x7f21080d5acf: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f21080d5acf]
thread = (ban-lurker)
thr.req = (nil) {
},
thr.busyobj = (nil) {
},

I believe this panic happens only rarely.

Some more info:

uname -a
Linux production-varnish-6dc94c6584-9ns27 4.4.86+ #1 SMP Thu Dec 7 20:11:11 PST 2017 x86_64 GNU/Linux

cat /etc/debian_version
9.2

Not sure it matters, but Varnish is running inside a Docker container.

nigoroll commented 6 years ago

This ban-lurker-induced panic is missing the vmod info. Could you please add the list of vmods you are using?

huguesalary commented 6 years ago

Of course, here's the top of my vcl file:

vcl 4.1;

import std;
import directors;

No other vmods are being used.

nigoroll commented 6 years ago

ok, that's harmless. Any chance you could use a build from current source or a weekly package?

nigoroll commented 6 years ago

never mind. Actually this looks pretty simple. Interesting to see this reported only now....

huguesalary commented 6 years ago

Glad I could help ;)

nigoroll commented 6 years ago

FTR, I'd say this was introduced with the fix for #1449 (99851a10c7314dd6a4eb6a0de60161732b1b7b55), when we moved the ban insert before clearing the busy flag.
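
A minimal sketch of that ordering problem, not the actual Varnish code: every name here (sk_obj, sk_ban, SK_F_BUSY, sk_ban_insert, sk_head_is_busy, sk_lurker_getfirst) is a hypothetical stand-in. It assumes that once an object is put on a ban's list before its busy flag is cleared, a lurker walking that list can meet a busy object. A strict "never busy" assertion would then fail. Skipping busy objects is shown as one possible way to tolerate the situation, and it is not necessarily what the committed fix does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real Varnish definitions. */
#define SK_F_BUSY (1u << 0)

struct sk_obj {
	unsigned flags;
	struct sk_obj *next;	/* next object on the same ban's list */
};

struct sk_ban {
	struct sk_obj *head;	/* objects that still reference this ban */
};

/*
 * Put an object at the head of a ban's list. With the ordering described
 * above, this may run while SK_F_BUSY is still set on the object.
 */
static void
sk_ban_insert(struct sk_ban *b, struct sk_obj *oc)
{
	oc->next = b->head;
	b->head = oc;
}

/*
 * Returns 1 if the first object on the list is still busy, i.e. the case
 * in which an assertion of the form (flags & BUSY) == 0 would fail.
 */
static int
sk_head_is_busy(const struct sk_ban *b)
{
	return (b->head != NULL && (b->head->flags & SK_F_BUSY) != 0);
}

/*
 * Tolerant lookup (an assumption for illustration): return the first
 * object that is not busy, or NULL if there is none.
 */
static struct sk_obj *
sk_lurker_getfirst(const struct sk_ban *b)
{
	struct sk_obj *oc;

	for (oc = b->head; oc != NULL; oc = oc->next)
		if ((oc->flags & SK_F_BUSY) == 0)
			return (oc);
	return (NULL);
}
```

To try it, insert one finished object and then one busy object on the same sk_ban from a small main(): sk_head_is_busy() reports the busy head, while sk_lurker_getfirst() steps past it and returns the finished object.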

hermunn commented 6 years ago

Backported to 4.1 as 7906a41782a39.

hermunn commented 6 years ago

@nigoroll Should the test also include stolen objects? In other words, should we also have || oc->flags & OC_F_STOLEN in there?

nigoroll commented 6 years ago

@hermunn, there is no OC_F_STOLEN?

hermunn commented 6 years ago

Sorry, please ignore me from now on. :)