Closed obliadp closed 6 years ago
Is this reproducible?
There are two places in HSH_Lookup where we call VRY_Match; this is the one where we look into an existing object:
0x000000000042edab <+843>: callq 0x437bd0 <ObjHasAttr>
0x000000000042edb0 <+848>: test %eax,%eax
0x000000000042edb2 <+850>: je 0x42edd9 <HSH_Lookup+889>
0x000000000042edb4 <+852>: xor %ecx,%ecx
0x000000000042edb6 <+854>: mov $0x5,%edx
0x000000000042edbb <+859>: mov %r15,%rsi
0x000000000042edbe <+862>: mov %r12,%rdi
0x000000000042edc1 <+865>: callq 0x437b20 <ObjGetAttr>
0x000000000042edc6 <+870>: mov %rbx,%rdi
0x000000000042edc9 <+873>: mov %rax,%rsi
0x000000000042edcc <+876>: callq 0x4451b0 <VRY_Match>
0x000000000042edd1 <+881>: test %eax,%eax
->
if (ObjHasAttr(wrk, oc, OA_VARY)) {
vary = ObjGetAttr(wrk, oc, OA_VARY, NULL);
if (!VRY_Match(req, vary))
continue;
}
So the interesting question is why we have a NULL OA_VARY attribute.
The place where we add it is vbf_beresp2obj, but there we've got asserts that vary != NULL IFF varyl > 0. So while ObjSetAttr does allow setting attributes with NULL values, I don't see (yet) why this is happening.
I'd say the issue is genuine. #2130 was basically the same and was closed without a real change (apart from the added assert, which is the one we're hitting here).
I made no progress staring at the code. @obliadp any chance to get hold of a core dump?
No core dump available atm I'm afraid, I'll try to produce one. Meanwhile I've collected a panic, relevant varnishlog and varnishgather on filebin. Left the url with Espen as I couldn't find @nigoroll on irc.
We've reverted to 4.1 for production, but might this still be interesting to look into?
@nigoroll is slink on irc. The url has been delivered.
@obliadp yes I definitely want to see this one resolved and I also received the tarball via espen. But unfortunately that does not give me any more clues regarding the root cause. A core dump would still be the most helpful, but I understand that you need to provide a stable service.
I'm back at this and think the problem here is that backend_error frees and recreates a storage object, combined with the fact that freeobj clears the object but not oa_present.
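To illustrate the hypothesis above, here is a minimal sketch of how a stale presence bitmask could produce a "has attribute" answer together with a NULL attribute pointer. This is a toy model, not Varnish code: every toy_* name, the struct layouts, and the "fixed" variant are assumptions made up for illustration, and the real ObjHasAttr / ObjGetAttr / freeobj implementations differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; these are NOT the Varnish structures. */
enum toy_attr { TOY_OA_VARY = 0 };

struct toy_storage {
	const char *vary;		/* attribute payload, lives in storage */
};

struct toy_objcore {
	unsigned oa_present;		/* bitmask of attributes that were set */
	struct toy_storage *stobj;	/* backing storage, may be recreated */
};

/* Answers from the bitmask only, without looking at the storage. */
static int
toy_has_attr(const struct toy_objcore *oc, enum toy_attr a)
{
	return ((oc->oa_present & (1U << a)) != 0);
}

/* Answers from the storage only, without looking at the bitmask. */
static const char *
toy_get_attr(const struct toy_objcore *oc, enum toy_attr a)
{
	assert(a == TOY_OA_VARY);
	return (oc->stobj->vary);
}

static void
toy_set_vary(struct toy_objcore *oc, const char *vary)
{
	oc->stobj->vary = vary;
	oc->oa_present |= 1U << TOY_OA_VARY;
}

/* Models the suspected bug: the storage is wiped, the bitmask is kept. */
static void
toy_free_storage_buggy(struct toy_objcore *oc)
{
	memset(oc->stobj, 0, sizeof *oc->stobj);
}

/* One possible remedy in this toy model: reset the bitmask as well. */
static void
toy_free_storage_fixed(struct toy_objcore *oc)
{
	memset(oc->stobj, 0, sizeof *oc->stobj);
	oc->oa_present = 0;
}
```

In this model, calling toy_set_vary() followed by toy_free_storage_buggy() leaves toy_has_attr() returning true while toy_get_attr() returns NULL, which is the combination that the lookup snippet earlier in the thread would hand to VRY_Match. With toy_free_storage_fixed() the presence check fails first and the NULL pointer is never fetched.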
Expected Behavior
Current Behavior
Occasionally my varnishes empty themselves and this assert is logged:
This might be related to the old https://varnish-cache.org/trac/ticket/1304 , but I've tried increasing all workspaces (except thread) to rather alarming sizes:
Your Environment