Rebased against master and force-pushed to solve a merge conflict introduced by 083505193a5065fab0359568816b5e733b69edc3.
Rebased against master and force-pushed to solve a merge conflict introduced by 7aeca4d63.
I looked at this again, and overall, I would like to ask for the following:
For one, I would prefer to split refactoring and semantic changes: I am not convinced by the `HSH_Lookup()` change as proposed, and IIUC it has nothing to do with the rest of the PR.
IIUC, the other changes can be summarized as:
- Retire `hash_objhead` and just use `req->objcore` for the waitinglist
- Optimize HSH_Lookup with `hsh_rush_match()`
- Rush earlier when we transition to a private object
I do not see how we need to refactor so much to achieve these goals, and I would prefer if we first addressed these based on the existing interfaces (with some adjustments, maybe).
In particular, the `OC_F_BUSY` change looks dubious to me.
But most importantly, there is something about an argument from 0a87e86753c8e725dff545bd91f7c0c7b7bad76f which I do not understand:
Knowing that a rushed object is guaranteed to lead to a cache hit allows the rush policy to be applied wholesale
I really do not understand how this patch series achieves this, I don't even understand how it can be achieved at all for the general (Vary) case. Yes, this is true if the busy object has no `Vary` (we do not know if the busy object has a `Vary` before it gets unbusied), but in general, do I not understand or is it just not true?
IIUC, the other changes can be summarized as:
- Retire `hash_objhead` and just use `req->objcore` for the waitinglist
- Optimize HSH_Lookup with `hsh_rush_match()`
- Rush earlier when we transition to a private object
That's a summary capturing the essential changes, but slightly off.
The `req->hash_objhead` retirement is not so much about "just using" `req->objcore` (`req->waitinglist`'s scope is also extended) but rather about driving the rush policy from an "unbusied" objcore.
The `hsh_rush_match()` function is not here as an optimization, but as part of the removal of waiting list serialization (modulo incompatible variants).
And the new rush "latches" are not so much here to rush earlier but really to nail down consistent semantics for `OC_F_BUSY` (and in that regard other rush paths go away for the same reason).
Knowing that a rushed object is guaranteed to lead to a cache hit allows the rush policy to be applied wholesale
I really do not understand how this patch series achieves this, I don't even understand how it can be achieved at all for the general (Vary) case.
It does so as far as compatible variants go, so a `Vary: User-Agent` header could still trigger severe serialization when coupled with zero TTL and grace. I never claimed otherwise, and this is what my first presentation on the topic was about.
Please check slide 12: https://github.com/varnishcache/varnish-cache/wiki/VDD23Q1#compliance
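To make the "modulo compatible variants" caveat concrete, here is a small standalone sketch (not Varnish code; the struct layout and helper names are invented for illustration). A rushed waiter can only be turned into a hit when the unbusied object either has no Vary or was stored under the same varying header value; an incompatible variant still has to go to the backend, which is where serialization can reappear for something like `Vary: User-Agent` with zero TTL and grace.

```c
/* Standalone illustration, NOT Varnish code: the types and the helper
 * below are invented for this example only. */
#include <stdio.h>
#include <string.h>

struct toy_obj {
	const char *vary_hdr;	/* e.g. "User-Agent", NULL if no Vary */
	const char *vary_val;	/* value the variant was stored under */
};

struct toy_req {
	const char *hdr_name;
	const char *hdr_val;
};

/* A rushed waiter is a guaranteed hit only if the unbusied object either
 * has no Vary, or was stored under the same varying header value. */
static int
toy_rush_match(const struct toy_obj *obj, const struct toy_req *req)
{
	if (obj->vary_hdr == NULL)
		return (1);		/* no Vary: always compatible */
	if (strcmp(obj->vary_hdr, req->hdr_name) != 0)
		return (0);		/* request lacks the varying header */
	return (strcmp(obj->vary_val, req->hdr_val) == 0);
}

int
main(void)
{
	struct toy_obj obj = { "User-Agent", "curl/8.5.0" };
	struct toy_req same = { "User-Agent", "curl/8.5.0" };
	struct toy_req other = { "User-Agent", "Mozilla/5.0" };

	/* Compatible variant: the waiter can be served directly. */
	printf("same UA  -> %s\n", toy_rush_match(&obj, &same) ? "hit" : "fetch");
	/* Incompatible variant: this waiter still triggers a fetch,
	 * which is where serialization can come back. */
	printf("other UA -> %s\n", toy_rush_match(&obj, &other) ? "hit" : "fetch");
	return (0);
}
```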
but in general, do I not understand or is it just not true?
I think you missed it from 0a87e86753c8e725dff545bd91f7c0c7b7bad76f's commit message:
This shifts the infamous waiting list serialization phenomenon to the vary header match.
Rewinding to an earlier part of your comment:
I do not see how we need to refactor so much to achieve these goals, and I would prefer if we first addressed these based on the existing interfaces (with some adjustments, maybe).
In particular, the `OC_F_BUSY` change looks dubious to me.
Some of the refactoring (like the lookup split in two steps) comes from me feeling uneasy about touching this part, and in particular the locking logic. Separating the lookup search from the lookup outcome really helped me (and you seemed to appreciate one of the effects on `is_hitmiss` handling, but I can no longer find your comment). The rest of the refactoring is me preparing the landscape to land the change.
I can try a patch series without the non-essential refactoring. You found a few things that require the change to be somehow amended anyway (like the overlooked VCF).
Now, regarding the `OC_F_BUSY` semantics I understood, resulting in changes in this patch series, they are the following:
- `HSH_Insert()`, and trigger a rush
- `vcl_synth`
In the last case, the moment we know we won't trigger a fetch task, there is no reason (besides hurting latency to sell more licenses) to keep the `OC_F_BUSY` flag. Likewise, when a fetch task fails, the object should be "unbusied" immediately (we just got the cacheability outcome) instead of waiting for the last objcore ref to be dropped.
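For illustration only, here is a minimal pthread sketch of that intent as described above, not the patch's actual code (all names and flags are invented): as soon as the outcome is known, the busy flag is cleared and the waiters are woken up, instead of keeping them parked until the last reference goes away.

```c
/* Standalone pthread sketch, not Varnish code: names and flags invented. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define TOY_F_BUSY	(1U << 0)

struct toy_objcore {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;
	unsigned	flags;
};

/* Clear the busy flag and rush all waiters as soon as the outcome is
 * known, rather than when the last reference goes away. */
static void
toy_unbusy_and_rush(struct toy_objcore *oc)
{
	pthread_mutex_lock(&oc->mtx);
	oc->flags &= ~TOY_F_BUSY;
	pthread_cond_broadcast(&oc->cond);
	pthread_mutex_unlock(&oc->mtx);
}

static void *
toy_waiter(void *priv)
{
	struct toy_objcore *oc = priv;

	pthread_mutex_lock(&oc->mtx);
	while (oc->flags & TOY_F_BUSY)
		pthread_cond_wait(&oc->cond, &oc->mtx);
	pthread_mutex_unlock(&oc->mtx);
	printf("waiter rushed, object no longer busy\n");
	return (NULL);
}

int
main(void)
{
	struct toy_objcore oc = {
		PTHREAD_MUTEX_INITIALIZER,
		PTHREAD_COND_INITIALIZER,
		TOY_F_BUSY
	};
	pthread_t thr;

	pthread_create(&thr, NULL, toy_waiter, &oc);
	usleep(10000);			/* pretend vcl_synth or a failed fetch */
	toy_unbusy_and_rush(&oc);	/* outcome known: unbusy right away */
	pthread_join(thr, NULL);
	return (0);
}
```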
I hope this answers all of your questions, thank you for your patience!
FTR, need to devour this:
(20:07:05) slink: done reading once, I will need to get back to it again. There is still something about your "guaranteed to lead to a cache hit" argument which I do not get, but I am having trouble phrasing it. Maybe this regarding "It does so as far as compatible variants go": How do we know variants are compatible before doing a vary compare? How can we do better than wait for a response and then wake up waiters? Sure, we could check the waiters vary before waking them up, but that moves the burden of the vary check to a single thread vs. multiple, and if your patch does this, I must have overlooked it.
(20:13:20) dridi: also
(20:13:43) dridi: the deferred rush match happens outside of the objhead lock
(20:14:03) dridi: so i think it should remain the way i did it
(20:14:57) dridi: i suppose "guaranteed cache hit" should always be followed by "modulus compatible variant"
(20:15:44) dridi: i understand your confusion
(20:16:04) dridi: i still need to amend the rush match to bail out in the presence of a cat flap
(20:16:41) dridi: to answer your question about doing it better
(20:16:56) dridi: we could have some kind of predictive vary in the form of a hash
(20:17:24) dridi: some precomputed vary outcome
(20:19:43) dridi: still, i wouldn't check it in hsh_rush1() under the oh lock
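The "predictive vary" idea is only floated in the chat above, not implemented in this PR. As a strawman, it could look like hashing the request's values of the headers named by a known Vary, so that candidates could be compared by digest before (or instead of) a full header walk. Everything below is invented for illustration; the full vary compare would remain the authoritative check.

```c
/* Strawman only: a precomputed "vary outcome" as a digest of the request
 * values of the headers listed in Vary. Not Varnish code. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint64_t
fnv1a(uint64_t h, const char *s)
{
	for (; *s != '\0'; s++) {
		h ^= (unsigned char)*s;
		h *= 1099511628211ULL;	/* FNV-1a 64-bit prime */
	}
	return (h);
}

/* vary: NULL-terminated list of header names the cached object varies on.
 * get_hdr: how the caller looks up a request header (may return NULL). */
static uint64_t
toy_vary_digest(const char *const *vary, const char *(*get_hdr)(const char *))
{
	uint64_t h = 14695981039346656037ULL;	/* FNV offset basis */
	const char *val;

	for (; *vary != NULL; vary++) {
		h = fnv1a(h, *vary);
		val = get_hdr(*vary);
		h = fnv1a(h, val != NULL ? val : "");
	}
	return (h);
}

static const char *
toy_get_hdr(const char *name)
{
	if (strcmp(name, "User-Agent") == 0)
		return ("curl/8.5.0");
	return (NULL);
}

int
main(void)
{
	const char *vary[] = { "User-Agent", NULL };

	/* Requests with equal digests would be cheap candidates for a rush;
	 * mismatching digests mean an incompatible variant for sure. */
	printf("digest: %016llx\n",
	    (unsigned long long)toy_vary_digest(vary, toy_get_hdr));
	return (0);
}
```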
Yesterday I rewrote the patch series without the offending refactoring, and taking changes in a slightly different order helped better lay out the moving parts.
I tried to avoid confusing wording in the commit log, and tried to be more explicit about semantics, so I hope this will be overall easier to review.
FTR, I understand you are waiting for my feedback, but this PR needs a longer attention span and I need to tend to other work for the remainder of this week. Sorry.
My goal was to answer everything before you'd take another dive, so in that sense I succeeded ;)
I added 6 SQUASHME commits to address @mbgrydeland's review and one more to add an assertion.
I still did not find the hours which this PR requires for another review round, but I just tried it with one of my test setups and ran into this null pointer deref straight away:
#6 0x000055f99a94b07b in child_signal_handler (s=11, si=0x7f048982eaf0, c=0x7f048982e9c0) at cache/cache_main.c:326
buf = "Signal 11 (Segmentation fault) received at 0x58 si_code 1", '\000' <repeats 966 times>
sa = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
req = 0x7f0476664a60
a = 0x58 <error: Cannot access memory at address 0x58>
p = 0x7f0480d903f8 "8\003ـ\004\177"
info = 0x0
#7 <signal handler called>
No locals.
#8 0x000055f99a93dab2 in HSH_Lookup (req=0x7f0476664a60, ocp=0x7f0480d8f670, bocp=0x7f0480d8f668) at cache/cache_hash.c:435
wrk = 0x7f0480d90338
oh = 0x0
oc = 0x7f0468a03880
exp_oc = 0x7f0480d8f620
vr = 0x7f0476664bb8
exp_t_origin = 0
busy_found = 1
vary = 0x7f0480d8f640 "\220\366\004\177"
boc_progress = -1
xid = 0
dttl = 0
#9 0x000055f99a95d470 in cnt_lookup (wrk=0x7f0480d90338, req=0x7f0476664a60) at cache/cache_req_fsm.c:578
oc = 0x7f0468a03880
busy = 0x0
lr = HSH_BUSY
had_objcore = 1
#10 0x000055f99a959d07 in CNT_Request (req=0x7f0476664a60) at cache/cache_req_fsm.c:1192
ctx = {{magic = 2161702624, syntax = 32516, vclver = 2594048900, method = 22009, vpi = 0x7f0480d903b8, msg = 0x7f0476201010, vsl = 0x5800000000, vcl = 0x7f0476200f60,
ws = 0x7f0480d8f710, sp = 0x55f99a95f747 <ses_get_attr+343>, req = 0x7f0480d8f740, http_req = 0x800000094, http_req_top = 0x7f0476200f20, http_resp = 0x128, bo = 0x7f0480d8f730,
http_bereq = 0x55f99a960327 <SES_Get_proto_priv+39>, http_beresp = 0x7f0480d8f740, now = 6.8999794284505472e-310, specific = 0x7f0480d8f750,
called = 0x55f99a99d16e <http1_getstate+30>}}
wrk = 0x7f0480d90338
nxt = REQ_FSM_MORE
#11 0x000055f99a99ce0a in HTTP1_Session (wrk=0x7f0480d90338, req=0x7f0476664a60) at http1/cache_http1_fsm.c:392
hs = 22009
sp = 0x7f0476200f20
st = 0x55f99aa51e5e <H1PROC> "HTTP1::Proc"
i = 32516
#12 0x000055f99a99c350 in http1_req (wrk=0x7f0480d90338, arg=0x7f0476664a60) at http1/cache_http1_fsm.c:87
req = 0x7f0476664a60
#13 0x000055f99a98d382 in Pool_Work_Thread (pp=0x7f0482200000, wrk=0x7f0480d90338) at cache/cache_wrk.c:496
tp = 0x7f0480d8f858
tpx = {list = {vtqe_next = 0x7f0480d51360, vtqe_prev = 0x7f0482200078}, func = 0x55f99a99c1c0 <http1_req>, priv = 0x7f0476664a60}
tps = {list = {vtqe_next = 0x0, vtqe_prev = 0x0}, func = 0x55f99a953320 <pool_stat_summ>, priv = 0x7f0482205000}
tmo = 0
now = 1707760746.3839386
i = 5
reserve = 6
...
(gdb) fr 8
#8 0x000055f99a93dab2 in HSH_Lookup (req=0x7f0476664a60, ocp=0x7f0480d8f670, bocp=0x7f0480d8f668) at cache/cache_hash.c:435
435 boc_progress = oc->boc == NULL ? -1 : oc->boc->fetched_so_far;
It is embarrassing to find that I forgot to query the `boc` under the `oh` lock, especially since we discussed it when the construct was first introduced (see https://github.com/varnishcache/varnish-cache/pull/3566#issuecomment-874208586). At the very least we found out quickly, thanks!
Taken care of in https://github.com/varnishcache/varnish-cache/pull/4032/commits/141308d719858842b9b3d527d5c7a280ead5c482, could you please give it another try?
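The crash pattern is generic enough to show with a standalone sketch (names invented, not the actual fix): a field like `oc->boc` that another thread may set to NULL has to be sampled while holding the lock that serializes that transition; dereferencing it after dropping the lock is exactly the window the backtrace shows.

```c
/* Standalone sketch of the race, not the actual fix: names invented. */
#include <pthread.h>
#include <stdio.h>

struct toy_boc {
	long fetched_so_far;
};

struct toy_objcore {
	struct toy_boc *boc;	/* may be set to NULL by another thread */
};

struct toy_objhead {
	pthread_mutex_t mtx;
	struct toy_objcore oc;
};

/* Correct: sample boc->fetched_so_far while holding the objhead lock,
 * so the thread that retires the boc cannot pull it away mid-read. */
static long
toy_boc_progress(struct toy_objhead *oh)
{
	long progress;

	pthread_mutex_lock(&oh->mtx);
	progress = oh->oc.boc == NULL ? -1 : oh->oc.boc->fetched_so_far;
	pthread_mutex_unlock(&oh->mtx);

	/* Buggy variant: re-reading oh->oc.boc here, after the unlock,
	 * can dereference a pointer that was just cleared elsewhere. */
	return (progress);
}

int
main(void)
{
	struct toy_boc boc = { 42 };
	struct toy_objhead oh = { PTHREAD_MUTEX_INITIALIZER, { &boc } };

	printf("progress: %ld\n", toy_boc_progress(&oh));
	oh.oc.boc = NULL;		/* pretend the fetch completed */
	printf("progress: %ld\n", toy_boc_progress(&oh));
	return (0);
}
```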
Thanks to @simonvik, who is helping me get traffic through this pull request, we found that a test case for vmod_saintmode would panic. It quickly became obvious to me that this case, which had slipped through the cracks, shows that the model for this pull request is coherent.
It's the new `HSH_Withdraw()` building block that solves the problem, and in hindsight it's obvious that an objcore is effectively being withdrawn.
The one-liner fix is in cf932ca620f2a690eb0d2e5470560ef2103e7871 and coverage was added directly to the master branch in 3eeb8c4f5014e2abcbc87292b876f40b0b1f09e8.
FYI: For a multitude of reasons I still did not find the few hours I would like to have for quiet work to thoroughly review the new state of this PR. I could spend about half an hour with it, and I currently feel both under- and overwhelmed by it. Underwhelmed, because I still fail to see why the goals of this PR need so many changes (I see this as my own mistake), and I am overwhelmed by the fact that at this point the PR can effectively only be reviewed as one large diff due to the incremental changes.
I am still worried about the change of the busy state model and the general impact this might have; I wonder if we really need the withdraw facility, and I am not convinced that we actually DTRT in all the different places.
Just in case I will be away for more before-release bugwashing: I would really prefer to move this one to the next cycle to get the chance for a thorough review. See previous comments for reasons.
I fixed an incorrect assertion that surfaced from production traffic. It otherwise ran successfully on Friday and over the weekend.
From Friday's bugwash: I will squash all the commits that followed @mbgrydeland's review, close this pull request and open a new one based on this patch series to start a new discussion.
From the main commit message:
This is a mixed bag of cherry-picks from #3992 and new original commits. See individual commit messages for more details.