Closed original-brownbear closed 2 months ago
Pinging @elastic/es-search (Team:Search)
Doing the check less often seems ok (if easy to do without losing too much promptness) but it does look like there's some easy optimisations to the check itself. The flamegraph indicates that most of the time is just iterating the HashSet
? Does this need to be a HashSet
? We fill it with lambdas, there's not going to be any duplicates in there are there?
Can you share an expanded flamegraph for the check itself?
Here you go I don't think there's much you can optimize here. I think we could move to an ArrayList
for the data structure for sure (as you say it's only lambdas going in there so there's never any equality anyway), but if you look at the graph everything is inlining 100% already. The cost really isn't in the code but in the fetching of the data I'd guess (could check that later). And really, even if we optimised this massively by say 80%, it's still too much overhead IMO.
I bet this could be refactored in a way that makes this check a single volatile read (but at quite some cost in work and come complexity) by running the actual check on another thread for timeouts and pushing flag cancellation checks via a callback mechanism
... but really, it seems more reasonable to just find a way to not check in hot loops :) That should be less work and give a much more predictable performance return.
... that said, let me open a PR for the list real quick, better to save 2% now than not :P
Hmm ensureNotCancelled
shouldn't be that expensive either, see #104305.
@original-brownbear can we close this? Or is there still an issue here?
Pinging @elastic/es-search-foundations (Team:Search Foundations)
@benwtrent it's cheaper now but still quite visible in profiling last time I checked, I believe this shouldn't show up at all ideally. I'll try to remember updating the flame graph so people can understand the priority better.
Jup this can be closed, the cancellation check isn't even visible in flame graphs anymore :)
From profiling the scroll phase of the http_logs rally track:
These checks are far more expensive than the docCount method call itself. We also have other code where we call
ExitablePointValues
back-to-back to e.g. compare it'ssize()
andgetDocCount()
return, thus running redundant checks.We could optimise the checks itself, but like we did elsewhere it's probably better to trade cancellation liveness for performance here?