elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

update by query and max_docs and conficts=proceed #63671

Closed nik9000 closed 3 years ago

nik9000 commented 4 years ago

https://github.com/elastic/kibana/issues/80371 seems to be hitting a problem with update_by_query and max_docs. I'm not 100% sure exactly what is going on, but they expect the combination of max_docs=10 and conflicts=proceed to keep running update by query until it manages to update 10 documents. That seems pretty reasonable, and it looks like something the code is trying to do. I might have written that code, but I honestly don't remember at this point. Anyway, it looks to me like we at least have an issue where we only attempt the first max_docs hits of each bulk response. That would be fine without conflicts=proceed, but with it we should probably check whether we can move on to the next few docs in the bulk response.
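For reference, a minimal request of the kind in question; the index name, field, and script here are made up for illustration:

```
POST /my-index/_update_by_query?conflicts=proceed&max_docs=10
{
  "query": { "term": { "status": "pending" } },
  "script": {
    "lang": "painless",
    "source": "ctx._source.status = params.newStatus",
    "params": { "newStatus": "claimed" }
  }
}
```

The expectation in the Kibana issue is that this keeps consuming hits, skipping version conflicts, until 10 documents have actually been updated.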

elasticmachine commented 4 years ago

Pinging @elastic/es-distributed (:Distributed/Reindex)

henningandersen commented 4 years ago

I agree, there does appear to be an issue with the semantics you describe when we truncate the hits due to max_docs.

I am a bit torn on what the right semantics should be though. I think the choice is between:

  1. max_docs relates primarily to the search/source, and thus we should stop after having received that many hits, regardless of version conflicts.
  2. max_docs relates primarily to the actual updates/real work, and thus we should stop only after having done that many updates.

My intuition was towards the 1st option before reading your description here. I am curious whether you have more input on this?
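To make the difference concrete: suppose max_docs=10, the query matches plenty of documents, and 4 of the first 10 hits have been modified concurrently. The (abbreviated, made-up) response counts would look roughly like:

```
# Option 1: stop after pulling max_docs hits from the search
"updated": 6,  "version_conflicts": 4

# Option 2: keep consuming hits until max_docs updates have succeeded
"updated": 10, "version_conflicts": 4
```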

I believe the kibana issue should resolve itself, since I assume they retry and run that update periodically and the version conflicts should then no longer occur since the tasks that conflicted on the previous run would no longer appear in the search result on a second run?

nik9000 commented 4 years ago

I think the second choice is what we meant to do when implementing it. I think I made a mistake when I implemented it and sort of bumbled into doing the first thing instead.

nik9000 commented 4 years ago

I believe the kibana issue should resolve itself, since I assume they retry and run that update periodically and the version conflicts should then no longer occur since the tasks that conflicted on the previous run would no longer appear in the search result on a second run?

I think it will indeed resolve itself without us. They have some other issue around sorting by score that means they are a bit stuck, but I think that is not related to this.

gmmorris commented 3 years ago

Thanks for looking into this y'all!

I believe the kibana issue should resolve itself, since I assume they retry and run that update periodically and the version conflicts should then no longer occur since the tasks that conflicted on the previous run would no longer appear in the search result on a second run?

It's an interesting issue, as it has different effects in different situations. For the most part, this does resolve itself, but where this becomes problematic for us is when we fall behind on the work needed. We use updateByQuery as part of our Task Management mechanism, which all Kibana Alerts rely on. Our goal is for this mechanism to scale horizontally, but we're currently hitting a glass ceiling when we have too many Kibana instances running in parallel, and their updateByQuery calls result in a growing number of version_conflicts.

Long term we're hoping to solve this by adding coordination between Kibana nodes, so that they can split the work in a more reliable and sustainable fashion. Sadly, though, we're still quite far from that (Kibana instances aren't aware of each other at the moment) and have to rely on ES to coordinate. When version_conflicts count against the updated total, we end up with wasted cycles where multiple Kibana instances skip an entire task-claiming cycle because they see 100% version_conflicts and 0% updated. We're addressing this by reshuffling Kibana instances so that they try to poll without clashing, but it's more like a clumsy Hokey Pokey than a well coordinated ballet ;)
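To give a rough idea of the shape of that claiming call, here is a simplified sketch; the index name, fields, and script are invented for illustration and are not our actual implementation:

```
POST /.my-task-index/_update_by_query?conflicts=proceed&max_docs=10
{
  "query": {
    "bool": {
      "must": [
        { "term": { "task.status": "idle" } },
        { "range": { "task.runAt": { "lte": "now" } } }
      ]
    }
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.task.status = params.status; ctx._source.task.ownerId = params.ownerId",
    "params": { "status": "claiming", "ownerId": "kibana-node-1" }
  }
}
```

When several Kibana instances issue a call like this at roughly the same time, the documents that conflict count against max_docs, so an instance can come away with far fewer than 10 claimed tasks even though more claimable tasks matched the query.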

  2. max_docs relate primarily to the actual updates/real work and thus we should stop only after having done that many updates.

This would definitely be our preferred approach, but we only represent one POV, and I can totally see the 1st also being valid. I'd hate to add even more to the API, but perhaps this could be configurable?

bmcconaghy commented 3 years ago

I think it would violate the principle of least astonishment if there are cases where a given update by query call could have updated max_docs documents but does not because of conflicts. So if this can happen as the code currently stands, I consider that to be a bug.

gmmorris commented 3 years ago

Hey team, I just wanted to share some context about the impact of this bug - hopefully this can help you evaluate the value of addressing this issue in the future. :)

We have now merged a mechanism into Task Manager which essentially tries to shift the time at which a Kibana node polls for tasks if it's experiencing a high rate of version_conflicts. We found this reduced the impact of this bug as the Task Manager instances end up spacing themselves out so that they clash less.

There are two things to keep in mind about this solution:

  1. It doesn't actually address the problem, as we still end up with wasted cycles due to conflicts until instances realign. This repeats every time we shift, which can happen multiple times when there are many instances.
  2. We now have a glass ceiling beyond which we can't scale horizontally, as this coordination has its limits. In effect this means that features like Alerting, Reporting, etc. can't currently scale beyond a certain point (though we have been able to run 64 Kibana instances in parallel, so it isn't that bad ;) ).

Longer term we hope to introduce some smarter coordination (such as long-term ownership of tasks to reduce conflicts, or even a leader node that coordinates the work), but these are still far off.

Addressing this bug should reduce the conflicts we experience, enabling us to push that glass ceiling higher, which is why addressing it would be very valuable to us.

I hope this helps :)

fcofdez commented 3 years ago

I've been taking a look into this and I have a couple of tests that reproduce the issue. As @henningandersen pointed out, one of the problems is that we only take into account the first max_docs documents of each scroll response. For example, with scroll_size=1000 and max_docs=10 we leave out 990 docs, meaning that a big portion of the matching documents are never considered. A solution for this might be to keep the matching documents around, but that could be problematic if those documents are big; maybe we could just keep the doc ids around and fetch them later?
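As a concrete illustration of that first scenario (index name made up, request body omitted):

```
POST /my-index/_update_by_query?conflicts=proceed&max_docs=10&scroll_size=1000
```

Each scroll batch can return up to 1000 hits, but today we only attempt the first 10 of them; if all 10 hit version conflicts, the remaining 990 hits in that batch are never tried even though conflicts=proceed was requested.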

A second scenario that leads to this problem is when all the matching documents are updated concurrently. This is trickier to solve: we could retry the search request, but AFAIK we don't have any guarantee that the process would make progress in scenarios with high contention.

I'm not sure if we want to tackle the second scenario, wdyt @henningandersen?

henningandersen commented 3 years ago

@fcofdez about the first scenario, I believe we already fetch the docs from source, so keeping them around for a while does not seem problematic (if the documents are large, users should lower the scroll size).

About the second scenario, I think we should not repeat the search. We should do one scroll search only, and if we get to the end we are done. The "guarantee" we give is that we try to process docs that were present when we received the operation; any docs that appear concurrently with the update by query are not guaranteed to be included. We can therefore safely return when we reach the end. The client cannot tell whether extra docs appeared before or after ES sent back the response anyway.

Repeating the search could lead to seeing the same docs multiple times, and if the update script is not idempotent (like incrementing a counter or adding to an array), the effects could accumulate indefinitely.
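For example, a non-idempotent script like the following (purely illustrative; index name made up) applies its effect every time a document is processed, so re-running the search over the same documents within one request could bump the counter more than once:

```
POST /my-index/_update_by_query?conflicts=proceed&max_docs=10
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.counter = (ctx._source.counter == null ? 0 : ctx._source.counter) + 1"
  }
}
```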