elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Force merge should be cancellable #17094

Open nik9000 opened 8 years ago

nik9000 commented 8 years ago

We have the infrastructure to make long running stuff cancellable. Force merge seems like something that you might want to cancel.

Cancel doesn't have to immediately cancel (can't/shouldn't kill threads in Java), just make a reasonably good effort to cancel the task.
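As a rough illustration of the best-effort cancellation described above, here is a minimal Python sketch. It is not Elasticsearch code, and `CancellableTask` and `run_steps` are hypothetical names. The point is only that the running work is never killed from outside: a cancel request sets a flag, and the task checks that flag between units of work and stops at the next safe point.

```python
import threading


class CancellableTask:
    """Hypothetical task wrapper: cancellation is a request, not a kill."""

    def __init__(self):
        self._cancelled = threading.Event()

    def cancel(self):
        # Only records the request; the running thread is not interrupted.
        self._cancelled.set()

    def is_cancelled(self):
        return self._cancelled.is_set()


def run_steps(task, steps):
    """Run callables in order, checking for cancellation between steps."""
    completed = 0
    for step in steps:
        if task.is_cancelled():
            break  # stop at the next safe point
        step()
        completed += 1
    return completed
```

A step that is already running always finishes, so how quickly a cancel takes effect depends on how small the steps are.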

s1monw commented 8 years ago

how would that work? you can't just go and stop the merge once it's kicked off - I think having an API to do that is misleading and will be disappointing to users

nik9000 commented 8 years ago

> how would that work? you can't just go and stop the merge once it's kicked off - I think having an API to do that is misleading and will be disappointing to users

That is why I made this a discussion! I haven't read much of that code but it feels like an "obvious" use for cancellable tasks. If it is hard to impossible at least we'll have this issue we can point people to when they ask for it.

pickypg commented 8 years ago

Perhaps it can short circuit? Since force merge is actually a series of merges in the background, maybe it can just stop in the middle if possible? Then perhaps we can augment the API to state where it is so that it can succeed / fail?

kuipertan commented 8 years ago

I need this API.

Once a force merge starts, it takes a long time to finish the merging work.

geekpete commented 7 years ago

So it could finish the current segment merges but not start any new segment merges? When segments are merged, the new ones are written out and when confirmed as completed, the old ones are then removed. This is why the disk usage grows then shrinks again. So a way to cancel the current/in-progress merges and reinstate the existing segments that were about to be thrown away seems like what you'd want it to do?

And managing it with the tasks API would be nice.
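The behaviour described in this comment can be sketched with a toy model in Python. This is an assumption-laden illustration, not how Lucene or Elasticsearch implement merges: segments are plain lists of documents, and `merge_segments` and `is_cancelled` are hypothetical names. The new segment is built alongside the old ones, and the old ones are only replaced once the merge completes, so a cancelled merge leaves the original segments in place.

```python
def merge_segments(segments, is_cancelled):
    """Merge all segments into one, or leave them untouched if cancelled.

    `segments` is a list of lists of documents; `is_cancelled` is a
    zero-argument callable polled between documents (both hypothetical).
    Returns (resulting_segments, merged_flag).
    """
    merged = []  # the new segment, written alongside the old ones
    for segment in segments:
        for doc in segment:
            if is_cancelled():
                # Discard the partial output; the originals stay live.
                return segments, False
            merged.append(doc)
    # Only a completed merge replaces the old segments.
    return [merged], True
```

In this model the extra disk usage is the partially built `merged` list, which is what a cancel would free.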

elasticmachine commented 6 years ago

Pinging @elastic/es-core-infra

tomcallahan commented 6 years ago

I'm afraid we don't have the underlying infrastructure to be able to accomplish this - see #15975

DaveCTurner commented 3 years ago

I think we should reconsider this. Long-running force-merge tasks are something that still comes up quite often as a supportability concern and although we don't have the infrastructure to promptly cancel a merge that's running on a shard we can at least avoid starting work on subsequent shards targeted by the same request. I believe that plus some slightly better progress information (https://github.com/elastic/elasticsearch/issues/15975#issuecomment-838004271) plus feedback when a task is cancelled (https://github.com/elastic/elasticsearch/issues/72907) (plus a few lines in the docs) would solve some issues and for the remainder it would at least improve our understanding of the problem.
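A minimal Python sketch of the per-shard approach suggested here, under stated assumptions: `force_merge_shards`, `merge_shard`, and `is_cancelled` are hypothetical names, not Elasticsearch APIs. A merge already running on a shard is allowed to finish, cancellation only prevents work from starting on the remaining shards, and the returned status gives simple progress and cancellation feedback.

```python
def force_merge_shards(shard_ids, merge_shard, is_cancelled):
    """Merge shards one at a time, skipping the rest once cancelled.

    All names are hypothetical. A merge already running on a shard is
    allowed to finish; cancellation only prevents starting the next one.
    """
    status = {"completed": [], "skipped": [], "cancelled": False}
    for index, shard_id in enumerate(shard_ids):
        if is_cancelled():
            status["cancelled"] = True
            status["skipped"] = list(shard_ids[index:])
            break
        merge_shard(shard_id)
        status["completed"].append(shard_id)
    return status
```

The worst-case delay before a cancel takes effect is the time to merge one shard, which is the trade-off the comment accepts.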

yukha-dw commented 1 month ago

I think it'd be very useful when we are running out of storage. Stopping the merge process early would prevent a disaster.