Drain scripts waiting for green cluster status are dangerous

logsearch / logsearch-boshrelease

A BOSH-scalable Elasticsearch+Logstash+Kibana release

http://www.logsearch.io

Apache License 2.0

Drain scripts waiting for green cluster status are dangerous #209

Closed voelzmo closed 8 years ago

voelzmo commented 8 years ago

I recently discovered that you added drain scripts for elasticsearch at some point in time. How they're implemented is kind of dangerous, and here is why:

your drain scripts wait for a green elasticsearch cluster status
the agent executes drain scripts forever, there is no timeout

So guess what happens when you want to repair a broken elasticsearch cluster with a bosh deploy? The drain hangs forever, nothing moves.
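The hang described above comes from an unbounded wait. As a rough illustration (not the release's actual drain script), a wait for green status can be given a deadline. The function names `wait_for_green` and `es_status`, the timeout values, and the use of `_cat/health` against `localhost:9200` are all assumptions made for this sketch.

```shell
# Sketch only: wait for a "green" status, but give up after a deadline.
# Usage: wait_for_green <timeout_seconds> <poll_seconds> <status_command...>
# The status command is expected to print green, yellow or red.
wait_for_green() {
  _wfg_timeout="$1"
  _wfg_poll="$2"
  shift 2
  _wfg_deadline=$(( $(date +%s) + _wfg_timeout ))
  while true; do
    if [ "$("$@" 2>/dev/null)" = "green" ]; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$_wfg_deadline" ]; then
      return 1   # timed out; the caller decides whether to proceed anyway
    fi
    sleep "$_wfg_poll"
  done
}

# Hypothetical status command for a local node (URL and port are assumed).
es_status() {
  curl -s "http://localhost:9200/_cat/health?h=status" | tr -d '[:space:]'
}
```

A caller could then write something like `wait_for_green 300 5 es_status || echo "cluster not green after 5 minutes" >&2`, so that a broken cluster delays the drain for a bounded time instead of blocking the deploy indefinitely.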

I know (now) that I can disable the drain scripts from the manifest. That didn't really do it for me because I only realized it when I had it already deployed. I know I can do bosh stop --skip-drain and then re-deploy, which I eventually did.

I'm not sure which cases you're covering with the scripts; maybe a scale-down without data loss? The current drain approach seems to have some unwanted side effects that you may want to reconsider or document thoroughly.

ghost commented 8 years ago

To add to @voelzmo's point: the drain script also seems to disable allocations, which are re-enabled only once a node is back up (in the ctl script). This means that if you downscale the cluster (so the node never returns) and are not aware of this, the cluster will remain locked and unusable.

I believe the idea behind this script was to prevent a massive relocation of data during an update (e.g. a stemcell update) of the cluster.
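For readers unfamiliar with the mechanism, disabling and re-enabling allocation is typically done through Elasticsearch's cluster settings API with the `cluster.routing.allocation.enable` setting. The sketch below is not taken from the release's drain or ctl scripts; the helper names `allocation_body` and `set_allocation`, the `ES_URL` variable, and the choice of a transient setting are assumptions.

```shell
# Build the request body for the cluster settings API.
# $1 is the allocation mode: all, primaries, new_primaries or none.
allocation_body() {
  case "$1" in
    all|primaries|new_primaries|none)
      printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}\n' "$1"
      ;;
    *)
      echo "unknown allocation mode: $1" >&2
      return 1
      ;;
  esac
}

# Apply the setting to a node. ES_URL is an assumed variable with an assumed default.
set_allocation() {
  _sa_body="$(allocation_body "$1")" || return 1
  curl -s -XPUT "${ES_URL:-http://localhost:9200}/_cluster/settings" \
    -H 'Content-Type: application/json' -d "$_sa_body"
}
```

With helpers like these, `set_allocation none` before stopping a node and `set_allocation all` afterwards would mirror the behaviour described above. In the downscale case nothing ever issues the second call, which is why the cluster stays locked.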

cromega commented 8 years ago

@voelzmo, @momchil-sap: we are aware of the problems.

We have completely reworked the data node upgrade process. The changes are waiting on a new BOSH feature, which is expected in one of the next few releases; we are hoping within a few weeks.

voelzmo commented 8 years ago

@cromega Thanks for explaining the rationale behind this. Out of curiosity: which feature would that be that you're waiting for?

cromega commented 8 years ago

@voelzmo: the ability to run post-deploy scripts

voelzmo commented 8 years ago

Ah nice, that is on the develop branch already, so the next release should probably have it :)

mrdavidlaing commented 8 years ago

@voelzmo - The next release - v201 - will

(a) Disable shard allocations at the beginning of a deploy
(b) Have an errand - bosh run errand enable-shard-allocation - to re-enable them after a deploy has finished

As soon as BOSH releases with post-deploy scripts, (b) will be turned into a post-deploy script.
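Until post-deploy scripts are available, the two-step flow from (a) and (b) could be wrapped so the errand is not forgotten. This is only a sketch: the wrapper name `deploy_and_reenable` and the `BOSH` override variable are hypothetical, while the two commands are the ones named in this thread.

```shell
# Run a deploy, then re-enable shard allocation via the errand.
# BOSH names the CLI to invoke; it is overridable so the flow can be dry-run.
deploy_and_reenable() {
  "${BOSH:-bosh}" deploy || return 1
  "${BOSH:-bosh}" run errand enable-shard-allocation
}
```

Running `BOSH=echo deploy_and_reenable` prints the two commands instead of executing them. Note that this sketch skips the errand when the deploy fails; whether allocation should be re-enabled in that case is a judgment call for the operator.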

Hope that helps; and thanks for your thoughts!

dpb587 commented 8 years ago

Does fully disabling allocations prevent indices from being created if log messages with new dates/times come through, potentially causing data loss until re-enabling?


mrdavidlaing commented 8 years ago

@dpb587 - Bummer; you are right.

During a deploy, attempts to create new indices will fail, e.g.:

curl -XPOST "http://localhost:9200/new_index/my_type" -d "{ \"field\" : \"value\"}"
{"error":{"root_cause":[{"type":"unavailable_shards_exception","reason":"[new_index][3] primary shard is not active Timeout: [1m], request: [index {[new_index][my_type][AVM9gHBuL3eDRI8kcjes], source[{ \"field\" : \"value\"}]}]"}],"type":"unavailable_shards_exception","reason":"[new_index][3] primary shard is not active Timeout: [1m], request: [index {[new_index][my_type][AVM9gHBuL3eDRI8kcjes], source[{ \"field\" : \"value\"}]}]"},"status":503}

Indexing to existing indices continues to work during the deploy, as do searches.

@cromega, @ablease - thoughts?

ablease commented 8 years ago

We can experiment with disabling replica allocations only. I have a feeling that will impact recovery time after a deploy on larger clusters though.

I think there will be a trade-off that comes down to how people manage their clusters. If you have big indices, then you want to disable shard allocation. If you have small indices and create many per day, then you might be able to live without disabling shard allocation at all...
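The trade-off above can be made concrete as a per-cluster choice of allocation mode. In the sketch below the profile names and the function `deploy_allocation_mode` are hypothetical. The mode names are valid values of Elasticsearch's `cluster.routing.allocation.enable` setting, with `primaries` assumed to correspond to "disabling replica allocations only".

```shell
# Map a rough cluster profile to an allocation mode for the duration of a deploy.
# Profile names are hypothetical: big-indices, replicas-only, small-indices.
deploy_allocation_mode() {
  case "$1" in
    big-indices)   echo none ;;       # disable all shard allocation
    replicas-only) echo primaries ;;  # primaries may be allocated, replicas may not
    small-indices) echo all ;;        # leave allocation untouched
    *)
      echo "unknown profile: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `deploy_allocation_mode replicas-only` prints `primaries`, which could then be passed to whatever step applies the cluster setting before the deploy starts.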