To add to @voelzmo, it also seems that the drain script disables allocations, which are only re-enabled once a node is back up (in the ctl script). This means that if you downscale the cluster (so the node will not return) and are not aware of this, the cluster will remain locked and unusable.
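For context, disabling and re-enabling allocation goes through the Elasticsearch cluster settings API; the drain and ctl scripts presumably issue something along these lines (localhost:9200 and the transient scope are my assumptions, the release may do it slightly differently):

# in drain, before the node goes down: stop all shard allocation
curl -XPUT "http://localhost:9200/_cluster/settings" -d '{"transient":{"cluster.routing.allocation.enable":"none"}}'

# in ctl, once the node is back up: allow allocation again
curl -XPUT "http://localhost:9200/_cluster/settings" -d '{"transient":{"cluster.routing.allocation.enable":"all"}}'

If the node never comes back (e.g. after a scale-down), the second call never happens, which is exactly how the cluster ends up locked.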
I believe the idea behind this script was to prevent a massive reallocation of data during an update (e.g. a stemcell update) of the cluster.
@voelzmo, @momchil-sap: we are aware of the problems.
We completely reworked the data node upgrade process. The changes are waiting for a new feature to be added to BOSH, which is expected to land in one of the next few releases, hopefully within a few weeks.
@cromega Thanks for explaining the rationale behind this. Out of curiosity: which feature would that be that you're waiting for?
@voelzmo: the ability to run post-deploy scripts
Ah nice, that is on the develop branch already, so the next release should probably have it :)
@voelzmo - The next release - v201 - will:
(a) Disable shard allocations at the beginning of a deploy
(b) Have an errand - bosh run errand enable-shard-allocation - to re-enable them after the deploy has finished
As soon as BOSH releases with post-deploy scripts, (b) will be turned into a post-deploy script.
Hope that helps; and thanks for your thoughts!
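For anyone following along, the post-v201 flow would look roughly like this (the errand name is taken from the comment above; the curl call is only my guess at what the errand does under the hood):

bosh deploy                                # shard allocation is disabled at the start of the deploy
bosh run errand enable-shard-allocation    # re-enable allocation once the deploy has finished

# presumably the errand boils down to something like:
curl -XPUT "http://localhost:9200/_cluster/settings" -d '{"transient":{"cluster.routing.allocation.enable":"all"}}'

Once BOSH ships post-deploy scripts, that last step would run automatically instead of requiring a manual errand.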
Does fully disabling allocations prevent indices from being created if log messages with new dates/times come through, potentially causing data loss until re-enabling?
@dpb587 - Bummer; you are right.
During a deploy, attempts to create new indices will fail, e.g.:
curl -XPOST "http://localhost:9200/new_index/my_type" -d "{ \"field\" : \"value\"}"
{"error":{"root_cause":[{"type":"unavailable_shards_exception","reason":"[new_index][3] primary shard is not active Timeout: [1m], request: [index {[new_index][my_type][AVM9gHBuL3eDRI8kcjes], source[{ \"field\" : \"value\"}]}]"}],"type":"unavailable_shards_exception","reason":"[new_index][3] primary shard is not active Timeout: [1m], request: [index {[new_index][my_type][AVM9gHBuL3eDRI8kcjes], source[{ \"field\" : \"value\"}]}]"},"status":503}%
Indexing to existing indices continues to work during the deploy, as do searches.
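For comparison, here is a sketch of what keeps working during a deploy (logstash-2016.03.03 is just a stand-in for any index that already exists):

# indexing into an existing index still succeeds
curl -XPOST "http://localhost:9200/logstash-2016.03.03/my_type" -d "{ \"field\" : \"value\"}"

# searches keep working as well
curl -XGET "http://localhost:9200/logstash-2016.03.03/_search?q=field:value"

Only requests that need a brand-new index (and therefore new primary shards) run into the unavailable_shards_exception above.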
@cromega, @ablease - thoughts?
We can experiment with disabling replica allocations only. I have a feeling that will impact recovery time after a deploy on larger clusters though.
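If we try that, it would presumably mean switching the drain script from "none" to one of the more permissive values of cluster.routing.allocation.enable (exact value to be validated):

# allow primary shards (including those of newly created indices) to be allocated, but no replicas
curl -XPUT "http://localhost:9200/_cluster/settings" -d '{"transient":{"cluster.routing.allocation.enable":"primaries"}}'

New indices could then still get their primary shards assigned during a deploy, which would address the data-loss concern; whether the impact on recovery time is acceptable on larger clusters is what would need testing.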
I think there will be a trade-off that comes down to how people manage their clusters. If you have big indices, then you want to disable shard allocation. If you have small indices and create many per day, then you might be able to live without disabling shard allocation at all...
I recently discovered that you added drain scripts for elasticsearch at some point in time. How they're implemented is kind of dangerous, and here is why: the drain does not finish until the elasticsearch cluster status is green.

[screenshot: elasticsearch cluster status]

So guess what happens when you want to repair a broken elasticsearch cluster with a bosh deploy? The drain hangs forever, nothing moves.

I know (now) that I can disable the drain scripts from the manifest. That didn't really do it for me, because I only realized it when I had it already deployed. I know I can do bosh stop --skip-drain and then re-deploy, which I eventually did.

I'm not sure which cases you're covering with the scripts, maybe a scale-down without data loss? Seems like the current idea of the drain has some unwanted side effects you want to think about or document thoroughly.
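For anyone who ends up in the same situation, the workaround boils down to this (elasticsearch_data is a placeholder for whatever your data node job is called):

bosh stop elasticsearch_data --skip-drain    # stop the stuck nodes without running the drain script
bosh deploy                                  # then re-deploy as usual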