johntdyer opened this issue 9 years ago
@johntdyer you can see why the shard won't allocate to the node in the output of the reroute command, specifically this line:
`NO(too many shards for this index on node [2], limit: [2])`
This comes from the `index.routing.allocation.total_shards_per_node` setting (it looks like it has been set to 2).
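You can confirm the current value with something like this (the index name is taken from this thread):

```
GET /logstash-cdr-2015.05.18/_settings
```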
In the future however, issues like this should be opened on the discussion forums instead of as issues here.
Lee,
Why is this only affecting this shard? All the other shards were reassigned after the rolling restart. My problem seems limited to just this single shard of this single index.
John
> Why is this only affecting this shard? All the other shards were reassigned after the rolling restart.
It looks like each of your nodes already has the maximum 2 shards for this index (the setting from above). This shard happened to be last and thus won't be allocated. You need to increase the total_shards_per_node setting, or add another node.
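As a sketch, raising the limit for just this index could look something like the following (index name from this thread; the value 3 is only an example, pick whatever fits your node count):

```
PUT /logstash-cdr-2015.05.18/_settings
{
  "index.routing.allocation.total_shards_per_node": 3
}
```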
@dakrone - I am sorry for my naivety on this, but I am still confused. I can see under `routing_table.indices.logstash-cdr-2015.05.18.shards` that shard 5 is not assigned:
```json
{
  "index": "logstash-cdr-2015.05.18",
  "shard": 5,
  "relocating_node": null,
  "node": "DnCwjImuRFOsranelYuOaw",
  "primary": true,
  "state": "STARTED"
},
{
  "index": "logstash-cdr-2015.05.18",
  "shard": 5,
  "relocating_node": null,
  "node": null,
  "primary": false,
  "state": "UNASSIGNED"
}
```
It is still not clear to me why this is only happening with this one index, and why it only started after the upgrade from 1.5.0 to 1.6.0...
Digging a little deeper here: you have six nodes, and each of them has two shards of this index on it, except for one:
Node | shard 1 | shard 2 |
---|---|---|
Ts0HJNFvSGy2JVd31VlotQ | `logstash-cdr-2015.05.18[1][r]` | `logstash-cdr-2015.05.18[2][r]` |
6AS8BMZKQkivehCUWANRdQ | `logstash-cdr-2015.05.18[3][p]` | `logstash-cdr-2015.05.18[1][p]` |
6fs0j8RWQ2esU7wgvAPcdg | `logstash-cdr-2015.05.18[4][r]` | `logstash-cdr-2015.05.18[2][p]` |
srLX4NZDTIaHq9qBVsxcZw | `logstash-cdr-2015.05.18[0][p]` | `logstash-cdr-2015.05.18[3][r]` |
DnCwjImuRFOsranelYuOaw | `logstash-cdr-2015.05.18[5][p]` | |
3ZOu2V5xSX-BxL2Osd5l7A | `logstash-cdr-2015.05.18[4][p]` | `logstash-cdr-2015.05.18[0][r]` |
The unassigned shard is `logstash-cdr-2015.05.18[5][r]`.
Usually, this would be assigned to `DnCwjImuRFOsranelYuOaw`; however, you can see in the output of the reroute why it is not:
```json
{
  "error": "ElasticsearchIllegalArgumentException[[allocate] allocation of [logstash-cdr-2015.05.18][5] on node [ls2-es5.int.tropo.com][DnCwjImuRFOsranelYuOaw][ls2-es5][inet[/10.1.0.55:9300]]{master=false} is not allowed, reason: [NO(shard cannot be allocated on same node [DnCwjImuRFOsranelYuOaw] it already exists on)][YES(node passes include/exclude/require filters)][YES(primary is already active)][YES(below shard recovery limit of [2])][YES(allocation disabling is ignored)][YES(allocation disabling is ignored)][YES(no allocation awareness enabled)][YES(shard count under limit [2] of total shards per node)][YES(target node version [1.6.0] is same or newer than source node version [1.6.0])][YES(enough disk for shard on node, free: [466.9gb])][YES(shard not primary or relocation disabled)]]",
  "status": 400
}
```
Specifically this line:
`NO(shard cannot be allocated on same node [DnCwjImuRFOsranelYuOaw] it already exists on)`
This is because the primary for that shard already exists on the "DnCw" node, so ES cannot assign the replica to that node. Additionally, Elasticsearch will not rebalance the shards on the other nodes until all UNASSIGNED shards are assigned, so they will not move.
_Elasticsearch is stuck waiting for space to allocate the unassigned shard because it cannot assign it to the only node with space._ So from the perspective of Elasticsearch, the shard cannot be allocated anywhere, which is why it is unassigned.
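For reference, the reroute output above comes from an `allocate` command of roughly this shape (the exact request is not shown in the thread; the index, shard, and node values are taken from the error message):

```
POST /_cluster/reroute
{
  "commands": [
    {
      "allocate": {
        "index": "logstash-cdr-2015.05.18",
        "shard": 5,
        "node": "DnCwjImuRFOsranelYuOaw"
      }
    }
  ]
}
```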
For a temporary workaround, there are two options:
1. Raise `index.routing.allocation.total_shards_per_node` to 3 for this index, then set it back to 2 after this shard has been allocated. Elasticsearch should be able to assign the shard to one of the other nodes if you increase the limit to 3, and it will then rebalance to equalize the number of shards per node. Once the shard has been allocated, you can lower the limit back to 2.
2. Swap the `logstash-cdr-2015.05.18[5][p]` shard on the `DnCw...` node with a shard from another node using the reroute API (see the sketch below). If you swap it with another shard, ES will be able to allocate the unassigned shard, because it is no longer the same exact shard being allocated on the same node.
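A sketch of such a swap with the reroute API's `move` command, using node IDs from the table above (this is just one possible pairing; each command is still checked by the allocation deciders, so you may need to issue the two moves one at a time and wait for the first relocation to finish):

```
POST /_cluster/reroute
{
  "commands": [
    {
      "move": {
        "index": "logstash-cdr-2015.05.18",
        "shard": 1,
        "from_node": "Ts0HJNFvSGy2JVd31VlotQ",
        "to_node": "DnCwjImuRFOsranelYuOaw"
      }
    },
    {
      "move": {
        "index": "logstash-cdr-2015.05.18",
        "shard": 5,
        "from_node": "DnCwjImuRFOsranelYuOaw",
        "to_node": "Ts0HJNFvSGy2JVd31VlotQ"
      }
    }
  ]
}
```

Once `[5][p]` is no longer on `DnCwjImuRFOsranelYuOaw`, the unassigned `[5][r]` replica can be allocated there.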
Why did this happen with the 1.6.0 upgrade? It is a by-product of the nodes being restarted, and bad luck with the allocation of this particular shard.
I think we can consider this ticket a bug report for this behavior, as we should try as hard as possible to prevent it!
The `total_shards_per_node` setting is documented (in master: https://www.elastic.co/guide/en/elasticsearch/reference/master/allocation-total-shards.html) to sometimes cause unassigned shards.
It's a hard limit in a process which, for the most part, relies on heuristics and, as such, is a bit of a hack. I'd prefer to remove the setting and instead solve the problem by trying harder to spread out shards from the same index. See #12279 for more on this.
That is a dangerous one to remove. It's one of the most important parts of keeping the Wikimedia cluster up and running smoothly. The hard limit it provides is useful because putting two enwiki shards next to each other will bring the node down.
@nik9000 I'm only proposing removing it if we support a better option that doesn't suffer from the same issues.
@clintongormley I was able to repro this using an index (configured to require `box_type="hot"`) with a single shard and a cluster with a single valid node (with `box_type="hot"`). I used `index.routing.allocation.require.total_shards_per_node=1`. The shard was basically stuck in the UNASSIGNED state indefinitely and the index was red (version 5.4.0). I also had a master node (with data disabled) and 2 other nodes with `box_type="warm"`.
TL;DR: Removing the `index.routing.allocation.require.total_shards_per_node=1` requirement fixed it, even though the configuration should have been valid because my index only had 1 shard.
EDIT: PEBCAK (https://en.wiktionary.org/wiki/PEBCAK). The actual property name is `index.routing.allocation.total_shards_per_node`.
@robert-blankenship what did the allocation explain API say when the shard was unassigned?
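On 5.x that call looks roughly like this (the index name below is a placeholder; shard `0` and `"primary": true` are assumptions based on the single-shard, red-index report above):

```
GET /_cluster/allocation/explain
{
  "index": "my_hot_index",
  "shard": 0,
  "primary": true
}
```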
It looks to me like you used a setting that does not exist as such. The setting is `index.routing.allocation.total_shards_per_node`, not `index.routing.allocation.require.total_shards_per_node`. What you specified was a `require` clause (see allocation filtering) with a custom attribute `total_shards_per_node` (coincidentally having the same name as the `total_shards_per_node` setting), specifying that only nodes that have the custom `total_shards_per_node` attribute set to `1` should have a shard of this index.
The problem you had looks to me unrelated to the original issue here.
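To make the difference concrete, a sketch of the two forms side by side (the index name is a placeholder):

```
# An allocation filter on a custom node attribute that happens to be named "total_shards_per_node":
PUT /my_hot_index/_settings
{
  "index.routing.allocation.require.total_shards_per_node": 1
}

# The actual per-node shard-count limit:
PUT /my_hot_index/_settings
{
  "index.routing.allocation.total_shards_per_node": 1
}
```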
You're right, thanks @ywelsch !
Curious if there is any movement on this as I just ran into the issue in 6.3.2
Pinging @elastic/es-distributed
> Curious if there is any movement on this as I just ran into the issue in 6.3.2
This is a fundamental problem with the current shard balancer and something that cannot be easily addressed in the current implementation. The current implementation uses an incremental approach to balancing that focuses on speed, but it can sometimes end up in local minima. Our general advice is to avoid over-constraining the allocation settings. That said, we're considering alternatives to the current balancer, but these are all still at the research stage.
I wonder if this is still an issue after 8 years of being reported and known.
I just tested with a recent ES version, 8.10.2, with completely default settings.
It is easily reproducible on a 3-node cluster: it almost always fails to allocate the last shard when using these index settings:
```
PUT test_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "routing": {
      "allocation": {
        "total_shards_per_node": 2
      }
    }
  }
}
```
It's clearly visible that a correct allocation is possible: just move shard 1 to es02 and then shard 2 to es01 (a sketch follows below).
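A manual version of that with the reroute API might look like the following (the index and target node names come from the comment above; the `from_node` values are placeholders for whichever nodes currently hold those shards):

```
POST /_cluster/reroute
{
  "commands": [
    {
      "move": {
        "index": "test_index",
        "shard": 1,
        "from_node": "NODE_CURRENTLY_HOLDING_SHARD_1",
        "to_node": "es02"
      }
    },
    {
      "move": {
        "index": "test_index",
        "shard": 2,
        "from_node": "NODE_CURRENTLY_HOLDING_SHARD_2",
        "to_node": "es01"
      }
    }
  ]
}
```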
Pinging @elastic/es-distributed (Team:Distributed)
Shard 5 will not get assigned after an upgrade from 1.5.0 to 1.6.0.
I tried to force a re-route with the following script, but it didn't work.
This is the only unassigned shard since the restart, and I am not sure how to get the cluster back to green. Any advice?
Thanks