Open · iverase opened 9 months ago
Pinging @elastic/es-distributed (Team:Distributed)
Oh, interesting, according to this thread dump:
"TEST-DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards-seed#[ED5C019BB52E4B53]" ID=2402 WAITING on org.elasticsearch.action.support.PlainActionFuture$Sync@dcadc57
at java.base@21.0.2/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on org.elasticsearch.action.support.PlainActionFuture$Sync@dcadc57
at java.base@21.0.2/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
at java.base@21.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
at java.base@21.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1099)
at app//org.elasticsearch.action.support.PlainActionFuture$Sync.get(PlainActionFuture.java:273)
at app//org.elasticsearch.action.support.PlainActionFuture.get(PlainActionFuture.java:93)
at app//org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1534)
at app//org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1514)
at app//org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:45)
at app//org.elasticsearch.action.support.PlainActionFuture.actionGet(PlainActionFuture.java:152)
at app//org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:43)
at app//org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards(DiskThresholdDeciderIT.java:204)
at java.base@21.0.2/java.lang.invoke.LambdaForm$DMH/0x00007f2a3c300000.invokeVirtual(LambdaForm$DMH)
at java.base@21.0.2/java.lang.invoke.LambdaForm$MH/0x00007f2a3c0a0800.invoke(LambdaForm$MH)
it was stuck waiting at this line (DiskThresholdDeciderIT.java:204 in the trace above).
Output also contains:
WARNING: Uncaught exception in thread: Thread[#2574,elasticsearch[node_t0][clusterApplierService#updateTask][T#1],5,TGRP-DiskThresholdDeciderIT]
java.lang.AssertionError:
Expected: <0>
but: was <1>
at __randomizedtesting.SeedInfo.seed([ED5C019BB52E4B53]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at org.elasticsearch.test.ESTestCase.assertThat(ESTestCase.java:2119)
at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.lambda$testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards$2(DiskThresholdDeciderIT.java:176)
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListener(ClusterApplierService.java:560)
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:546)
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:505)
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:429)
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:154)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:217)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:183)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
so the actual failure happened in the cluster state listener assertion at DiskThresholdDeciderIT.java:176.
According to the recent failure info, the test created the following shards:
ShardSize[shardId=[btydusdduwrc][3], size=16919]
ShardSize[shardId=[btydusdduwrc][0], size=17886]
ShardSize[shardId=[btydusdduwrc][5], size=17972]
ShardSize[shardId=[btydusdduwrc][1], size=18195]
ShardSize[shardId=[btydusdduwrc][4], size=18198]
ShardSize[shardId=[btydusdduwrc][2], size=18472]
During the first allocation, only 5 shards had a computed balance and were allocated accordingly:
[2024-03-03T17:22:30,358][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Reconciling desired balance: DesiredBalance[lastConvergedIndex=13, assignments={
[btydusdduwrc][0]=ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][1]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][4]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][5]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][2]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][3]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1]
}]
Notice that the smallest shard is ignored, possibly because its size had not yet been computed from the repository.
Later, the shard balance was computed as follows:
[2024-03-03T17:22:30,400][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=14, assignments={[btydusdduwrc][0]=ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][1]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][4]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][5]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][2]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][3]=ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0]}]. Diff:
[btydusdduwrc][3]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0]
This suggests that the computation failed to take into account another non-empty shard (still?) initializing on the same node.
I wonder if in this case the shard had started, but the corresponding information was not yet available in ClusterInfo, so the second allocation round was computed as if CTSLnDyTQj2xDmatjmYTtQ were empty.
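If that is what happened, the failure mode could look like the following sketch (hypothetical names, not the real allocator code): a per-shard size lookup that falls back to 0 for shards that ClusterInfo has not reported yet makes a node holding a freshly started shard look empty to the next allocation round.

import java.util.List;
import java.util.Map;

// Hypothetical sketch of the suspected gap, not the real Elasticsearch code:
// a shard that is missing from ClusterInfo contributes 0 bytes, so a node that
// just received a shard still looks empty to the next allocation round.
public class StaleClusterInfoSketch {

    // Sum the sizes of the shards on a node, defaulting to 0 for unknown shards.
    static long projectedUsage(List<String> shardsOnNode, Map<String, Long> shardSizesFromClusterInfo) {
        return shardsOnNode.stream()
            .mapToLong(shard -> shardSizesFromClusterInfo.getOrDefault(shard, 0L))
            .sum();
    }

    public static void main(String[] args) {
        // [btydusdduwrc][0] has started on CTSLnDyTQj2xDmatjmYTtQ, but its size is not published yet.
        List<String> shardsOnNode = List.of("[btydusdduwrc][0]");
        Map<String, Long> shardSizesFromClusterInfo = Map.of(); // stale ClusterInfo
        System.out.println("projected usage: " + projectedUsage(shardsOnNode, shardSizesFromClusterInfo) + " bytes");
        // prints 0 -> the node appears to have room for another shard
    }
}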
I checked the latest failure logs and they contain:
[2024-03-11T18:25:16,539][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Recomputing desired balance for [15]: DesiredBalance[lastConvergedIndex=14, assignments={[wggatuinx][2]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][3]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][4]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][5]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][0]=ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0], [wggatuinx][1]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1]}], routing_nodes:
-----node_id[CwlFGBJYSXywcA9Qqt25xg][V]
--------[wggatuinx][0], node[CwlFGBJYSXywcA9Qqt25xg], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=lHmaALapQ1qElJRtpQAgug], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11386]
-----node_id[B-MnLxupT2qkXpWstJb6mA][V]
---- unassigned
--------[wggatuinx][1], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][2], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][3], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][4], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][5], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
,
{
"nodes" : {
"CwlFGBJYSXywcA9Qqt25xg" : {
"node_name" : "node_t2",
"least_available" : {
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2",
"total_bytes" : 21632,
"used_bytes" : 0,
"free_bytes" : 21632,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
},
"most_available" : {
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2",
"total_bytes" : 21632,
"used_bytes" : 0,
"free_bytes" : 21632,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
}
},
"B-MnLxupT2qkXpWstJb6mA" : {
"node_name" : "node_t1",
"least_available" : {
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-1",
"total_bytes" : 63302156288,
"used_bytes" : 9738104832,
"free_bytes" : 53564051456,
"free_disk_percent" : 84.6,
"used_disk_percent" : 15.4
},
"most_available" : {
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-1",
"total_bytes" : 63302156288,
"used_bytes" : 9738104832,
"free_bytes" : 53564051456,
"free_disk_percent" : 84.6,
"used_disk_percent" : 15.4
}
}
},
"shard_sizes" : {
"[wggatuinx][0][p]_bytes" : 11386
},
"shard_data_set_sizes" : {
"[wggatuinx][0]_bytes" : 11386
},
"shard_paths" : {
"NodeAndShard[nodeId=CwlFGBJYSXywcA9Qqt25xg, shardId=[wggatuinx][0]]" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2"
},
"reserved_sizes" : [
{
"node_id" : "CwlFGBJYSXywcA9Qqt25xg",
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2",
"total" : 0,
"shards" : [
"[wggatuinx][0]"
]
}
]
},
^^^ cluster info is complete and correct
SnapshotShardSizeInfo{snapshotShardSizes=[[snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][2]]=>11469, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][3]]=>11869, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][4]]=>11392, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][5]]=>12075, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][1]]=>12389]}
^^^ all snapshot sizes are known and available; they were already available in the prior allocation round
[2024-03-11T18:25:16,540][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][0] marked shard as started (routing: [wggatuinx][0], node[CwlFGBJYSXywcA9Qqt25xg], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=lHmaALapQ1qElJRtpQAgug], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11386])
^^^ allocated correctly in the prior round
[2024-03-11T18:25:16,540][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] running delegate allocator
[2024-03-11T18:25:16,542][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][4] marked shard as started (routing: [wggatuinx][4], node[CwlFGBJYSXywcA9Qqt25xg], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=-gFVMSNgRoeo1Fv53EIBhQ], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11392])
^^^ allocated incorrectly by the balanced allocator, despite all of this information being available
[2024-03-11T18:25:16,542][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][1] marked shard as started (routing: [wggatuinx][1], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=Ehs2RXxxTaaYu-puiIjWOg], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12389])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][2] marked shard as started (routing: [wggatuinx][2], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=v3Yh-Ie1SEWjezhmdIiwfg], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11469])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][3] marked shard as started (routing: [wggatuinx][3], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=wz8am3sZSGm7LteXF6GN2Q], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11869])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][5] marked shard as started (routing: [wggatuinx][5], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=BJ7H_lu6QKyV8WvVZsaHRA], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12075])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Desired balance computation for [15] is still not converged after [0s] and [1] iterations
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] running delegate allocator
[2024-03-11T18:25:16,549][DEBUG][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Desired balance computation for [15] converged after [0s] and [2] iterations
[2024-03-11T18:25:16,550][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=15, assignments={[wggatuinx][2]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0], [wggatuinx][3]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0], [wggatuinx][4]=ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0], [wggatuinx][5]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0], [wggatuinx][0]=ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0], [wggatuinx][1]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]}]. Diff:
[wggatuinx][2]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
[wggatuinx][3]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
[wggatuinx][4]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0]
[wggatuinx][5]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
[wggatuinx][1]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
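A quick check of the numbers above shows why assigning [wggatuinx][4] to that node cannot be right (using the expected_shard_size values from the log and the 21632-byte total reported for CwlFGBJYSXywcA9Qqt25xg):
total_bytes(CwlFGBJYSXywcA9Qqt25xg) = 21632
expected_shard_size([wggatuinx][0]) = 11386 (already initializing on that node)
expected_shard_size([wggatuinx][4]) = 11392 (assigned there by this computation)
11386 + 11392 = 22778 > 21632
The two shards together do not even fit on the disk, let alone leave the high watermark free.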
According to the logs from the latest failure:
Shard sizes
shard=[zoxdkcqpuouh][0]=>12166
shard=[zoxdkcqpuouh][3]=>12451
shard=[zoxdkcqpuouh][2]=>12535
shard=[zoxdkcqpuouh][4]=>12594
shard=[zoxdkcqpuouh][5]=>13006
Nodes
"r4ludfGvTh2wZ7UGP-SjWA" / "node_t1" <-- big node
"zIF6uDFXRYKPVDXFSMTRQQ" / "node_t2" <-- small node
Allocation of all shards happens in 2 rounds. Round 1:
routing_nodes:
-----node_id[zIF6uDFXRYKPVDXFSMTRQQ][V]
-----node_id[r4ludfGvTh2wZ7UGP-SjWA][V]
---- unassigned
--------[zoxdkcqpuouh][0], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][1], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][2], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][3], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][4], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][5], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
"shard_sizes" : { },
"reserved_sizes" : [ ]
SnapshotShardSizeInfo{snapshotShardSizes=[
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][0]]=>12166,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][1]]=>13187,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][2]]=>12535,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][5]]=>13006,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][4]]=>12594,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][3]]=>12451
]}
[2024-04-06T16:17:35,503][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=16, assignments={
[zoxdkcqpuouh][1]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1],
[zoxdkcqpuouh][0]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1],
[zoxdkcqpuouh][5]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][4]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1],
[zoxdkcqpuouh][3]=ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][2]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
}]. Diff:
[zoxdkcqpuouh][5]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][3]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][2]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
Nothing out of the ordinary here.
Round 2:
routing_nodes:
-----node_id[zIF6uDFXRYKPVDXFSMTRQQ][V]
--------[zoxdkcqpuouh][3], node[zIF6uDFXRYKPVDXFSMTRQQ], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[INITIALIZING], a[id=092Hbr2cSEakrjwApnkZow], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12451]
-----node_id[r4ludfGvTh2wZ7UGP-SjWA][V]
--------[zoxdkcqpuouh][2], node[r4ludfGvTh2wZ7UGP-SjWA], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[INITIALIZING], a[id=TpE9Bw8VTAy1CJz-pDXquQ], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12535]
--------[zoxdkcqpuouh][5], node[r4ludfGvTh2wZ7UGP-SjWA], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[INITIALIZING], a[id=2n_rRR9XQteq_585F1FliQ], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[13006]
---- unassigned
--------[zoxdkcqpuouh][0], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[zoxdkcqpuouh][1], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[zoxdkcqpuouh][4], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
"shard_sizes" : {
"[zoxdkcqpuouh][2][p]_bytes" : 0, <=== has reserved 12535 bytes
"[zoxdkcqpuouh][3][p]_bytes" : 12451, <=== has reserved 0 bytes
"[zoxdkcqpuouh][5][p]_bytes" : 0 <=== has no reserved entry
},
"reserved_sizes" : [
{
"node_id" : "zIF6uDFXRYKPVDXFSMTRQQ",
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-2",
"total" : 0,
"shards" : [
"[zoxdkcqpuouh][3]"
]
},
{
"node_id" : "r4ludfGvTh2wZ7UGP-SjWA",
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-1",
"total" : 12535,
"shards" : [
"[zoxdkcqpuouh][2]"
]
}
]
"zIF6uDFXRYKPVDXFSMTRQQ" : {
"node_name" : "node_t2",
"least_available" : {
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-2",
"total_bytes" : 22691,
"used_bytes" : 0,
"free_bytes" : 22691,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
}
},
"r4ludfGvTh2wZ7UGP-SjWA" : {
"node_name" : "node_t1",
"least_available" : {
"path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-1",
"total_bytes" : 63302148096,
"used_bytes" : 9228197888,
"free_bytes" : 54073950208,
"free_disk_percent" : 85.4,
"used_disk_percent" : 14.6
},
SnapshotShardSizeInfo{snapshotShardSizes=[
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][1]]=>13187,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][4]]=>12594,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][0]]=>12166
]}
Desired balance updated: DesiredBalance[lastConvergedIndex=17, assignments={
[zoxdkcqpuouh][5]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][4]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][3]=ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][2]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][1]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][0]=ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0]
}]. Diff:
[zoxdkcqpuouh][4]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][1]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][0]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0]
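A quick check of the numbers (assuming the same 10240-byte high watermark that shows up in the later logs) shows why assigning [zoxdkcqpuouh][0] to the small node is wrong:
total_bytes(zIF6uDFXRYKPVDXFSMTRQQ) = 22691
size([zoxdkcqpuouh][3]) = 12451 (already initializing there; after round 1 this left exactly the watermark free)
size([zoxdkcqpuouh][0]) = 12166 (assigned there in round 2)
22691 - 12451 = 10240 (exactly the high watermark)
10240 - 12166 = -1926 (well below the watermark)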
The round 2 balance is calculated incorrectly as it is based on incorrect cluster info:
"used_bytes" : 0
even though there is already one shard on that node (this may be related to TestFileStore and how we mock disk usage). I have not found any particular reason why the balance was calculated in 2 rounds here, but I consider this to be a valid scenario. I also do not think we have any guarantee of a point-in-time ClusterInfo snapshot that would explain the value the test observes.
The build scan failure above shows the following in the logs:
Maj 17, 2024 5:37:20 PD com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNING: Uncaught exception in thread: Thread[#2582,elasticsearch[node_t0][clusterApplierService#updateTask][T#1],5,TGRP-DiskThresholdDeciderIT]
java.lang.AssertionError: Expects no relocating shards but got: routing_nodes:
-----node_id[2_l39Q7SSiWyhgjDJv2lBg][V]
--------[fvyruaxlr][0], node[2_l39Q7SSiWyhgjDJv2lBg], [P], s[STARTED], a[id=SGQo56cyQOuyPEmhwMgeuA], failed_attempts[0]
--------[fvyruaxlr][1], node[2_l39Q7SSiWyhgjDJv2lBg], relocating [s_UbSUETQ-GejfrGhwmCfw], [P], recovery_source[peer recovery], s[INITIALIZING], a[id=5-Mwd2VLSoWPJ5EcXzlrHA, rId=x11KnGalToKb1w59JaOxcQ], failed_attempts[0], expected_shard_size[21221]
--------[fvyruaxlr][3], node[2_l39Q7SSiWyhgjDJv2lBg], [P], s[STARTED], a[id=N1UYVQ_aThyoOxBMlQdyGQ], failed_attempts[0]
--------[fvyruaxlr][4], node[2_l39Q7SSiWyhgjDJv2lBg], [P], s[STARTED], a[id=09TJ-r24QQK5MTD-GaJMcQ], failed_attempts[0]
--------[fvyruaxlr][5], node[2_l39Q7SSiWyhgjDJv2lBg], [P], s[STARTED], a[id=7sbAsKiUQBWSwlk2fdFDlQ], failed_attempts[0]
-----node_id[s_UbSUETQ-GejfrGhwmCfw][V]
--------[fvyruaxlr][1], node[s_UbSUETQ-GejfrGhwmCfw], relocating [2_l39Q7SSiWyhgjDJv2lBg], [P], s[RELOCATING], a[id=x11KnGalToKb1w59JaOxcQ, rId=5-Mwd2VLSoWPJ5EcXzlrHA], failed_attempts[0], expected_shard_size[21221]
--------[fvyruaxlr][2], node[s_UbSUETQ-GejfrGhwmCfw], [P], s[STARTED], a[id=4Htw-f0yStK2qMjoNC7LNA], failed_attempts[0]
---- unassigned
Expected: <0>
but: was <1>
at __randomizedtesting.SeedInfo.seed([360E9317DD5AD10]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.elasticsearch.test.ESTestCase.assertThat(ESTestCase.java:2123)
at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.lambda$testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards$2(DiskThresholdDeciderIT.java:186)
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListener(ClusterApplierService.java:560)
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:546)
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:505)
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:429)
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:154)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:217)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:183)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
My debug logging...
REPRODUCE WITH: ./gradlew ":server:internalClusterTest" --tests "org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards {seed=[360E9317DD5AD10:9C6621EC0B9A256A]}" -Dtests.seed=360E9317DD5AD10 -Dtests.filter=@awaitsfix -Dtests.locale=sq -Dtests.timezone=America/Phoenix -Druntime.java=21
DiskThresholdDeciderIT > testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards {seed=[360E9317DD5AD10:9C6621EC0B9A256A]} FAILED
java.lang.AssertionError: Mismatching shard routings: []
Expected: iterable containing [one of {<[sqikodzgne][3]>, <[sqikodzgne][2]>}]
but: no item was one of {<[sqikodzgne][3]>, <[sqikodzgne][2]>}
at __randomizedtesting.SeedInfo.seed([360E9317DD5AD10:9C6621EC0B9A256A]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.elasticsearch.test.ESTestCase.assertThat(ESTestCase.java:2470)
at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.lambda$assertBusyWithDiskUsageRefresh$7(DiskThresholdDeciderIT.java:360)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1408)
at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.assertBusyWithDiskUsageRefresh(DiskThresholdDeciderIT.java:350)
at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards(DiskThresholdDeciderIT.java:243)
1> [2024-08-29T08:44:58,238][INFO ][o.e.c.r.a.d.DiskThresholdDeciderIT] [testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards] ~~~test done creating data for shards, shardSizes: [
ShardSize[shardId=[uaxlrfmucfxo][0], size=22976],
ShardSize[shardId=[uaxlrfmucfxo][2], size=23049],
ShardSize[shardId=[uaxlrfmucfxo][4], size=23122],
ShardSize[shardId=[uaxlrfmucfxo][3], size=23271],
ShardSize[shardId=[uaxlrfmucfxo][1], size=23948],
ShardSize[shardId=[uaxlrfmucfxo][5], size=24030]]
1> [2024-08-29T08:44:58,462][INFO ][o.e.c.r.a.d.DiskThresholdDeciderIT] [testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards] ~~~test decreased disk space and refreshed it, WATERMARK_BYTES: 10240, usableSpace: 23049
1> [2024-08-29T08:44:58,658][INFO ][o.e.c.r.a.d.DiskThresholdDeciderIT] [testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards] ~~~test matcher shard set: [[uaxlrfmucfxo][0], [uaxlrfmucfxo][2]]
1> [2024-08-29T08:44:58,666][INFO ][o.e.c.r.a.d.DiskThresholdDeciderIT] [testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards] ~~~assertBusyWithDiskUsageRefresh refreshDiskUsage complete, checking, shardRouting: [[uaxlrfmucfxo][2]]
The matcher has what we want, and the shard is on the node that we want.
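Spelled out from the log lines above (a sanity check on that claim):
usableSpace = 23049
size([uaxlrfmucfxo][0]) = 22976 <= 23049 -> in the matcher set
size([uaxlrfmucfxo][2]) = 23049 <= 23049 -> in the matcher set
observed shardRouting = [uaxlrfmucfxo][2] -> a member of that set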
Maybe this 🤔
Okay, so I fixed the Matcher problem. Replace this line with the following:
assertBusyWithDiskUsageRefresh(dataNode0Id, indexName, hasItem(in(shardSizes.getShardIdsWithSizeSmallerOrEqual(usableSpace))));
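For reference, a minimal, self-contained illustration of what the new matcher asserts, using the shard ids from the failure above: the observed shard routings must contain at least one element that is a member of the set of shards small enough to fit into usableSpace.

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.hasItem;
import static org.hamcrest.Matchers.in;

import java.util.List;

// Standalone illustration of hasItem(in(...)); the lists are stand-ins for
// shardSizes.getShardIdsWithSizeSmallerOrEqual(usableSpace) and the observed routings.
public class MatcherIllustration {
    public static void main(String[] args) {
        List<String> shardsThatFit = List.of("[sqikodzgne][3]", "[sqikodzgne][2]");
        List<String> observedRoutings = List.of("[sqikodzgne][2]");
        assertThat(observedRoutings, hasItem(in(shardsThatFit))); // passes: [2] is in the allowed set
    }
}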
But now I've uncovered something else, either a test bug or a code bug. These are my debug logs, limited to the line of code where the snapshot restore occurs and allocates shards out to the nodes. Note that rebalancing is otherwise disabled, so we won't rebalance after the initial allocation, IIUC.
A potentially interesting bit, before allocation occurs: somehow allocation thinks the available space on the node is zero?
1> [2024-08-29T12:07:08,531][WARN ][o.e.c.r.a.DiskThresholdMonitor] [node_t0] high disk watermark [10240b] exceeded on [q24VcceoTY-cq6vgafXpOQ][node_t2][/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2] free: 0b[0%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete
Again later,
1> [2024-08-29T12:07:08,538][TRACE][o.e.c.r.a.d.DiskThresholdDecider] [node_t0] getDiskUsage(subtractLeavingShards=false) returning DiskUsageWithRelocations[diskUsage=[q24VcceoTY-cq6vgafXpOQ][node_t2][/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2] free: 0b[0%], relocatingShardSize=0]
1> [2024-08-29T12:07:08,538][TRACE][o.e.c.r.a.d.DiskThresholdDecider] [node_t0] node [q24VcceoTY-cq6vgafXpOQ] has 100.0% used disk
1> [2024-08-29T12:07:08,538][DEBUG][o.e.c.r.a.d.DiskThresholdDecider] [node_t0] less than the required 10240 free bytes threshold (0b free) on node q24VcceoTY-cq6vgafXpOQ, preventing allocation
The node stays at 100% disk usage until the end (last occurrence of the log message):
1> [2024-08-29T12:07:08,559][TRACE][o.e.c.r.a.d.DiskThresholdDecider] [node_t0] node [q24VcceoTY-cq6vgafXpOQ] has 100.0% used disk
These are the logs written while these lines of code (the ones that change the disk size) executed. It looks like:
1> [2024-08-29T12:07:08,514][DEBUG][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Executing allocate for [11]
1> [2024-08-29T12:07:08,514][INFO ][o.e.c.r.a.d.DiskThresholdDeciderIT] [testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards] Refreshed cluster info: {
1> "nodes" : {
1> "Hh_cFSDBQxSRw0flPCpXYw" : {
1> "node_name" : "node_t1",
1> "least_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-1",
1> "total_bytes" : 364024733696,
1> "used_bytes" : 78748274688,
1> "free_bytes" : 285276459008,
1> "free_disk_percent" : 78.4,
1> "used_disk_percent" : 21.6
1> },
1> "most_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-1",
1> "total_bytes" : 364024733696,
1> "used_bytes" : 78748274688,
1> "free_bytes" : 285276459008,
1> "free_disk_percent" : 78.4,
1> "used_disk_percent" : 21.6
1> }
1> },
1> "q24VcceoTY-cq6vgafXpOQ" : {
1> "node_name" : "node_t2",
1> "least_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2",
1> "total_bytes" : 29975,
1> "used_bytes" : 29975,
1> "free_bytes" : 0,
1> "free_disk_percent" : 0.0,
1> "used_disk_percent" : 100.0
1> },
1> "most_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2",
1> "total_bytes" : 29975,
1> "used_bytes" : 29975,
1> "free_bytes" : 0,
1> "free_disk_percent" : 0.0,
1> "used_disk_percent" : 100.0
1> }
1> }
1> },
1> "shard_sizes" : { },
1> "shard_data_set_sizes" : { },
1> "shard_paths" : { },
1> "reserved_sizes" : [ ]
1> }
1> [2024-08-29T12:07:08,514][DEBUG][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Starting desired balance computation for [11]
1> [2024-08-29T12:07:08,514][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Reconciling desired balance: DesiredBalance[lastConvergedIndex=10, assignments={}]
1> [2024-08-29T12:07:08,514][DEBUG][o.e.c.r.a.a.DesiredBalanceReconciler] [node_t0] Reconciling desired balance for [10]
1> [2024-08-29T12:07:08,514][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Reconciled desired balance: DesiredBalance[lastConvergedIndex=10, assignments={}]
1> [2024-08-29T12:07:08,514][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Recomputing desired balance for [11]: DesiredBalance[lastConvergedIndex=10, assignments={}], routing_nodes:
1> -----node_id[q24VcceoTY-cq6vgafXpOQ][V]
1> -----node_id[Hh_cFSDBQxSRw0flPCpXYw][V]
1> ---- unassigned
1> , {
1> "nodes" : {
1> "Hh_cFSDBQxSRw0flPCpXYw" : {
1> "node_name" : "node_t1",
1> "least_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-1",
1> "total_bytes" : 364024733696,
1> "used_bytes" : 78748274688,
1> "free_bytes" : 285276459008,
1> "free_disk_percent" : 78.4,
1> "used_disk_percent" : 21.6
1> },
1> "most_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-1",
1> "total_bytes" : 364024733696,
1> "used_bytes" : 78748274688,
1> "free_bytes" : 285276459008,
1> "free_disk_percent" : 78.4,
1> "used_disk_percent" : 21.6
1> }
1> },
1> "q24VcceoTY-cq6vgafXpOQ" : {
1> "node_name" : "node_t2",
1> "least_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2",
1> "total_bytes" : 29975,
1> "used_bytes" : 29975,
1> "free_bytes" : 0,
1> "free_disk_percent" : 0.0,
1> "used_disk_percent" : 100.0
1> },
1> "most_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2",
1> "total_bytes" : 29975,
1> "used_bytes" : 29975,
1> "free_bytes" : 0,
1> "free_disk_percent" : 0.0,
1> "used_disk_percent" : 100.0
1> }
1> }
1> },
1> "shard_sizes" : { },
1> "shard_data_set_sizes" : { },
1> "shard_paths" : { },
1> "reserved_sizes" : [ ]
1> }, SnapshotShardSizeInfo{snapshotShardSizes=[]}
1> [2024-08-29T12:07:08,514][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] running delegate allocator
1> [2024-08-29T12:07:08,514][TRACE][o.e.c.r.a.a.BalancedShardsAllocator] [node_t0] Start allocating unassigned shards
1> [2024-08-29T12:07:08,515][TRACE][o.e.c.r.a.a.BalancedShardsAllocator] [node_t0] Start balancing cluster
1> [2024-08-29T12:07:08,515][DEBUG][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Desired balance computation for [11] converged after [0s] and [1] iterations
1> [2024-08-29T12:07:08,515][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=11, assignments={}]. No changes
1> [2024-08-29T12:07:08,515][DEBUG][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance computation for [11] is completed, scheduling reconciliation
1> [2024-08-29T12:07:08,515][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Reconciling desired balance: DesiredBalance[lastConvergedIndex=11, assignments={}]
1> [2024-08-29T12:07:08,515][DEBUG][o.e.c.r.a.a.DesiredBalanceReconciler] [node_t0] Reconciling desired balance for [11]
1> [2024-08-29T12:07:08,515][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Reconciled desired balance: DesiredBalance[lastConvergedIndex=11, assignments={}]
1> [2024-08-29T12:07:08,515][WARN ][o.e.c.r.a.DiskThresholdMonitor] [node_t0] high disk watermark [10240b] exceeded on [q24VcceoTY-cq6vgafXpOQ][node_t2][/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2] free: 0b[0%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete
The interesting bit is
1> "q24VcceoTY-cq6vgafXpOQ" : {
1> "node_name" : "node_t2",
1> "least_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2",
1> "total_bytes" : 29975,
1> "used_bytes" : 29975,
1> "free_bytes" : 0,
1> "free_disk_percent" : 0.0,
1> "used_disk_percent" : 100.0
1> },
1> "most_available" : {
1> "path" : "/home/dianna/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_360E9317DD5AD10-019/tempDir-035/node-2",
1> "total_bytes" : 29975,
1> "used_bytes" : 29975,
1> "free_bytes" : 0,
1> "free_disk_percent" : 0.0,
1> "used_disk_percent" : 100.0
1> }
1> }
We changed total_bytes, but used_bytes matches it exactly, even though there are no shards allocated on that node.
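A minimal sketch of the mechanism I suspect (hypothetical names and numbers; I have not checked the real TestFileStore): if the test fakes total_bytes but free_bytes is derived from what is actually present on the node's data path, then free clamps to 0 and used_bytes ends up equal to the faked total even with no shards of the test index on the node.

// Hypothetical sketch, not the real TestFileStore: fake the total size of the
// file store but derive "free" from the bytes actually on the data path.
public class FakedFileStoreSketch {

    static long freeBytes(long fakedTotalBytes, long actualBytesOnDataPath) {
        // If the path already holds more data (translog, cluster state, etc.)
        // than the faked total, free clamps to 0 ...
        return Math.max(0, fakedTotalBytes - actualBytesOnDataPath);
    }

    public static void main(String[] args) {
        long fakedTotalBytes = 29_975;        // total_bytes the test configured (from the log above)
        long actualBytesOnDataPath = 40_000;  // hypothetical: whatever the node already wrote to the path
        long free = freeBytes(fakedTotalBytes, actualBytesOnDataPath);
        long used = fakedTotalBytes - free;
        // ... and used_bytes == total_bytes, i.e. 100% used with no shards allocated.
        System.out.println("free=" + free + " used=" + used); // free=0 used=29975
    }
}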
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
Build scan: https://gradle-enterprise.elastic.co/s/kxdqmnytyenuq/tests/:server:internalClusterTest/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT/testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards
Reproduction line:
Applicable branches: main
Reproduces locally?: No
Failure history: Failure dashboard for
org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT#testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards
Failure excerpt: