elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

ES shard allocation bug #110826

Closed · LuPan92 closed this issue 4 months ago

LuPan92 commented 4 months ago

Elasticsearch Version

Version: 7.17.18, Build: default/tar/8682172c2130b9a411b1bd5ff37c9792367de6b0/2024-02-02T12:04:59.691750271Z, JVM: 11.0.20

Installed Plugins

No response

Java Version

11.0.20

OS Version

Linux bsa5295 3.10.0-1160.108.1.el7.x86_64 #1 SMP Thu Jan 25 16:17:31 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

When path.data on an Elasticsearch data node contains more than 20 entries, all shards of the same index are allocated to a single path, which causes disk I/O skew when writing.

Steps to Reproduce

My test steps are as follows:

  1. elasticsearch.yml

    cluster.name: ISOP_1720490318878
    http.port: 19399
    network.host: bsa5295
    node.name: bsa5295
    bootstrap.memory_lock: false
    bootstrap.system_call_filter: false
    node.master: true
    node.data: true
    path.logs: /home/worker/elasticsearch/logs
    path.data: /home/sdf/elasticsearch/data,/home/sdg/elasticsearch/data,/home/sdh/elasticsearch/data,/home/sdi/elasticsearch/data,/home/sdb/elasticsearch/data,/home/sdc/elasticsearch/data,/home/sdd/elasticsearch/data,/home/sde/elasticsearch/data,/home/sdj/elasticsearch/data,/home/sdk/elasticsearch/data,/home/sdf/elasticsearch_1/data,/home/sdg/elasticsearch_1/data,/home/sdh/elasticsearch_1/data,/home/sdi/elasticsearch_1/data,/home/sdb/elasticsearch_1/data,/home/sdc/elasticsearch_1/data,/home/sdd/elasticsearch_1/data,/home/sde/elasticsearch_1/data,/home/sdj/elasticsearch_1/data,/home/sdk/elasticsearch_1/data,/home/sdf/elasticsearch_2/data,/home/sdg/elasticsearch_2/data
    transport.tcp.port: 9300
    gateway.expected_nodes: 1
    action.auto_create_index: .watches,.triggered_watches,.watcher-history-*,.kibana*,.security,.monitoring*
    discovery.seed_hosts: [bsa5295]
    cluster.initial_master_nodes: [bsa5295]
    thread_pool.write.queue_size: 2000
    indices.recovery.max_bytes_per_sec: 200mb
    cluster.routing.allocation.node_concurrent_recoveries: 10
    cluster.max_shards_per_node: 5000
    cluster.routing.allocation.same_shard.host: true
    cluster.routing.allocation.disk.watermark.low: 90%
    cluster.routing.allocation.disk.watermark.high: 95%
    cluster.fault_detection.follower_check.timeout: 180s
    cluster.fault_detection.follower_check.retry_count: 10
    cluster.fault_detection.follower_check.interval: 10s
    cluster.publish.timeout: 1800s
    indices.fielddata.cache.size: 10%
    indices.memory.index_buffer_size: 10%
    xpack.ml.enabled: false
    cluster.election.duration: 30s
    cluster.join.timeout: 360s
    node.processors: 80
  2. Create index my_index1

    curl -X PUT "bsa5295:19399/my_index1" -H 'Content-Type: application/json' -d'
    {
      "settings": {
        "number_of_shards": 25,
        "number_of_replicas": 0
      }
    }'
  3. View index uuid

    [worker@bsa5295 ~]$ curl bsa5295:19399/_cat/indices | grep my_index1
    green open  my_index1                                 fI4auV0lRtmxeYN8XrXf8g 25 0         0      0   5.5kb   5.5kb
  4. View the path corresponding to each shard (screenshot of per-shard data paths omitted; see the curl sketch after this list for a way to query the same information).

  5. You can see that all shards are allocated under /home/sdj/elasticsearch

  6. Expected behavior: shards of the same index should be spread across the paths configured in path.data rather than concentrated on a single disk.
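
For step 4, in place of the screenshot: a minimal sketch of how to query each shard's data path over the API, reusing the host and port from the configuration above. The path.data column of the cat shards API should be available in 7.17; if your build does not expose it, the indices stats API with level=shards reports the same value as shard_path.data_path.

    # list the on-disk data path of every shard of my_index1
    curl 'bsa5295:19399/_cat/shards/my_index1?v&h=index,shard,prirep,node,path.data'
    # alternative: per-shard paths via the indices stats API
    curl -s 'bsa5295:19399/my_index1/_stats?level=shards&pretty' | grep data_path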

Logs (if relevant)

No response

elasticsearchmachine commented 4 months ago

Pinging @elastic/es-distributed (Team:Distributed)

mhl-b commented 4 months ago

Does this answer your question?

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/important-settings.html#_multiple_data_paths

If needed, you can specify multiple paths in path.data. Elasticsearch stores the node’s data across all provided paths but keeps each shard’s data on the same path.

Elasticsearch does not balance shards across a node’s data paths. High disk usage in a single path can trigger a high disk usage watermark for the entire node. If triggered, Elasticsearch will not add shards to the node, even if the node’s other paths have available disk space. If you need additional disk space, we recommend you add a new node rather than additional data paths.
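
As a sketch of how to verify this per path (using the host and port from the reproduction above): the nodes stats API reports disk usage for each entry in path.data separately, so you can see whether a single path, rather than the node as a whole, is approaching a watermark.

    # fs.data is an array with one entry per configured data path,
    # including total, free, and available bytes for each
    curl 'bsa5295:19399/_nodes/stats/fs?pretty'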

LuPan92 commented 4 months ago

Quoting the documentation excerpt above:

If needed, you can specify multiple paths in path.data. Elasticsearch stores the node’s data across all provided paths but keeps each shard’s data on the same path. Elasticsearch does not balance shards across a node’s data paths. High disk usage in a single path can trigger a high disk usage watermark for the entire node. If triggered, Elasticsearch will not add shards to the node, even if the node’s other paths have available disk space. If you need additional disk space, we recommend you add a new node rather than additional data paths.

I checked the disk usage of each path in path.data; the high disk usage watermark we configured has not been reached on any of them. (Screenshot of per-path disk usage omitted.)

Supplement: when I reduce path.data to fewer than 20 paths, the problem disappears.
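
For reference, a quick way to count the configured entries, assuming the single-line path.data form shown in step 1:

    # prints the number of comma-separated entries in path.data
    grep '^path.data:' elasticsearch.yml | tr ',' '\n' | wc -l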

mhl-b commented 4 months ago

When path.data on an Elasticsearch data node contains more than 20 entries, all shards of the same index are allocated to a single path, which causes disk I/O skew when writing.

I'm not sure what the disk I/O skew is in your case; you may need to check your disk performance. As for all shards going to the same path, that is documented and expected behaviour. See the first paragraph of the link provided above:

If needed, you can specify multiple paths in path.data. Elasticsearch stores the node’s data across all provided paths but keeps each shard’s data on the same path.

LuPan92 commented 4 months ago

Expected behavior: shards of the same index should be spread across the configured data paths.

mhl-b commented 4 months ago

Thanks for your interest in Elasticsearch. We are closing this issue, as the multiple data paths feature is deprecated and we are not going to fix it.

LuPan92 commented 1 month ago

I have found the answer to this question and written a detailed blog post: "ES 最隐藏的 shard 分配问题" ("The most hidden shard allocation problem in ES").