Describe the bug
The Elasticsearch Exporter creates some of its indexes with three shards and many with only one shard.
This leads to uneven CPU utilization on multi-node ES clusters.
To Reproduce
Install Zeebe & Elasticsearch with the Helm chart.
Expected behavior
All indexes have the same number of shards, and the default value is correctly documented in the configuration file template: https://github.com/camunda/zeebe/blob/c4b3a8745718bfd58959db6c684e0060a4baa455/dist/src/main/config/broker.yaml.template#L665

I'm unsure about the default value:
- 1 would make sense for basic single-node ES setups, but it cannot be changed easily if customers add more ES nodes later.
- 2 would match the number of nodes created by the Helm chart.
- 3 may be safer for customers who are not aware of this detail, so they can fully utilize their typical 3-node ES clusters, but I don't know how much overhead it creates on a single- or dual-node ES cluster.

Getting this wrong in the beginning may cause difficult migrations later. Therefore, this should be documented much better, and/or the Helm chart should proactively set the environment variable `ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_INDEX_NUMBEROFSHARDS` to match the number of Elasticsearch replicas.
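For reference, one way the chart could set it today, sketched as a plain values override. This is an untested sketch: the `zeebe.env` key, the release name, and the chart name are assumptions, not verified chart API.

```sh
# Hypothetical: inject the shard-count env var through the chart's extra-env
# list (key name `zeebe.env` is an assumption) so that new indices match a
# 3-node ES cluster. Release and chart names are placeholders.
helm upgrade zeebe camunda/camunda-platform \
  --set 'zeebe.env[0].name=ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_INDEX_NUMBEROFSHARDS' \
  --set-string 'zeebe.env[0].value=3'
```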
Log/Stacktrace
However, Elasticsearch shows that some indexes are created with 3 shards and many with only 1 shard, as shown by the `pri` column in this table:
Full Stacktrace
```sh
$ curl --location 'http://127.0.0.1:9200/_cat/indices/zeebe*?v=true&s=index&pretty'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open zeebe-record_command-distribution_8.4.6_2024-05-02 gXkrrBVXQyu6KBd2eGw6OQ 1 0 504 0 128.7kb 128.7kb
green open zeebe-record_deployment_8.4.6_2024-05-02 c8JaLs4xQoWX0AwnY06o4w 1 0 252 15 93.7kb 93.7kb
green open zeebe-record_incident_8.4.6_2024-05-02 qNMm-0THStGynaXAsuSMvg 1 0 188129 0 33.4mb 33.4mb
green open zeebe-record_job_8.4.6_2024-05-02 kOAHq69ATGC_bhjLkVt9jA 3 0 1064676 0 178.2mb 178.2mb
green open zeebe-record_message-start-event-subscription_8.4.6_2024-05-02 0t2FxezQSOODZfk5BhkTfw 1 0 123367 1491 21mb 21mb
green open zeebe-record_message_8.4.6_2024-05-02 6NRoNIDTSTqL-IuMges4cg 1 0 125918 0 18.8mb 18.8mb
green open zeebe-record_process-instance_8.4.6_2024-05-02 8IkFn6y6R6GyWUpktsJYvQ 3 0 9628052 268819 1.5gb 1.5gb
green open zeebe-record_process_8.4.6_2024-05-02 fmw8mu7CQGOqCPsvk4Cxkg 1 0 252 29 989.4kb 989.4kb
green open zeebe-record_variable_8.4.6_2024-05-02 Pw86QbpQRmm94ljLq7QZEA 1 0 7753027 116033 1gb 1gb
```
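The mismatch is easy to spot mechanically by filtering the `_cat/indices` output. A minimal sketch, where the here-doc stands in for real cluster output and the expected count of 3 is an example value:

```sh
# Flag every index whose primary shard count (column 5, `pri`) differs from
# an expected value of 3. Feed it real output from:
#   curl -s 'http://127.0.0.1:9200/_cat/indices/zeebe*?v=true&s=index'
awk 'NR > 1 && $5 != 3 { print $3, "has", $5, "primary shards, expected 3" }' <<'EOF'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open zeebe-record_deployment_8.4.6_2024-05-02 c8Ja 1 0 252 15 93.7kb 93.7kb
green open zeebe-record_job_8.4.6_2024-05-02 kOAH 3 0 1064676 0 178.2mb 178.2mb
EOF
```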
Environment:
OS: GKE
Zeebe Version: 8.4.6
Configuration: Helm Chart 9.3.3
#1561 has set `number_of_shards` in several index templates to 1, while others may have already had the value 3.
Setting it explicitly through the undocumented environment variable `ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_INDEX_NUMBEROFSHARDS` does work:
```sh
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open zeebe-record_command-distribution_8.4.6_2024-05-02 -qfmOofCTourrMDZjLFpmw 3 0 648 0 146.4kb 146.4kb
green open zeebe-record_deployment_8.4.6_2024-05-02 6DrQDw6LQqWFYw1xJRjt0g 3 0 319 0 366.4kb 366.4kb
green open zeebe-record_incident_8.4.6_2024-05-02 ABF2piFuTtumHQEjlMD8Cw 3 0 496 0 707.1kb 707.1kb
green open zeebe-record_job_8.4.6_2024-05-02 IroIBBZpRLunJyX3IeE_ww 3 0 78951 0 23.2mb 23.2mb
green open zeebe-record_message-start-event-subscription_8.4.6_2024-05-02 rt0d40zNTfyekrriQj6tkQ 3 0 1933 0 1mb 1mb
green open zeebe-record_message_8.4.6_2024-05-02 5Nvi8yRUQ16Bqm7y618jTA 3 0 5994 0 1.5mb 1.5mb
green open zeebe-record_process-instance_8.4.6_2024-05-02 Ifda3NJQSAWal2fRoMP1Og 3 0 684610 0 130.8mb 130.8mb
green open zeebe-record_process_8.4.6_2024-05-02 VkC-ZM-xSHS5-nQ_Ma2Y4g 3 0 288 0 1.2mb 1.2mb
green open zeebe-record_variable_8.4.6_2024-05-02 Z_6upQgwTJiwETHMOihAFw 3 0 555300 0 96.1mb 96.1mb
```
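To pin down where each index gets its shard count, the exporter's index templates can be inspected directly. A sketch: the `zeebe-record*` template name pattern, the composable-template endpoint, and the local address are all assumptions about this setup.

```sh
# Show each zeebe-record index template's name and settings block only;
# `filter_path` is a standard Elasticsearch response filter.
curl -s 'http://127.0.0.1:9200/_index_template/zeebe-record*?filter_path=index_templates.name,index_templates.index_template.template.settings' \
  | python3 -m json.tool
```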
There is no need for all shard counts to be the same for all indices; however, if the config indicates the default is 3, it should be applied => considering this a bug.
While there, we should also check the behavior for replicas.
Backporting may change behavior for old versions => new indices will use more or fewer shards if the user didn't override the settings themselves => this should be mentioned in the update guide to 8.6.