Rajaeelfarsi opened 6 months ago
Hi @lukas-vlcek,
Thank you for your response. I really appreciate your feedback and guidance.
I wanted to mention that I am relatively new to OpenSearch and to development in general, so I may not have a deep understanding of all the intricacies yet. However, I am eager to learn and improve.
Regarding the corrections I plan to make, I based my implementation on the `GET /_cat/shards` API, which provides the different states (types) of a shard:
(Default) State of the shard. Returned values are:
**INITIALIZING**: The shard is recovering from a peer shard or gateway.
**RELOCATING**: The shard is relocating.
**STARTED**: The shard has started.
**UNASSIGNED**: The shard is not assigned to any node.
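As a hedged illustration of the per-node counting idea (the sample rows below are made up, not from a real cluster), the `_cat/shards` output could be grouped by node and state like this:

```shell
# Hypothetical sample of `GET /_cat/shards?h=index,shard,prirep,state,node`
# output; against a live cluster you would fetch it with:
#   curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,node'
cat <<'EOF' > /tmp/shards.txt
logs-2024 0 p STARTED      node-1
logs-2024 0 r INITIALIZING node-2
logs-2024 1 p RELOCATING   node-1
logs-2024 1 r UNASSIGNED
EOF
# Count shards per node and state; UNASSIGNED rows have no node column,
# so they end up grouped under an empty node name.
awk '{print $5, $4}' /tmp/shards.txt | sort | uniq -c
```

Note how the UNASSIGNED row naturally falls outside any node bucket, which is exactly the point being discussed below.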
I will follow your example for the cluster and implement the same approach for each node.
However, I have a question regarding the inclusion of the "unassigned" state. As far as I understand, a shard in the "unassigned" state is not assigned to any node, so unassigned shards only exist at the cluster level. Therefore, I am unsure whether the "unassigned" state needs to be included for each node, since nodes do not typically contain shards in the "unassigned" state.
If you could provide further clarification on why I should include the "unassigned" state for each node, I would greatly appreciate it.
Thank you once again for your support.
Best regards, Rajae
Would you consider also exposing the currently configured `cluster.max_shards_per_node`? This would allow for a dead-simple alert when you're nearing the limit, and it would not require an update to your monitoring if you change `max_shards_per_node`.
Other than parsing the metric when scraping, it should be almost free to store in any vector database as it's essentially a static number.
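To make the alerting idea concrete, here is a minimal sketch with made-up numbers (1000 as an assumed `cluster.max_shards_per_node` value, 3 nodes, 2700 shards) showing how exposing the limit lets the threshold be computed rather than hard-coded:

```shell
# Assumed values for illustration only; in practice these would come from
# the exporter's metrics rather than being hard-coded here.
max_per_node=1000   # configured cluster.max_shards_per_node (assumed)
nodes=3             # number of data nodes (assumed)
shards=2700         # current shard count across the cluster (assumed)
# Percentage of the cluster-wide shard budget currently in use:
awk -v s="$shards" -v m="$max_per_node" -v n="$nodes" \
    'BEGIN { printf "%.0f%%\n", 100 * s / (m * n) }'
# → 90%
```

An alert on this ratio keeps working unchanged if the operator later raises `max_shards_per_node`, which is the point made above.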
@Baarsgaard Makes sense!
> nodes do not typically contain shards with the "unassigned" state.
This is the exact reason I would include unassigned shards as well, simply because it's atypical and I would therefore like it exposed.
Any status on this? I would love to have per-node shard monitoring/tracking as I move away from Elasticsearch.
Okay, let's make some progress with this over the next week. 💪
Any movement on this? It is a key monitoring feature we are currently missing.
Description
https://github.com/Aiven-Open/prometheus-exporter-plugin-for-opensearch/issues/189
Following the issue opened by arob1n, I made some modifications to the source code to add a metric that gives the number of shards per node. I've named the metric `nodes_shards_number`.
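As a sketch only, the new metric might look something like the lines below in the exporter's Prometheus output. The `opensearch_` prefix and the label names are assumptions for illustration, not confirmed by this PR:

```shell
# Fabricated example of Prometheus exposition-format lines; metric prefix
# and labels are hypothetical. A real scrape would come from the exporter's
# metrics endpoint instead of a local file.
cat <<'EOF' > /tmp/metrics.txt
opensearch_nodes_shards_number{node="node-1",state="STARTED"} 42
opensearch_nodes_shards_number{node="node-1",state="RELOCATING"} 1
opensearch_nodes_shards_number{node="node-2",state="STARTED"} 40
EOF
# Filter to a single node, as a per-node dashboard or alert would:
grep 'node="node-1"' /tmp/metrics.txt
```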
DCO stands for Developer Certificate of Origin, and it is your declaration that your contribution is correctly attributed and licensed. Please read more about how to attach DCO to your commits here (spoiler alert: in most cases it is as simple as using the `-s` option when doing `git commit`). Please be aware that commits without DCO will cause the PR CI workflow to fail and cannot be merged.
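For reference, a minimal demonstration of the sign-off (the repository path and identity below are placeholders):

```shell
# Create a throwaway repo just to show what `git commit -s` produces;
# the path and user identity are placeholders.
mkdir -p /tmp/dco-demo && cd /tmp/dco-demo
git init -q
git config user.name "Jane Dev"
git config user.email "jane@example.com"
echo "change" > file.txt
git add file.txt
# -s appends a "Signed-off-by: Jane Dev <jane@example.com>" trailer to the
# commit message, which is what the DCO check on the PR looks for.
git commit -q -s -m "Add file"
git log -1 --format=%B
```

Existing commits that are missing the trailer can be fixed up with `git commit --amend -s` (or an interactive rebase for older commits) and force-pushed to the PR branch.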