Open Barteus opened 2 weeks ago
Thank you for reporting your feedback!
The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-5302.
This message was autogenerated
The issue disappears when you remove data-integrator
Hello @Barteus, thank you for reporting the issue.
The debug-log shows that a shard has not been assigned; I assume it is the one created by data-integrator (test-index). The OpenSearch cluster is therefore in yellow state and not healthy.
I will try to reproduce this and investigate why the shard stays unassigned. If you have more logs, e.g. the OpenSearch server logs from your juju unit (found in /var/snap/opensearch/common/var/log/opensearch/your_cluster_name.log), please provide them as well.
@Barteus After investigating, I found that the cause of this behaviour is that only one OpenSearch node was deployed.
The test-index primary shard is assigned to this node, but the replica shard is not, because a replica cannot be allocated to the same node as its primary. Therefore the cluster is not in a healthy state:
ubuntu@ip-172-31-19-185:~/temp/opensearch-operator$ curl -k https://admin:[xxx]@10.1.41.133:9200/_cluster/health
{
"cluster_name": "opensearch-sjf4",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"discovered_master": true,
"discovered_cluster_manager": true,
"active_primary_shards": 5,
"active_shards": 5,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 1,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 83.33333333333334
}
ubuntu@ip-172-31-19-185:~/temp/opensearch-operator$ curl -k https://admin:[xxx]@10.1.41.133:9200/_cat/shards
test-index 0 p STARTED 0 208b 10.1.41.133 opensearch-0.db0
test-index 0 r UNASSIGNED
[...]
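The check above can also be scripted. The following is a minimal sketch, not part of the charm: the function names `unassigned_shards` and `replicas_cannot_be_placed` are hypothetical, and the parser assumes the default `_cat/shards` column order shown above (index, shard, prirep, state, ...). It flags the case where a yellow status is explained purely by replica shards on a cluster with a single data node.

```python
import json


def unassigned_shards(cat_shards_output):
    """Return (index, shard, role) for every UNASSIGNED row of _cat/shards text."""
    result = []
    for line in cat_shards_output.splitlines():
        fields = line.split()
        # Assumed default columns: index shard prirep state [docs store ip node]
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            role = "primary" if fields[2] == "p" else "replica"
            result.append((fields[0], int(fields[1]), role))
    return result


def replicas_cannot_be_placed(health, shards):
    """True when yellow status is explained by replicas on a one-data-node cluster."""
    only_replicas = bool(shards) and all(role == "replica" for _, _, role in shards)
    return (
        health["status"] == "yellow"
        and health["number_of_data_nodes"] == 1
        and only_replicas
    )


# Values copied from the output above (health response trimmed to the fields used).
health = json.loads(
    '{"status": "yellow", "number_of_data_nodes": 1,'
    ' "active_shards": 5, "unassigned_shards": 1}'
)
shards = unassigned_shards(
    "test-index 0 p STARTED 0 208b 10.1.41.133 opensearch-0.db0\n"
    "test-index 0 r UNASSIGNED\n"
)
print(shards)
print(replicas_cannot_be_placed(health, shards))
```

With the values from this report, the only unassigned shard is the test-index replica. This also accounts for the reported `active_shards_percent_as_number`: 5 active shards out of 6 in total.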
This can be solved by adding another node to the deployment. After settling, the unassigned replica shard will be assigned to this node, and the cluster will be healthy:
ubuntu@ip-172-31-19-185:~/temp/opensearch-operator$ juju add-unit opensearch
ubuntu@ip-172-31-19-185:~/temp/opensearch-operator$ curl -k https://admin:[xxx]@10.1.41.133:9200/_cat/shards
test-index 0 p STARTED 0 208b 10.1.41.133 opensearch-0.db0
test-index 0 r STARTED 0 208b 10.1.41.119 opensearch-1.db0
[...]
ubuntu@ip-172-31-19-185:~/temp/opensearch-operator$ juju status
Model       Controller      Cloud/Region         Version  SLA          Timestamp
opensearch  dev-controller  localhost/localhost  3.1.9    unsupported  09:28:23Z

App                       Version  Status  Scale  Charm                     Channel  Rev  Exposed  Message
data-integrator                    active      1  data-integrator           edge      43  no
opensearch                         active      2  opensearch                2/beta   117  no
self-signed-certificates           active      1  self-signed-certificates  stable   155  no

Unit                         Workload  Agent  Machine  Public address  Ports     Message
data-integrator/0*           active    idle   2        10.1.41.25
opensearch/0*                active    idle   0        10.1.41.133     9200/tcp
opensearch/1                 active    idle   3        10.1.41.119     9200/tcp
self-signed-certificates/0*  active    idle   1        10.1.41.219
More information on this can be found here: https://charmhub.io/opensearch/docs/t-horizontal-scaling
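To avoid polling by hand while the new unit settles, the `_cluster/health` endpoint accepts `wait_for_status` and `timeout` query parameters. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the host and credentials must be supplied by you, and certificate verification is disabled to mirror the `curl -k` calls above. It also assumes the server answers a wait timeout with an HTTP error that still carries the health JSON in its body, so the error body is parsed rather than discarded.

```python
import base64
import json
import ssl
import urllib.error
import urllib.request
from urllib.parse import urlencode


def health_url(host, port=9200, wait_for="green", timeout="60s"):
    """Build a _cluster/health URL that blocks until the status is reached."""
    query = urlencode({"wait_for_status": wait_for, "timeout": timeout})
    return f"https://{host}:{port}/_cluster/health?{query}"


def fetch_health(url, user, password):
    """Fetch cluster health; not called here because it needs a live cluster."""
    # Mirrors `curl -k`: skip certificate verification (self-signed certificates).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    try:
        with urllib.request.urlopen(request, context=context) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        # Assumption: a wait timeout is reported as an HTTP error with a JSON body.
        return json.load(err)


def is_settled(health):
    """True once the cluster is green with no unassigned shards."""
    return (
        health["status"] == "green"
        and not health["timed_out"]
        and health["unassigned_shards"] == 0
    )


print(health_url("10.1.41.133"))
```

After `juju add-unit opensearch`, calling `is_settled(fetch_health(health_url(host), user, password))` with your own host and credentials should return `True` once the replica has been assigned to the new node.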
Please let us know if this works for you, then we can close this issue.
Steps to reproduce
Expected behavior
Opensearch is working.
Actual behavior
Additionally, I get timeouts on all operations.
Versions
Operating system: Ubuntu 22.04
Juju CLI: 3.5.3
Juju agent: 3.5.3
Charm revision: both 2/beta & 2/edge
LXD: nope - using AWS
Log output
Juju debug log: log.txt