canonical / opensearch-operator

OpenSearch operator
Apache License 2.0

hook "leader-elected" fails when adding a unit after scale down to zero units #306

Closed: reneradoi closed this issue 3 weeks ago

reneradoi commented 1 month ago

Steps to reproduce

juju add-model opensearch
# apply the kernel parameters required for opensearch
juju model-config --file ./cloudinit-userdata.yaml
juju create-storage-pool opensearch-storage lxd volume-type=standard
juju deploy opensearch -n 2 --channel 2/edge --storage opensearch-data=opensearch-storage,1G,1
juju deploy self-signed-certificates
juju config self-signed-certificates ca-common-name="CN_CA"
juju relate self-signed-certificates opensearch
juju remove-unit opensearch/1
juju remove-unit opensearch/0
juju add-unit opensearch --attach-storage=opensearch-data/0

Expected behavior

The newly added unit should start up without error.

Actual behavior

$ juju status --storage
Model  Controller  Cloud/Region         Version  SLA          Timestamp
dev    opensearch  localhost/localhost  3.1.8    unsupported  06:52:18Z

App                       Version  Status  Scale  Charm                     Channel  Rev  Exposed  Message
opensearch                         active      1  opensearch                           1  no       
self-signed-certificates           active      1  self-signed-certificates  stable    72  no       

Unit                         Workload  Agent  Machine  Public address  Ports  Message
opensearch/2*                error     idle   5        10.27.170.244          hook failed: "leader-elected"
self-signed-certificates/0*  active    idle   2        10.27.170.141          

Machine  State    Address        Inst id        Base          AZ  Message
2        started  10.27.170.141  juju-622e8b-2  ubuntu@22.04      Running
5        started  10.27.170.244  juju-622e8b-5  ubuntu@22.04      Running

Storage Unit  Storage ID         Type        Pool                Mountpoint                   Size     Status    Message
              opensearch-data/1  filesystem  opensearch-storage                               1.0 GiB  detached  
opensearch/2  opensearch-data/0  filesystem  opensearch-storage  /var/snap/opensearch/common  1.0 GiB  attached  

Versions

Operating system: Ubuntu 24.04 LTS, Ubuntu 22.04 LTS
Juju CLI: 3.1.8-genericlinux-amd64
Juju agent: 3.1.8
Charm revision: 47
LXD: 5.21.1 LTS

Log output

unit-opensearch-2: 06:53:05 ERROR unit.opensearch/2.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-opensearch-2/charm/./src/charm.py", line 267, in <module>
    main(OpenSearchOperatorCharm)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 544, in main
    manager.run()
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 520, in run
    self._emit()
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 509, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 143, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/framework.py", line 352, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/framework.py", line 851, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/framework.py", line 941, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_base_charm.py", line 302, in _on_leader_elected
    self._put_or_update_internal_user_leader(user)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_base_charm.py", line 1244, in _put_or_update_internal_user_leader
    self.user_manager.update_user_password(user, hashed_pwd)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_users.py", line 268, in update_user_password
    resp = self.opensearch.request(
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_distro.py", line 266, in request
    raise OpenSearchHttpError(
charms.opensearch.v0.opensearch_exceptions.OpenSearchHttpError: HTTP error self.response_code=None
self.response_text='Host 10.27.170.244:9200 and alternative_hosts: [] not reachable.'
unit-opensearch-4: 06:53:06 ERROR juju.worker.uniter.operation hook "leader-elected" (via hook dispatching script: dispatch) failed: exit status 1
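
The last error in the traceback says that nothing answered on 10.27.170.244:9200 when the hook tried to call the REST API. One way to confirm that from the unit is a plain TCP probe. The sketch below is illustrative only and is not part of the charm; the helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_port_open("10.27.170.244", 9200)` with the address from the error message would be expected to return False while the OpenSearch service on the new unit has not started yet.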

Additional context

I assume the issue is with security_index_initialised, which is no longer present in the peer relation data:

$ jhack show-relation opensearch:opensearch-peers opensearch:opensearch-peers
                                                                                             relation data v0.6                                                                                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ peer relation (id: 2) ┃ opensearch                                                                                                                                                                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ type                  │ peer                                                                                                                                                                             │
│ interface             │ opensearch_peers                                                                                                                                                                 │
│ model                 │ the current model                                                                                                                                                                │
│ relation ID           │ 2                                                                                                                                                                                │
│ endpoint              │ opensearch-peers                                                                                                                                                                 │
│ leader unit           │ 2                                                                                                                                                                                │
├───────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ application data      │ ╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │
│                       │ │                                                                                                                                                                              │ │
│                       │ │  admin_user_initialized                     True                                                                                                                             │ │
│                       │ │  allocation-exclusions-to-delete            ,opensearch-2                                                                                                                    │ │
│                       │ │  delete-voting-exclusions                   True                                                                                                                             │ │
│                       │ │  deployment-description                     {"config": {"cluster_name": "opensearch-attz", "init_hold": false, "roles": [], "data_temperature": null}, "start":              │ │
│                       │ │                                             "start-with-generated-roles", "pending_directives": [], "typ": "main-orchestrator", "app": "opensearch", "state": {"value":      │ │
│                       │ │                                             "active", "message": ""}, "promotion_time": 1716446675.797672}                                                                   │ │
│                       │ │  opensearch:app:admin-password              secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7ebls8c16j9paghi7g                                                               │ │
│                       │ │  opensearch:app:admin-password-hash         secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7ebls8c16j9paghi80                                                               │ │
│                       │ │  opensearch:app:app-admin                   secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7eblc8c16j9paghi50                                                               │ │
│                       │ │  opensearch:app:kibanaserver-password       secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7eblk8c16j9paghi6g                                                               │ │
│                       │ │  opensearch:app:kibanaserver-password-hash  secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7eblk8c16j9paghi70                                                               │ │
│                       │ │  opensearch:app:monitor-password            secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7ec248c16j9paghib0                                                               │ │
│                       │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ unit data             │ ╭─ opensearch/opensearch/2 ──────────────────────────────────────────────────────────────────────────────╮                                                                       │
│                       │ │                                                                                                        │                                                                       │
│                       │ │  opensearch:unit:2:unit-http       secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7eevc8c16j9paghic0  │                                                                       │
│                       │ │  opensearch:unit:2:unit-transport  secret://d95bf0dc-53cc-4a8c-8f9e-538bd7622e8b/cp7eevc8c16j9paghibg  │                                                                       │
│                       │ ╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯                                                                       │
└───────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

This is where an adjustment might be necessary: https://github.com/canonical/opensearch-operator/blob/main/lib/charms/opensearch/v0/opensearch_base_charm.py#L271
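
To illustrate the kind of check meant here, the following is a minimal sketch of a guard that a leader-elected handler could consult before calling the REST API. It is not the charm's actual code and not necessarily what the eventual fix does. The function name, the `is_node_up` callback, and the assumption that the flag is stored as the string "True" are all hypothetical; only the key name security_index_initialised comes from this report.

```python
from typing import Callable, Mapping

# Key name as written in this issue; the stored value format is an assumption.
SECURITY_INDEX_FLAG = "security_index_initialised"


def can_update_user_via_api(
    app_peer_data: Mapping[str, str],
    is_node_up: Callable[[], bool],
) -> bool:
    """Hypothetical guard: allow REST calls only if the flag is set and the node responds."""
    flag_set = app_peer_data.get(SECURITY_INDEX_FLAG) == "True"
    # Short-circuit: the node is not probed at all when the flag is missing.
    return flag_set and is_node_up()
```

With the application data shown above, which lacks the flag, the guard returns False, so a handler using it could defer the user update instead of failing the hook.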

github-actions[bot] commented 1 month ago

https://warthogs.atlassian.net/browse/DPE-4415

reneradoi commented 3 weeks ago

Resolved with https://github.com/canonical/opensearch-operator/pull/272