canonical / mongodb-k8s-operator

Operator charm for MongoDB
Apache License 2.0

One of the mongodb-k8s nodes blocked in "(not reachable/healthy)" state #251

Closed jeffreychang911 closed 1 month ago

jeffreychang911 commented 2 months ago

Steps to reproduce

  1. Deploy 3 mongodb-k8s units and relate them to self-signed-certificates and data-integrator.
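For anyone trying to reproduce this, the step above could be scripted roughly as follows. This is only a sketch under assumptions: the issue does not give the exact commands, so the `--trust` flag, the default channels for `self-signed-certificates` and `data-integrator`, and relying on Juju to resolve the relation endpoints without naming them are guesses. The helper names `reproduction_commands` and `run` are made up for illustration. The `6/edge` channel comes from the Versions section.

```python
import shlex
import subprocess


def reproduction_commands(units: int = 3, channel: str = "6/edge") -> list[list[str]]:
    """Build the juju commands for the reproduction step.

    Assumptions (not stated in the issue): --trust is passed to mongodb-k8s,
    the other charms use their default channel, and endpoint names are left
    for Juju to infer.
    """
    return [
        ["juju", "deploy", "mongodb-k8s", "-n", str(units), "--channel", channel, "--trust"],
        ["juju", "deploy", "self-signed-certificates"],
        ["juju", "deploy", "data-integrator"],
        ["juju", "integrate", "mongodb-k8s", "self-signed-certificates"],
        ["juju", "integrate", "mongodb-k8s", "data-integrator"],
    ]


def run(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first failing juju command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(reproduction_commands())
```

With `dry_run=True` (the default) nothing is executed; the commands are only printed so they can be reviewed or adjusted first.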

Expected behavior

The mongodb-k8s charm should settle within a few minutes.

Actual behavior

We waited 1 hour, and one unit was still blocked in the "(not reachable/healthy)" state.

Versions

Operating system:

Juju CLI: 3.5.2

Juju agent: 3.5.2

Charm revision: mongodb-k8s rev 41 from 6/edge channel.

microk8s: charmed k8s 1.28 on AWS

Log output

Juju debug log:

unit-mongodb-k8s-0: 2024-07-10 09:12:32 ERROR unit.mongodb-k8s/0.juju-log certificates:1: Non-existing secret unit:cert-secret was attempted to be removed.
unit-mongodb-k8s-0: 2024-07-10 09:12:32 INFO unit.mongodb-k8s/0.juju-log certificates:1: Certificate request sent to provider
unit-mongodb-k8s-0: 2024-07-10 09:12:32 INFO juju.worker.uniter.operation ran "certificates-relation-joined" hook (via hook dispatching script: dispatch)
unit-mongodb-k8s-1: 2024-07-10 09:12:33 ERROR unit.mongodb-k8s/1.juju-log certificates:1: Non-existing secret unit:cert-secret was attempted to be removed.
unit-mongodb-k8s-1: 2024-07-10 09:12:33 INFO unit.mongodb-k8s/1.juju-log certificates:1: Certificate request sent to provider
unit-mongodb-k8s-0: 2024-07-10 09:12:33 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
unit-self-signed-certificates-0: 2024-07-10 09:12:33 INFO unit.self-signed-certificates/0.juju-log certificates:1: Generated certificate for relation 1
unit-mongodb-k8s-1: 2024-07-10 09:12:33 INFO juju.worker.uniter.operation ran "certificates-relation-joined" hook (via hook dispatching script: dispatch)
unit-mongodb-k8s-2: 2024-07-10 09:12:33 ERROR unit.mongodb-k8s/2.juju-log database-peers:0: Failed to create the operator user: non-zero exit code 1 executing ['/usr/bin/mongosh', 'mongodb://localhost/admin', '--quiet', '--eval', "db.createUser({  user: 'operator',  pwd: passwordPrompt(),  roles:[    {'role': 'userAdminAnyDatabase', 'db': 'admin'},     {'role': 'readWriteAnyDatabase', 'db': 'admin'},     {'role': 'clusterAdmin', 'db': 'admin'},   ],  mechanisms: ['SCRAM-SHA-256'],  passwordDigestor: 'server',})"], stdout='Enter password\n********************************', stderr='MongoServerError: not primary\n'
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-mongodb-k8s-2/charm/./src/charm.py", line 709, in _init_operator_user
    process.wait_output()
  File "/var/lib/juju/agents/unit-mongodb-k8s-2/charm/venv/ops/pebble.py", line 1571, in wait_output
    raise ExecError[AnyStr](self._command, exit_code, out_value, err_value)
ops.pebble.ExecError: non-zero exit code 1 executing ['/usr/bin/mongosh', 'mongodb://localhost/admin', '--quiet', '--eval', "db.createUser({  user: 'operator',  pwd: passwordPrompt(),  roles:[    {'role': 'userAdminAnyDatabase', 'db': 'admin'},     {'role': 'readWriteAnyDatabase', 'db': 'admin'},     {'role': 'clusterAdmin', 'db': 'admin'},   ],  mechanisms: ['SCRAM-SHA-256'],  passwordDigestor: 'server',})"], stdout='Enter password\n********************************', stderr='MongoServerError: not primary\n'
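When triaging a longer debug log or crashdump, it may help to pull out just the unit name and the stderr of each failed exec. The snippet below is a minimal sketch that assumes the log lines look like the ones above; `exec_failures` and the two regular expressions are hypothetical helpers, not part of the charm or of Juju.

```python
import re

# Matches the "unit-<name>: " prefix that starts each juju debug-log line.
UNIT_RE = re.compile(r"^unit-(?P<unit>[\w-]+?): ")
# Matches the stderr='...' field printed as part of the ExecError message.
STDERR_RE = re.compile(r"stderr='(?P<stderr>[^']*)'")


def exec_failures(log_text: str) -> list[tuple[str, str]]:
    """Return (unit, stderr) for each ERROR line that reports a failed exec."""
    results = []
    for line in log_text.splitlines():
        # Only juju-log ERROR lines carry the unit prefix; traceback lines are skipped.
        if " ERROR " not in line or "non-zero exit code" not in line:
            continue
        unit = UNIT_RE.match(line)
        stderr = STDERR_RE.search(line)
        if unit and stderr:
            # The log prints newlines as a literal backslash followed by "n".
            message = stderr.group("stderr").replace("\\n", "\n").strip()
            results.append((unit.group("unit"), message))
    return results
```

Run against the log above, this would report the `mongosh` failure on `mongodb-k8s-2` with the message `MongoServerError: not primary`, which is the only exec failure shown in this excerpt.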

Additional context

This is from a SolQA run: https://solutions.qa.canonical.com/testruns/4cbee25f-c1e0-445b-af74-3164cab6e27f

Juju crashdump: https://oil-jenkins.canonical.com/artifacts/4cbee25f-c1e0-445b-af74-3164cab6e27f/generated/generated/mongodb-k8s/crashdump-2024-07-10-10.12.12.tar.gz

github-actions[bot] commented 2 months ago

https://warthogs.atlassian.net/browse/DPE-4864