canonical / mongodb-operator

Charmed solution for MongoDB
Apache License 2.0

Juju secrets updated at a fast pace #470

Closed kot0dama closed 2 months ago

kot0dama commented 2 months ago

Steps to reproduce

  1. Deploy mongodb charm channel=6/edge, revision=173
  2. Let it run for a while (see following comments by the team responsible for running the application itself)

Expected behavior

Juju secrets are not updated too frequently.

Actual behavior

Juju secrets are updated too eagerly, resulting in a high number of revisions per secret. Per the Juju team, a revision count of around 5k is a sign that the charm is updating the secret too eagerly, probably on every single event. This causes the Juju controller's underlying mongod server to consume excessive CPU, as it is overloaded with secret revisions to check.

The Juju controller's Grafana dashboard shows a steady increase in query operations and in the secretRevisions collection read rate since around the time this charm was deployed in the stg-github-runner-mq Juju model.
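To illustrate the failure mode described above, here is a minimal sketch of a "write only when changed" guard. It is not the charm's or the data platform libs' actual code: `FakeSecret` and `update_if_changed` are hypothetical names, and the sketch assumes that every write to a secret creates a new revision, which is what the revision counts in this report suggest.

```python
from typing import Dict


class FakeSecret:
    """Hypothetical stand-in for a Juju secret, used only for illustration."""

    def __init__(self, content: Dict[str, str]) -> None:
        self._content = dict(content)
        self.revision = 1

    def get_content(self) -> Dict[str, str]:
        return dict(self._content)

    def set_content(self, content: Dict[str, str]) -> None:
        # Assumption: every write bumps the revision, even if nothing changed.
        self._content = dict(content)
        self.revision += 1


def update_if_changed(secret: FakeSecret, new_content: Dict[str, str]) -> bool:
    """Write only when the content differs; return True if a write happened."""
    if secret.get_content() == new_content:
        return False
    secret.set_content(new_content)
    return True
```

With a guard like this, an event handler that runs on every hook would leave the revision count untouched unless the credentials really changed.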

Versions

Operating system: Ubuntu 22.04.4 LTS
Juju CLI: 3.4.5-ubuntu-amd64
Juju agent: 3.4.5
Charm revision: 173
LXD: -

Log output

Juju secrets details (see the revision count):

$ juju secrets -m stg-github-runner-mq --format yaml
cqf4hnipb3do3hjt3qa0:
  revision: 2
  owner: mongodb
  label: mongodb.app
  created: 2024-07-22T11:56:14Z
  updated: 2024-07-22T11:56:29Z
cqospoq3t07lr7h9ctv0:
  revision: 6187
  owner: mongodb
  label: database.5.user.secret
  created: 2024-08-06T07:12:38Z
  updated: 2024-08-29T00:51:22Z
cqqrm6bvnated3p58bbg:
  revision: 5326
  owner: mongodb
  label: database.6.user.secret
  created: 2024-08-09T06:45:48Z
  updated: 2024-08-29T00:53:26Z

Juju debug log (excerpt):

unit-mongodb-1: 02:10:47 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-mongodb-1: 02:11:01 DEBUG unit.mongodb/1.juju-log ops 2.12.0 up and running.
unit-mongodb-1: 02:11:01 DEBUG unit.mongodb/1.juju-log Emitting Juju event secret_changed.
unit-mongodb-1: 02:11:01 DEBUG unit.mongodb/1.juju-log Secret secret:cqospoq3t07lr7h9ctv0 changed, but it's unknown
unit-mongodb-1: 02:11:01 INFO juju.worker.uniter.operation ran "secret-changed" hook (via hook dispatching script: dispatch)
unit-mongodb-1: 02:11:04 DEBUG unit.mongodb/1.juju-log ops 2.12.0 up and running.
unit-mongodb-1: 02:11:04 DEBUG unit.mongodb/1.juju-log Emitting Juju event secret_changed.
unit-mongodb-1: 02:11:04 DEBUG unit.mongodb/1.juju-log Secret secret:cqqrm6bvnated3p58bbg changed, but it's unknown
unit-mongodb-1: 02:11:04 INFO juju.worker.uniter.operation ran "secret-changed" hook (via hook dispatching script: dispatch)
unit-mongodb-1: 02:11:06 DEBUG unit.mongodb/1.juju-log ops 2.12.0 up and running.
unit-mongodb-1: 02:11:06 DEBUG unit.mongodb/1.juju-log Emitting Juju event secret_remove.
unit-mongodb-1: 02:11:06 ERROR unit.mongodb/1.juju-log _on_secret_remove: Secret secret:cqqrm6bvnated3p58bbg seems to have no observers, could be removed
unit-mongodb-1: 02:11:06 INFO juju.worker.uniter.operation ran "secret-remove" hook (via hook dispatching script: dispatch)
unit-mongodb-1: 02:11:09 DEBUG unit.mongodb/1.juju-log ops 2.12.0 up and running.
unit-mongodb-1: 02:11:09 DEBUG unit.mongodb/1.juju-log Emitting Juju event secret_remove.
unit-mongodb-1: 02:11:09 ERROR unit.mongodb/1.juju-log _on_secret_remove: Secret secret:cqqrm6bvnated3p58bbg seems to have no observers, could be removed
unit-mongodb-1: 02:11:09 INFO juju.worker.uniter.operation ran "secret-remove" hook (via hook dispatching script: dispatch)
unit-mongodb-1: 02:11:11 DEBUG unit.mongodb/1.juju-log ops 2.12.0 up and running.
unit-mongodb-1: 02:11:11 DEBUG unit.mongodb/1.juju-log Emitting Juju event secret_remove.
unit-mongodb-1: 02:11:11 ERROR unit.mongodb/1.juju-log _on_secret_remove: Secret secret:cqospoq3t07lr7h9ctv0 seems to have no observers, could be removed
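
To get a feel for how often secret hooks fire, the debug log can be tallied by event name. This is only a sketch: `count_events` is a hypothetical helper, and it relies on the "Emitting Juju event ..." wording seen in the excerpt above, which may differ in other ops versions.

```python
import re
from collections import Counter

# Matches lines such as "... juju-log Emitting Juju event secret_changed."
EVENT_RE = re.compile(r"Emitting Juju event (\w+)\.")


def count_events(log_text: str) -> Counter:
    """Count emitted Juju events by name in a debug-log excerpt."""
    return Counter(EVENT_RE.findall(log_text))
```

Running this over a longer log window and comparing the `secret_changed` and `secret_remove` counts against `update-status` runs would show whether secret events dominate the unit's activity.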

syncronize-issues-to-jira[bot] commented 2 months ago

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-5297.

This message was autogenerated

kot0dama commented 2 months ago

cc @cbartz (per https://github.com/canonical/infrastructure-services/pull/2040)

cbartz commented 2 months ago

To provide more context about the deployment where the problem occurred, mongodb is integrated with another charm using

mongodb:database                                github-runner-webhook-router:mongodb         mongodb_client           regular  

(see https://pastebin.canonical.com/p/jBfdRKtpFz/).

Things to note:

Some juju debug-log output from the requirer model: https://pastebin.canonical.com/p/2tpfVVG4Pc/

kot0dama commented 2 months ago

We (IS) can probably keep the environment running with these secrets for a week before we need to fix and clear the issue.

MiaAltieri commented 2 months ago

Thank you so much for reporting this - I am going to bring this to the team and we will do our best to address this issue ASAP

MiaAltieri commented 2 months ago

We will likely get to this issue next pulse. When we have this adequately planned I will let you know :)

MiaAltieri commented 2 months ago

@kot0dama + @cbartz Two things :)

  1. Can you update your issue description? You say "Deploy mongodb charm channel=14/stable, revision=173". Do you mean 6/stable? If you do, note that the highest revision on 6/stable is 164.

  2. I believe the issue was related to a previous version of the data platform libs, which have been updated + re-released in the newest 6/edge - would you mind trying your tests again with 6/edge?

Note: we plan to update stable later in October, so if you are using stable, the issue should hopefully be resolved at that point.

cbartz commented 2 months ago

@kot0dama @MiaAltieri it's 6/edge revision 173

App      Version  Status  Scale  Charm    Channel  Rev  Exposed  Message
mongodb           active      1  mongodb  6/edge   173  yes      Primary

And of course we can upgrade to the latest edge.

MiaAltieri commented 2 months ago

Thank you very much for the context @cbartz - please let me know how you get on with latest edge, I appreciate your patience :)

kot0dama commented 2 months ago

Sorry for the wrong channel name, I've just fixed that description. I've also approved the MR for updating to latest edge revision, let's hope it fixes it!

kot0dama commented 2 months ago

Requests / load on the controller dropped drastically. Current revision count is:

$ juju secrets --format yaml
cqf4hnipb3do3hjt3qa0:
  revision: 2
  owner: mongodb
  label: mongodb.app
  created: 2024-07-22T11:56:14Z
  updated: 2024-07-22T11:56:29Z
cqospoq3t07lr7h9ctv0:
  revision: 6948
  owner: mongodb
  label: database.5.user.secret
  created: 2024-08-06T07:12:38Z
  updated: 2024-09-02T00:30:31Z
cqqrm6bvnated3p58bbg:
  revision: 6088
  owner: mongodb
  label: database.6.user.secret
  created: 2024-08-09T06:45:48Z
  updated: 2024-09-02T00:32:58Z

@cbartz would you mind monitoring / checking in a few days if the number of revisions increases please? Thank you!
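
One way to do this monitoring is to compare two snapshots of the same secret and compute how fast its revision count grew. The sketch below is an illustration only: `revisions_per_hour` is a hypothetical helper, and it uses the `updated` field as the timestamp, so it measures growth between the last recorded updates rather than between the moments the snapshots were taken.

```python
from datetime import datetime

# Timestamp layout used in the `juju secrets --format yaml` output above.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def revisions_per_hour(rev_a: int, updated_a: str, rev_b: int, updated_b: str) -> float:
    """Average revisions added per hour between two snapshots of one secret."""
    start = datetime.strptime(updated_a, TIME_FORMAT)
    end = datetime.strptime(updated_b, TIME_FORMAT)
    hours = (end - start).total_seconds() / 3600
    if hours <= 0:
        # Same `updated` timestamp: the secret was not written in between.
        return 0.0
    return (rev_b - rev_a) / hours


# database.5.user.secret, from the two listings in this thread.
rate = revisions_per_hour(6187, "2024-08-29T00:51:22Z", 6948, "2024-09-02T00:30:31Z")
print(f"{rate:.2f} revisions/hour")
```

A rate that stays at zero across later snapshots would indicate that the secret is no longer being rewritten.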

cbartz commented 2 months ago

Will do.

cbartz commented 2 months ago

@kot0dama

Looks good so far

stg-github-runner-mq@is-charms-bastion-ps6:~$ juju secrets --format yaml
cqf4hnipb3do3hjt3qa0:
  revision: 2
  owner: mongodb
  label: mongodb.app
  created: 2024-07-22T11:56:14Z
  updated: 2024-07-22T11:56:29Z
cqospoq3t07lr7h9ctv0:
  revision: 6948
  owner: mongodb
  label: database.5.user.secret
  created: 2024-08-06T07:12:38Z
  updated: 2024-09-02T00:30:31Z
cqqrm6bvnated3p58bbg:
  revision: 6088
  owner: mongodb
  label: database.6.user.secret
  created: 2024-08-09T06:45:48Z
  updated: 2024-09-02T00:32:58Z

kot0dama commented 2 months ago

Great then, hopefully the latest revision is fixed!