canonical / mysql-k8s-operator

A Charmed Operator for running MySQL on Kubernetes
https://charmhub.io/mysql-k8s
Apache License 2.0
8 stars 15 forks source link

mysql-k8s does not create service account when related too early #476

Open gboutry opened 1 month ago

gboutry commented 1 month ago

Steps to reproduce

  1. deploy a plan with a mysql-k8s and a lot of related apps
  2. wait

In single node mode, Sunbeam deploys 1 MySQL with 10 MySQL routers and 8 related apps. It is created by a terraform plan that will most likely create the integration as soon as both app are considered created by Juju. This can happen way before a charm is ready to serve relation requests, but this is independent of terraform.

Not reproducible 100%, but it does happen

Expected behavior

Mysql-k8s should create account for all related services

Actual behavior

Mysql-k8s does not create service accounts

Log output

Captured from a CI run Juju debug log:

2024-08-05T06:37:04.065Z [pebble] HTTP API server listening on ":38812".
2024-08-05T06:37:04.066Z [pebble] Started daemon.
2024-08-05T06:37:04.089Z [pebble] POST /v1/services 12.479793ms 202
2024-08-05T06:37:04.097Z [pebble] Service "container-agent" starting: /charm/bin/containeragent unit --data-dir /var/lib/juju --append-env "PATH=$PATH:/charm/bin" --show-log --charm-modified-version 0
2024-08-05T06:37:04.359Z [container-agent] 2024-08-05 06:37:04 INFO juju.cmd supercommand.go:56 running containerAgent [3.5.3 63d460f9ee6c7c710131961390687e7a0ab90470 gc go1.21.12]
2024-08-05T06:37:04.365Z [container-agent] starting containeragent unit command
2024-08-05T06:37:04.371Z [container-agent] containeragent unit "unit-mysql-0" start (3.5.3 [gc])
2024-08-05T06:37:04.379Z [container-agent] 2024-08-05 06:37:04 INFO juju.cmd.containeragent.unit runner.go:592 start "unit"
2024-08-05T06:37:04.379Z [container-agent] 2024-08-05 06:37:04 INFO juju.worker.upgradesteps worker.go:60 upgrade steps for 3.5.3 have already been run.
2024-08-05T06:37:04.449Z [container-agent] 2024-08-05 06:37:04 INFO juju.worker.probehttpserver server.go:157 starting http server on 127.0.0.1:65301
2024-08-05T06:37:04.476Z [container-agent] 2024-08-05 06:37:04 INFO juju.api apiclient.go:629 connection established to "wss://172.17.0.1:17070/model/32e54cee-388e-41df-8525-1817a6ff4e5a/api"
2024-08-05T06:37:04.684Z [container-agent] 2024-08-05 06:37:04 INFO juju.worker.apicaller connect.go:163 [32e54c] "unit-mysql-0" successfully connected to "172.17.0.1:17070"
2024-08-05T06:37:04.778Z [container-agent] 2024-08-05 06:37:04 INFO juju.worker.migrationminion worker.go:147 migration phase is now: NONE
2024-08-05T06:37:04.817Z [container-agent] 2024-08-05 06:37:04 INFO juju.worker.logger logger.go:120 logger worker started
2024-08-05T06:37:04.936Z [container-agent] 2024-08-05 06:37:04 WARNING juju.worker.proxyupdater proxyupdater.go:241 unable to set snap core settings [proxy.http= proxy.https= proxy.store=]: exec: "snap": executable file not found in $PATH, output: ""
2024-08-05T06:37:05.134Z [pebble] GET /v1/changes/1/wait 1.043649154s 200
2024-08-05T06:37:05.145Z [pebble] Started default services with change 1.
2024-08-05T06:37:05.170Z [container-agent] 2024-08-05 06:37:05 INFO juju.worker.caasupgrader upgrader.go:113 abort check blocked until version event received
2024-08-05T06:37:05.205Z [container-agent] 2024-08-05 06:37:05 INFO juju.worker.caasupgrader upgrader.go:119 unblocking abort check
2024-08-05T06:37:05.209Z [container-agent] 2024-08-05 06:37:05 INFO juju.worker.leadership tracker.go:194 mysql/0 promoted to leadership of mysql
2024-08-05T06:37:05.487Z [container-agent] 2024-08-05 06:37:05 INFO juju.agent.tools symlinks.go:20 ensure jujuc symlinks in /var/lib/juju/tools/unit-mysql-0
2024-08-05T06:37:06.602Z [container-agent] 2024-08-05 06:37:06 INFO juju.worker.uniter uniter.go:363 unit "mysql/0" started
2024-08-05T06:37:06.802Z [container-agent] 2024-08-05 06:37:06 INFO juju.worker.uniter uniter.go:689 resuming charm install
2024-08-05T06:37:06.812Z [container-agent] 2024-08-05 06:37:06 INFO juju.worker.uniter.charm bundles.go:82 downloading ch:amd64/jammy/mysql-k8s-153 from API server
2024-08-05T06:37:06.812Z [container-agent] 2024-08-05 06:37:06 INFO juju.downloader download.go:112 downloading from ch:amd64/jammy/mysql-k8s-153
2024-08-05T06:37:14.082Z [pebble] Check "readiness" failure 1 (threshold 3): received non-20x status code 418
2024-08-05T06:37:24.073Z [pebble] Check "readiness" failure 2 (threshold 3): received non-20x status code 418
2024-08-05T06:37:27.406Z [container-agent] 2024-08-05 06:37:27 INFO juju.downloader download.go:95 download complete ("ch:amd64/jammy/mysql-k8s-153")
2024-08-05T06:37:27.464Z [container-agent] 2024-08-05 06:37:27 INFO juju.downloader download.go:175 download verified ("ch:amd64/jammy/mysql-k8s-153")
2024-08-05T06:37:34.069Z [pebble] Check "readiness" failure 3 (threshold 3): received non-20x status code 418
2024-08-05T06:37:34.069Z [pebble] Check "readiness" failure threshold 3 hit, triggering action
2024-08-05T06:37:44.065Z [pebble] Check "readiness" failure 4 (threshold 3): received non-20x status code 418
2024-08-05T06:37:54.067Z [pebble] Check "readiness" failure 5 (threshold 3): received non-20x status code 418
2024-08-05T06:38:04.073Z [pebble] Check "readiness" failure 6 (threshold 3): received non-20x status code 418
2024-08-05T06:38:14.070Z [pebble] Check "readiness" failure 7 (threshold 3): received non-20x status code 418
2024-08-05T06:38:24.064Z [pebble] Check "readiness" failure 8 (threshold 3): received non-20x status code 418
2024-08-05T06:38:34.077Z [pebble] Check "readiness" failure 9 (threshold 3): received non-20x status code 418
2024-08-05T06:38:44.075Z [pebble] Check "readiness" failure 10 (threshold 3): received non-20x status code 418
2024-08-05T06:38:54.096Z [pebble] Check "readiness" failure 11 (threshold 3): received non-20x status code 418
2024-08-05T06:39:04.068Z [pebble] Check "readiness" failure 12 (threshold 3): received non-20x status code 418
2024-08-05T06:39:14.069Z [pebble] Check "readiness" failure 13 (threshold 3): received non-20x status code 418
2024-08-05T06:39:24.070Z [pebble] Check "readiness" failure 14 (threshold 3): received non-20x status code 418
2024-08-05T06:39:34.069Z [pebble] Check "readiness" failure 15 (threshold 3): received non-20x status code 418
2024-08-05T06:39:44.065Z [pebble] Check "readiness" failure 16 (threshold 3): received non-20x status code 418
2024-08-05T06:39:54.068Z [pebble] Check "readiness" failure 17 (threshold 3): received non-20x status code 418
2024-08-05T06:40:04.070Z [pebble] Check "readiness" failure 18 (threshold 3): received non-20x status code 418
2024-08-05T06:40:14.069Z [pebble] Check "readiness" failure 19 (threshold 3): received non-20x status code 418
2024-08-05T06:40:24.076Z [pebble] Check "readiness" failure 20 (threshold 3): received non-20x status code 418
2024-08-05T06:40:34.067Z [pebble] Check "readiness" failure 21 (threshold 3): received non-20x status code 418
2024-08-05T06:40:44.078Z [pebble] Check "readiness" failure 22 (threshold 3): received non-20x status code 418
2024-08-05T06:40:54.082Z [pebble] Check "readiness" failure 23 (threshold 3): received non-20x status code 418
2024-08-05T06:41:04.068Z [pebble] Check "readiness" failure 24 (threshold 3): received non-20x status code 418
2024-08-05T06:41:14.079Z [pebble] Check "readiness" failure 25 (threshold 3): received non-20x status code 418
2024-08-05T06:41:24.068Z [pebble] Check "readiness" failure 26 (threshold 3): received non-20x status code 418
2024-08-05T06:41:34.072Z [pebble] Check "readiness" failure 27 (threshold 3): received non-20x status code 418
2024-08-05T06:41:44.063Z [pebble] Check "readiness" failure 28 (threshold 3): received non-20x status code 418
2024-08-05T06:41:54.091Z [pebble] Check "readiness" failure 29 (threshold 3): received non-20x status code 418
2024-08-05T06:42:04.066Z [pebble] Check "readiness" failure 30 (threshold 3): received non-20x status code 418
2024-08-05T06:42:14.065Z [pebble] Check "readiness" failure 31 (threshold 3): received non-20x status code 418
2024-08-05T06:42:24.064Z [pebble] Check "readiness" failure 32 (threshold 3): received non-20x status code 418
2024-08-05T06:42:34.064Z [pebble] Check "readiness" failure 33 (threshold 3): received non-20x status code 418
2024-08-05T06:42:34.576Z [container-agent] 2024-08-05 06:42:34 INFO juju.worker.uniter uniter.go:389 hooks are retried true
2024-08-05T06:42:35.819Z [container-agent] 2024-08-05 06:42:35 INFO juju.worker.uniter resolver.go:185 found queued "install" hook
2024-08-05T06:42:44.064Z [pebble] Check "readiness" failure 34 (threshold 3): received non-20x status code 418
2024-08-05T06:42:49.130Z [container-agent] 2024-08-05 06:42:49 INFO juju-log Running legacy hooks/install.
2024-08-05T06:42:54.063Z [pebble] Check "readiness" failure 35 (threshold 3): received non-20x status code 418
2024-08-05T06:42:54.678Z [container-agent] 2024-08-05 06:42:54 INFO juju.worker.uniter.operation runhook.go:186 ran "install" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:04.063Z [pebble] Check "readiness" failure 36 (threshold 3): received non-20x status code 418
2024-08-05T06:43:09.974Z [container-agent] 2024-08-05 06:43:09 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:13.126Z [container-agent] 2024-08-05 06:43:13 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:14.070Z [pebble] Check "readiness" failure 37 (threshold 3): received non-20x status code 418
2024-08-05T06:43:14.956Z [container-agent] 2024-08-05 06:43:14 INFO juju.worker.uniter.operation runhook.go:186 ran "restart-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:19.268Z [container-agent] 2024-08-05 06:43:19 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:21.662Z [container-agent] 2024-08-05 06:43:21 INFO juju.worker.uniter.operation runhook.go:186 ran "database-peers-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:24.085Z [pebble] Check "readiness" failure 38 (threshold 3): received non-20x status code 418
2024-08-05T06:43:24.377Z [container-agent] 2024-08-05 06:43:24 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:26.560Z [container-agent] 2024-08-05 06:43:26 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:28.936Z [container-agent] 2024-08-05 06:43:28 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:30.776Z [container-agent] 2024-08-05 06:43:30 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:32.948Z [container-agent] 2024-08-05 06:43:32 INFO juju.worker.uniter.operation runhook.go:186 ran "upgrade-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:34.066Z [pebble] Check "readiness" failure 39 (threshold 3): received non-20x status code 418
2024-08-05T06:43:34.943Z [container-agent] 2024-08-05 06:43:34 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:36.772Z [container-agent] 2024-08-05 06:43:36 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:38.827Z [container-agent] 2024-08-05 06:43:38 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-created" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:39.026Z [container-agent] 2024-08-05 06:43:39 INFO juju.worker.uniter resolver.go:185 found queued "leader-elected" hook
2024-08-05T06:43:42.089Z [container-agent] 2024-08-05 06:43:42 INFO juju.worker.uniter.operation runhook.go:186 ran "leader-elected" hook (via hook dispatching script: dispatch)
2024-08-05T06:43:44.076Z [pebble] Check "readiness" failure 40 (threshold 3): received non-20x status code 418
2024-08-05T06:43:44.118Z [container-agent] 2024-08-05 06:43:44 INFO juju-log Setting up the logrotate configurations
2024-08-05T06:43:54.064Z [pebble] Check "readiness" failure 41 (threshold 3): received non-20x status code 418
2024-08-05T06:44:04.104Z [pebble] Check "readiness" failure 42 (threshold 3): received non-20x status code 418
2024-08-05T06:44:08.818Z [container-agent] 2024-08-05 06:44:08 INFO juju-log Configuring instance
2024-08-05T06:44:14.065Z [pebble] Check "readiness" failure 43 (threshold 3): received non-20x status code 418
2024-08-05T06:44:15.834Z [container-agent] 2024-08-05 06:44:15 INFO juju-log Creating cluster cluster-fde4a26ba6a4a08320c36e7bac7734d6
2024-08-05T06:44:24.134Z [pebble] Check "readiness" failure 44 (threshold 3): received non-20x status code 418
2024-08-05T06:44:34.064Z [pebble] Check "readiness" failure 45 (threshold 3): received non-20x status code 418
2024-08-05T06:44:36.474Z [container-agent] 2024-08-05 06:44:36 INFO juju.worker.uniter.operation runhook.go:186 ran "mysql-pebble-ready" hook (via hook dispatching script: dispatch)
2024-08-05T06:44:37.648Z [container-agent] 2024-08-05 06:44:37 INFO juju-log Starting the log rotate manager
2024-08-05T06:44:37.675Z [container-agent] 2024-08-05 06:44:37 INFO juju-log Started log rotate manager process with PID 2833
2024-08-05T06:44:38.114Z [container-agent] 2024-08-05 06:44:38 INFO juju.worker.uniter.operation runhook.go:186 ran "database-storage-attached" hook (via hook dispatching script: dispatch)
2024-08-05T06:44:40.137Z [container-agent] 2024-08-05 06:44:40 INFO juju.worker.uniter.operation runhook.go:186 ran "config-changed" hook (via hook dispatching script: dispatch)
2024-08-05T06:44:40.172Z [container-agent] 2024-08-05 06:44:40 INFO juju.worker.uniter resolver.go:185 found queued "start" hook
2024-08-05T06:44:41.171Z [container-agent] 2024-08-05 06:44:41 INFO juju-log Running legacy hooks/start.
2024-08-05T06:44:42.694Z [container-agent] 2024-08-05 06:44:42 INFO juju.worker.uniter.operation runhook.go:186 ran "start" hook (via hook dispatching script: dispatch)
2024-08-05T06:44:45.748Z [container-agent] 2024-08-05 06:44:45 INFO juju-log database:47: Kubernetes service mysql-primary created
2024-08-05T06:44:45.892Z [container-agent] 2024-08-05 06:44:45 INFO juju-log database:47: Kubernetes service mysql-replicas created
2024-08-05T06:44:47.309Z [container-agent] 2024-08-05 06:44:47 INFO juju-log database:47: Created user for app neutron-mysql-router
2024-08-05T06:44:47.934Z [container-agent] 2024-08-05 06:44:47 INFO juju.worker.uniter.operation runhook.go:186 ran "database-relation-changed" hook (via hook dispatching script: dispatch)
2024-08-05T06:44:51.036Z [container-agent] 2024-08-05 06:44:51 WARNING juju-log database:91: Kubernetes service already exists

Additional context

neutron-mysql-router has got the same flow of backend-database events, except for the additional one signaling the database has been created. In this run, Neutron is the only charm that got a complete database relation.

github-actions[bot] commented 1 month ago

https://warthogs.atlassian.net/browse/DPE-5037

shayancanonical commented 1 week ago

With the merge of PR 499, we believe we may have addressed one possible conditions that would cause the reported issue. With the new 8.0/edge, we hope that this issue is reduced in frequency, if not eliminated

@gboutry have you seen this recently? without concrete steps to reproduce, we're not sure how to proceed