Open natalian98 opened 2 months ago
Thank you for reporting us your feedback!
The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-5494.
This message was autogenerated
Hi, @natalian98, the blocked status is most likely caused by:
2024-09-19T09:13:43.4942715Z unit-postgresql-k8s-0: 09:13:33 ERROR unit.postgresql-k8s/0.juju-log Failed to disable plugin: cannot drop extension pg_trgm because other objects depend on it
2024-09-19T09:13:43.4943285Z DETAIL: index identity_credential_identifiers_nid_identifier_gin depends on operator class gin_trgm_ops for access method gin
2024-09-19T09:13:43.4943489Z HINT: Use DROP ... CASCADE to drop the dependent objects too.
2024-09-19T09:13:43.4943496Z
2024-09-19T09:13:43.4944042Z Was the plugin enabled manually? If so, update charm config with `juju config postgresql-k8s plugin_<plugin_name>_enable=True`
2024-09-19T09:13:43.4944615Z unit-postgresql-k8s-0: 09:13:33 DEBUG unit.postgresql-k8s/0.juju-log on_update_status early exit: Unit is in Blocked/Waiting status
Do you know if any of your components manually enables pg_trgm? Have you tried setting the `plugin_pg_trgm_enable` config in the bundle?
Hi @dragomirp, thanks for your fast reply.
The charms don't enable `pg_trgm`, but the upstream kratos component creates the extension on db migration, if I understand correctly.
That doesn't explain, though, why this only happens on some runs. If that was the issue, the tests would fail consistently, because the database is always migrated on a fresh deployment. Could there be a race condition in setting the unit status in postgresql-k8s?
Hi, this check should be happening in the update-status hook, so I would guess that the test sometimes exits before the PostgreSQL charm has had a chance to block.
Can you try to enable the plugin in the bundle and see if the issue persists?
@dragomirp I tried enabling the plugin and one of two runs failed again: https://github.com/canonical/iam-bundle/actions/runs/10941912483/job/30377673936#step:4:681
Hi, @natalian98, it looks like more plugins are required:
2024-09-19T14:08:17.8743192Z Was the plugin enabled manually? If so, update charm config with `juju config postgresql-k8s plugin_<plugin_name>_enable=True`
2024-09-19T14:08:17.8745467Z unit-postgresql-k8s-0: 13:39:43 ERROR unit.postgresql-k8s/0.juju-log Failed to disable plugin: cannot drop extension btree_gin because other objects depend on it
2024-09-19T14:08:17.8747604Z DETAIL: index identity_credential_identifiers_nid_identifier_gin depends on operator class uuid_ops for access method gin
This should be enabled by the `plugin_btree_gin_enable` flag.
You can check for missing plugins in the debug log step of the run: https://github.com/canonical/iam-bundle/actions/runs/10941912483/job/30377673936#step:14:4694
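As a rough illustration of that check, the sketch below scans a saved debug log for the "cannot drop extension" errors quoted in this thread and prints the matching `juju config` commands. The function name `missing_plugin_options` is hypothetical, and the mapping from extension name to option name (including turning hyphens into underscores) is an assumption based on the `plugin_<plugin_name>_enable` hint in the log, so verify each option against the charm's config before applying it.

```python
import re

# Error text as it appears in the charm log lines quoted in this thread.
DROP_ERROR = re.compile(
    r"cannot drop extension (\S+) because other objects depend on it"
)


def missing_plugin_options(log_text: str) -> list[str]:
    """Return candidate config options for extensions the charm failed to drop.

    Order of first appearance is preserved and duplicates are skipped.
    """
    options: list[str] = []
    for extension in DROP_ERROR.findall(log_text):
        # Assumption: the option name follows plugin_<plugin_name>_enable,
        # with hyphens in the extension name written as underscores.
        option = f"plugin_{extension.replace('-', '_')}_enable"
        if option not in options:
            options.append(option)
    return options


if __name__ == "__main__":
    sample = (
        "ERROR unit.postgresql-k8s/0.juju-log Failed to disable plugin: "
        "cannot drop extension pg_trgm because other objects depend on it\n"
        "ERROR unit.postgresql-k8s/0.juju-log Failed to disable plugin: "
        "cannot drop extension btree_gin because other objects depend on it\n"
    )
    for option in missing_plugin_options(sample):
        print(f"juju config postgresql-k8s {option}=True")
```

The same function can be pointed at the full text of a downloaded debug log instead of the inline sample.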
Hi @dragomirp, that solved the issue, thanks a lot!
Suggestion: perhaps the status could be set on a different event than `update-status`? Some teams set this hook interval to 1h in tests, so they may not find out that some plugin is missing.
Glad it worked out.
I'll discuss it with the rest of the team, but I don't think there is a more appropriate event, since we can't know when extensions are enabled manually. Polling periodically on `update-status` seems to be the simplest way to verify there's no mismatch between declared plugins and usage.
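To make the idea of a mismatch check concrete, here is a minimal sketch of the comparison such a periodic poll could perform. It is not the charm's actual implementation: the function `plugin_status`, its return values, and the message text are all hypothetical, and the real charm attempts to drop undeclared extensions rather than only comparing sets.

```python
def plugin_status(installed: set[str], declared: set[str]) -> tuple[str, str]:
    """Compare extensions present in the database with those enabled in config.

    Hypothetical helper: returns a (status, message) pair, where the status is
    "blocked" if any installed extension is not declared, else "active".
    """
    undeclared = sorted(installed - declared)
    if undeclared:
        return "blocked", "extensions in use but not enabled: " + ", ".join(undeclared)
    return "active", ""


if __name__ == "__main__":
    # kratos created both extensions, but only pg_trgm is enabled in config.
    print(plugin_status({"pg_trgm", "btree_gin"}, {"pg_trgm"}))
    # Both extensions enabled in config: no mismatch.
    print(plugin_status({"pg_trgm", "btree_gin"}, {"pg_trgm", "btree_gin"}))
```

Because the database can change outside of any charm event, a check like this only notices the mismatch the next time it runs, which is why the hook interval matters for tests.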
Steps to reproduce
This happens only once every few test runs, so it is difficult to reproduce, but you can try by running:
Expected behavior
The postgresql-k8s app and unit reach active status.
Actual behavior
At times the `postgresql-k8s` unit gets stuck in blocked state, causing our bundle tests to fail. These are runs from this week:
https://github.com/canonical/iam-bundle/actions/runs/10880493814/job/30244416338#step:4:667
https://github.com/canonical/iam-bundle/actions/runs/10937282732/job/30362792991
https://github.com/canonical/iam-bundle/actions/runs/10900084647/job/30246933947
However, we don't enable or disable any plugins in the charms integrated with the database (kratos and hydra). Could you advise what could be causing this?
Versions
Operating system: ubuntu 22.04
Juju CLI: 3.4/stable
Juju agent: 3.4.5
Charm revision: 381
microk8s: 1.27 and 1.28/stable
Log output
Juju debug log: https://github.com/canonical/iam-bundle/actions/runs/10880493814/job/30244416338#step:14:1
Additional context
We've been deploying postgresql-k8s from the `14/stable` channel. The tests ran successfully while it pointed to rev281; we've been experiencing this flaky issue since it was promoted to rev381.