Closed paulomach closed 4 months ago
If this really helps, why error state still failing this.
Was it permanent error state and not temporary one?
Hi guys, here to provide some context for why this change was adopted over on PostgreSQL side:
If this really helps, why error state still failing this.
@taurus-forever It seems the change was missing in this particular wait_for_idle
: https://github.com/canonical/mysql-k8s-operator/blob/ea997403bb0689886e7f5ac55888535d8304eb6c/tests/integration/test_charm.py#L65C1-L71C10
I am a little worried about the charms going into error state (for reasons other than PVC creation), and then eventually update_status coming along and setting the status becoming active from retries and us not catching this case due to the raise_on_error=False
@shayancanonical Me and Dragomir (who originally proposed this change in the first place on my PR, I had a different approach at first) discussed the implications of not raising errors on the wait_for_idle
calls, the main point being: It is not that we'll completely ignore the errors, it is just that they won't get raised in the very first wait_for_idle
after charm deploy. Sure, transient errors like the PVC one might get ignored completely, but persistent errors will either get the wait_for_idle
to timeout due to charm never getting active
status, or something breaking further ahead in the test. About the status becoming active from retries
, wouldn't this mean the charm actually recovered during said retries? Unless the hooks are setting active
status when they shouldn't.
In any case, this is not a proper fix by any means, but more of a workaround to avoid annoying transient errors right after charm deploy on CI.
I am a little worried about the charms going into error state (for reasons other than PVC creation), and then eventually ~update_status coming along and setting~ the status becoming
active
from retries and us not catching this case due to theraise_on_error=False
As Lucas said, we will still will catch the errors, but when timeout is reached. IMO if we can reach a better stability, It's a valid workaround.
Issue
Integration tests often fails on a PVC failing to be created
Solution
Workaround it by not raise error on wait_for_idle when run after mysql-k8s deploy/scale commands (i.e. when PVC is created)
Credits to @lucasgameiroborges
Fixes #378