Closed sed-i closed 10 months ago
Is this how you'd expect collect-status to be used in components such as libs, @benhoyt @tonyandrewmeyer?
@sed-i Sort of ... though it looks like in this case the charm lib is firing a custom patch_failed
event and has is_ready()
and last_error
methods/attributes. So you probably wouldn't actually do this in the charm lib at all. Keep that nice and stand-alone, and leave the status handling to the charm itself. So the charm would do something like this:
class SomeCharm(ops.CharmBase):
def __init__(self, *args):
# ...
self.resources_patch = KubernetesComputeResourcesPatch(...)
self.framework.observe(charm.on.collect_unit_status, self._on_collect_unit_status)
def _on_collect_unit_status(self, event):
if self.resources_patch.is_ready():
event.add_status(ops.ActiveStatus())
elif self.resources_patch.last_error:
event.add_status(ops.BlockedStatus(self.resources_patch.last_error))
else:
event.add_status(ops.MaintenanceStatus('waiting for patch'))
Note that because KubernetesComputeResourcesPatch
has a last_error
, you could just use this directly, instead of listening to patch_failed
.
That would be my suggestion, at any rate.
Thanks @benhoyt! The custom event is how we currently propagate status updates. Certainly that part won't be needed with collect-status. I tend to agree that it's probably best to have only the charm use add_status.
One slightly unfortunate thing to note about collect-status is that your charm has to be "all in" with it. You can't really switch half of your charm to use collect-status / add_status and leave the rest using unit.status = x
, because the unit.status
assignment will be overwritten with the (partially-implemented) collect-status status.
I don't mind this approach, but it feels a bit awkward in that is_ready would return False iff a last error has been set, which means that if you get a False, then you have to go search for a last error.
Not sure what you mean. Could you point to a line in code?
I don't mind this approach, but it feels a bit awkward in that is_ready would return False iff a last error has been set, which means that if you get a False, then you have to go search for a last error.
Not sure what you mean. Could you point to a line in code?
Can't point to a line in the code because it's about the API it exposes. IIUC the current API implies that, as a user of this lib, the charm would:
lib = Lib()
if lib.is_ready():
# go about your happy path
else:
# retrieve the error msg
error = lib.last_error
Which I think is a bit awkward. An alternative I was proposing is:
lib = Lib()
status: Success | Failure = lib.get_status()
if isinstance(status, Success):
# go about your happy path
else: # Failure
# retrieve the error msg
error = status.message
I realize this deviates from the nice is_ready pattern we've been gliding so far, but fact is now we're introducing a new piece of data: it's not just 'are you ready? --> YES | NO', now it's: 'are you ready, and if not, why? --> YES | NO (because)'
And if we plan to build this out to all other is_ready methods we have in our charm libs, it might be worth it to figure out a generalization-worthy pattern.
That is a good point. What immediately comes to mind is the result package.
But I am not sure we want to couple readiness with errors:
last_error
is set during operations, not necessarily during "get status", so we would need that variable anyway (perhaps prepend with an underscore, _last_error
). Which means later on we could hide this behind a "get status" if we choose to.last_error
gets reset per charm reinit so if an issue is resolved during custom hooks, is_ready could be True, but last_error
still holds the past.That is a good point. What immediately comes to mind is the result package.
Ha, never heard of that. Interesting!
But I am not sure we want to couple readiness with errors:
- Could there be a circumstance where we want is_ready to return True, while at the same time having an error (Blocked)?
At charm level, sure, something like the 'degraded' status @jnsgruk was brainstorming about a few Pragues ago, but in this case we're talking about a pattern charm libraries, which are (for the most part, at least our libs) simple enough to either be 'ready' or 'not ready', without intermediate 'ready-but-kind-of-borky' states.
last_error
is set during operations, not necessarily during "get status", so we would need that variable anyway (perhaps prepend with an underscore,_last_error
). Which means later on we could hide this behind a "get status" if we choose to.
Fair enough, I'm game.
- Could "last error" and "is ready" go out of sync?
last_error
gets reset per charm reinit so if an issue is resolved during custom hooks, is_ready could be True, butlast_error
still holds the past.
Which makes me think, last_error may also quickly become stale as it currently stands (independently of is_ready).
if not lib.is_ready():
err = lib.last_error # set because not ready
lib.do_something_to_fix_error()
err2 = err.lib.last_error # still set because it's only cleared on __init__
Which makes me think, last_error may also quickly become stale as it currently stands (independently of is_ready).
I'm not sure we can do anything about it when the lib has multiple entry points (is_ready, on_config_changed -> patch).
Even if I give each error individual attention,
@dataclass
class LibErrors:
obtain_limit: Optional[str] # Error obtaining resource limit from user function
validity: Optional[str] # Resource spec is invalid
config: Optional[str] # ConfigError
api: Optional[str] # ApiError
patch: Optional[str] # Failed applying patch
try:
do_something()
self.errors.obtain_limit = None
except ValueError as e:
msg = f"Failed obtaining resource limit spec: {e}"
self.errors.obtain_limit = msg
They could still be stale for the same reason.
If we stop using custom events, then...
I closed the tandem alertmanager PR for now. Let's try to come up with a solid collect status pattern first. Closing this too, for now.
Issue
Status updates are communicated via custom events.
Solution
Add
last_error
public attribute, to enable charm to collect-status.In tandem with: