canonical / openstack-exporter-operator

The openstack-exporter-operator is a machine charm for openstack-exporter.
Apache License 2.0
1 stars 7 forks source link

Raise errors on unrecoverable problems #110

Closed samuelallan72 closed 2 months ago

samuelallan72 commented 2 months ago

Raise errors for snap install or refresh failure. Raise an error because it's a fatal failure, and juju will auto-retry the hook, which may succeed on retry if the error was transient.

Improve blocked status message for if it detects snap not installed or service not active in the status hook. I think Blocked is better than Error status here:

Partially fixes: #108

samuelallan72 commented 2 months ago

Thanks for reviews everyone - I see this is early stage still, I probably could have left it as a draft.

samuelallan72 commented 2 months ago

Ok lints and unit tests fixed, ready for review again.

A downside I see of raising errors instead of using a custom blocked status, is that the juju status message is not as useful:

openstack-exporter/0*  error     idle   0        10.168.106.56             hook failed: "update-status"

I'm not sure if we can improve that while still following this pattern of raising the error and letting the ops framework manage the error status.

samuelallan72 commented 2 months ago

Actually I think this needs more work on the conditions to check if the exporter is running. I just observed this:

openstack-exporter/0*  error     idle   1        10.168.106.220            hook failed: "credentials-relation-joined"

I guess this is because configure is not called for credentials-relation-joined hook, so at the end of the credentials-relation-joined hook, collect_status is run, and at this point the charm has the keystone data, but the service hasn't been started yet. The service will start in the next credentials-relation-configure hook.

samuelallan72 commented 2 months ago

I updated it to avoid false positives with raising the error:

samuelallan72 commented 2 months ago

Not sure what's going on with the integration tests. :/

samuelallan72 commented 2 months ago

Integration tests are failing because the upgrade hook is crashing on collect status RuntimeError: charmed-openstack-exporter snap service is not active, but it should be.

It's making too many false positives for error state here. If it was blocked, then this would result in a transient blocked state while the exporter wasn't running during an upgrade.