canonical / pebble

Pebble is a lightweight Linux service manager with layered configuration and an HTTP API.
https://canonical-pebble.readthedocs-hosted.com/
GNU General Public License v3.0
146 stars 54 forks source link

State engine Ensure() errors are only logged, not handled #445

Open thp-canonical opened 3 months ago

thp-canonical commented 3 months ago

Right now in Overlord.Loop(), the error return of the state engine's Ensure() function are ignored:

https://github.com/canonical/pebble/blob/35e2b7cf6ce5be1c3f8b803b281ad8697e8c14a5/internals/overlord/overlord.go#L352-L354

Meanwhile, snapd handles errors - at least in the pressed case - in Ensure():

https://github.com/snapcore/snapd/blob/c59a5f6e87fe59596b14614bc422e0c3a132ca25/overlord/overlord.go#L463-L473

This has been implemented in snapd for preseeding: https://github.com/snapcore/snapd/pull/8190

As other components of Pebble might have the same issue (retrying Ensure() forever not resulting in a change), and implementers of managers might believe that an error returned from Ensure() will be handled by the caller, it might make sense to at least allow an optional error handler to handle Ensure() errors in the overlord loop.

Found with @flotter during code review.