canonical / pebble

Pebble is a lightweight Linux service manager with layered configuration and an HTTP API.
https://canonical-pebble.readthedocs-hosted.com/
GNU General Public License v3.0
145 stars 54 forks source link

perf(daemon): avoid acquiring state lock (twice!) for every API request #435

Closed benhoyt closed 2 months ago

benhoyt commented 3 months ago

In daemon.Command.ServeHTTP, at the end of every request we acquire the state lock twice, once for including the maintenance (restart) info in the response, and once for including the warnings summary. We don't use either maintenance/restarts or warnings in Pebble at all, so let's just turn that off.

We could rip out those features completely, but the APIs need to stay for backwards compatibility, and it's possible we'll want them later, so let's just disable them and their tests for now.

Locking and unlocking the state for reading isn't very expensive, but if something else has locked the state for writing (for example, creating/updating a Change or Task), API requests won't be able to acquire the lock till the relatively slow operation of serialising and writing state is finished. Per #366, that can sometimes be over 1s when the system is under load.

Fixes #366

hpidcock commented 3 months ago

@thp-canonical or @flotter could you please check this out.

benhoyt commented 3 months ago

Thanks for the comments, folks, especially your digging into details, @thp-canonical.

Regarding maintenance/restart, using a separate atomic bool or int is a good idea. I'll consider that further.

For warnings, note that they aren't used at all, and probably never should be. They are kind of broken, well, at least they assume only a single client (which isn't the case of Pebble, and probably isn't for the other project either) -- this is because warnings are "okayed" on the server side. Whereas the new notices system does it via an "after" timestamp from the client, so you can have as many clients as you need. Read more here.

So for warnings, I'm going to have a go at removing them more completely. State.Warnf isn't called by Pebble at all (and the other project definitely doesn't), so I think we should just remove it, and then warnings can never get added. I think we can even remove warning-timestamp and warning-count from the response struct, as they're marked omitempty and hence won't ever be present right now anyway. I'll need to keep the /v1/warnings API for backwards compatibility, however. I'm not 100% sure it's doable yet, but I think this is a better route for Pebble.

hpidcock commented 3 months ago

Thanks for the comments, folks, especially your digging into details, @thp-canonical.

Regarding maintenance/restart, using a separate atomic bool or int is a good idea. I'll consider that further.

For warnings, note that they aren't used at all, and probably never should be. They are kind of broken, well, at least they assume only a single client (which isn't the case of Pebble, and probably isn't for the other project either) -- this is because warnings are "okayed" on the server side. Whereas the new notices system does it via an "after" timestamp from the client, so you can have as many clients as you need. Read more here.

So for warnings, I'm going to have a go at removing them more completely. State.Warnf isn't called by Pebble at all (and the other project definitely doesn't), so I think we should just remove it, and then warnings can never get added. I think we can even remove warning-timestamp and warning-count from the response struct, as they're marked omitempty and hence won't ever be present right now anyway. I'll need to keep the /v1/warnings API for backwards compatibility, however. I'm not 100% sure it's doable yet, but I think this is a better route for Pebble.

I'm curious how hard it would be to re-implement on-top of the back of notices.

benhoyt commented 3 months ago

So for warnings, I'm going to have a go at removing them more completely

I'm having a go at this in https://github.com/canonical/pebble/pull/443

benhoyt commented 2 months ago

Going to close this hacky approach in favour of https://github.com/canonical/pebble/pull/451 (using an atomic for restart.Pending) and https://github.com/canonical/pebble/pull/443 (remove warnings completely).