canonical / pebble

Pebble is a lightweight Linux service manager with layered configuration and an HTTP API.
https://canonical-pebble.readthedocs-hosted.com/
GNU General Public License v3.0
145 stars 54 forks source link

fix(logger): Make MockLogger threadsafe #441

Closed thp-canonical closed 1 month ago

thp-canonical commented 3 months ago

When using a logger.MockLogger(), a background goroutine calling logger.Noticef() might race the underlying bytes.Buffer for read/write access. To avoid that, limit the returned buffer type for MockLogger() to just fmt.Stringer (and mutex-protect its reading from the buffer) as well as mutex-protect Write() calls to the underlying buffer.

To reproduce the race condition, run go test -race ./internals/logger with TestMockLoggerReadWriteThreadsafe added, but with the unmodified MockLogger implementation.

If more functions from bytes.Buffer are needed from the MockLogger return value, the interface can be expanded with additional pass-through lock-protected calls, but in all of Pebble, it seems only the .String() function is used, so only that is exposed for now.

While logger.Noticef() calls are serialized by logger.loggerLock, the same cannot be said for parallel access to the underlying bytes.Buffer that is returned from MockLogger when calling logger.Noticef() in one goroutine, and accessing the bytes.Buffer from another goroutine.

thp-canonical commented 3 months ago

As an additional datapoint, snapd had the same issue, but they solved it in a slightly different way:

https://github.com/snapcore/snapd/commit/16c3680b78d

benhoyt commented 3 months ago

In the abstract I think I prefer the direction you're going, however, I'd like to stay close to snapd for common things where possible. Can we pick their approach here instead? What would the diff (especially test diff) look like with that approach?

thp-canonical commented 3 months ago

In the abstract I think I prefer the direction you're going, however, I'd like to stay close to snapd for common things where possible. Can we pick their approach here instead? What would the diff (especially test diff) look like with that approach?

It's of course possible to add and use WithLoggerLock() instead. However, it's not enforced, so there are many logbuf.String() calls in snapd's testing code that are not protected by WithLoggerLock() (which is fine, as long as no background goroutine is logging), it seems like an opt-in where the caller of logbuf.String() would have to know/anticipate that some background goroutine is potentially accessing logbuf:

https://github.com/search?q=repo%3Asnapcore%2Fsnapd%20logbuf.String&type=code

vs

https://github.com/search?q=repo%3Asnapcore%2Fsnapd+WithLoggerLock&type=code

Trying to log anything in the function passed to WithLoggerLock() would result in a deadlock (as the lock that the logger call is trying to obtain has already been obtained), e.g. this (albeit very, very constructed example):

logger.WithLoggerLock(func () {
    for i := 0; i < 10; i++ {
        logger.Noticef(s.logbuf.String())
    }
})

Will have to be split up like this:

for i := 0; i < 10; i++ {
    var str string
        logger.WithLoggerLock(func () {
                str = s.logbuf.String()
        })
    logger.Noticef(str)
}

Expressions like logger.Noticef(s.logbuf.String()) are not an issue with the proposed approach in this PR, since the function call parameters are evaluated (and the logbuf lock released) before logger.Noticef() is called.

I haven't measured what the overhead of locking and unlocking an uncontested mutex is in cases where there's no background goroutine when accessing logbuf.String().

cc @jhenstridge and @pedronis from the snapd commit https://github.com/snapcore/snapd/commit/16c3680b78d for additional input on the design of WithLoggerLock() that i might be missing -- there's probably a good reason for WithLoggerLock(), and we might indeed want to have the same in Pebble.

thp-canonical commented 1 month ago

If you want to convince the snapd folks yours is better than their opt-in-to-safety and get this merged there, go ahead. :-)

Done here now (slightly different due to snapd's requirement of a .Reset() function):

In case this gets merged in snapd upstream, it might make sense to sync back the changes (logger.MockedLogger interface) to Pebble.