Add context-based logging (pkg/dlog, pkg/logexec)

Background:

Back when supervisor was first getting landed, the logging was agreed-upon to be "MVP, we don't like this shape of logging, but we need it for the rest of supervisor"
On this last Thursday (Aug 15), @rhs commented that now that there's some experience around supervisor, we should sit down and evaluate the experiment.
Meanwhile, in apro.git amb-sidecar, we've had a context-based structured logging system that I've been pretty happy with.

So as part of my contribution to the discussion around supervisor and what we'd like to change, I'd like to submit pkg/dlog and pkg/logexec, which aim to take over the logging functionality of pkg/supervisor.

Port over the logging system from amb-sidecar as pkg/dlog.
Write pkg/logexec around pkg/dlog, as a drop-in replacement for os/exec, and a conceptual replacement for pkg/supervisor.Cmd. I copied in in the tests from /usr/lib/go/src/os/exec/*_test.go, so I am quite confident that it behaves as expected as a drop-in replacement for os/exec. I ported over the logging tests from pkg/supervisor.Command, so I am reasonably confident that meets the logging requirements of pkg/supervisor.Command.

This PR does not adjust anything to use pkg/dlog or pkg/logexec.

I guess I understand the dlog part. An example of using that with Supervisor would be helpful.

I think the logexec stuff is also probably fine, though I'm not happy with the logging format. That's a pretty complex changeset, even if it mostly comes from the standard library. I think I would want to play with it separately before really committing to using it. That said, I suppose you're not removing anything yet, so landing this would be a way to play. Again, an example of using this with Supervisor would be helpful.

I agree it would have been good to include an example of using dlog with Supervisor, but that's WIP.

The logging format of logexec is intentionally very similar to supervisor.Cmd, but with a few (IMO) improvements:

Include a [pid:XXX] prefix, so it's possible to tell which of multiple concurrent commands it's coming from
Say "[ X bytes of binary data]" for lines that aren't valid UTF-8
Truncate very long lines (with a "… (X runes truncated)" suffix). I added this one after one of the os/exec tests prints a line of 40k-long line of "x" repeated.

Just a few initial comments so far...

Given our discussion last week, I think dexec might be a better name than logexec. My rationale is based on our conversation last week about how how one of the capabilities supervisor provides is the ability to initiated a controlled shutdown, and the fact that we could leverage that to centrally handle (in a graceful way) a number of boilerplate error conditions (e.g. the binary you are trying to exec is not in your path, or it's not an executable file).

For the logexec stuff I'd also like to understand/discuss the big picture design/direction a bit more. I'm assuming the goal is to decompose some of the functionality that is built into supervisor in a way that has some benefits. I think it would be good to write down what those benefits are so we can evaluate the before/after effect on our current use cases.

To be clear I think a bit of decomposition/rejiggering is probably a good thing relative to where we are now, but what I'm a bit fuzzier on is how much and how to handle certain elements of what supervisor does with it split up into parts.

I was about 50/50 torn between naming it dexec or logexec (or maybe 45/45/10, with the 10 being dlogexec :-P ). I ended up deciding to name it something suggestive of what it added over os/exec (dlog-based logging). But I'm not attached to that decision.

The big-picture direction I wanted for logexec was:

Decouple it from supervisor.
Have it be a drop-in replacement for os/exec, so that programmers onboarding to it don't need to learn a new API.

Regarding decoupling it from supervisor:

The only tie that supervisor.Cmd has to supervisor is that it uses supervisor's logging. Assuming that coupling goes away because of dlog, it seems silly to me to keep something tightly coupled to supervisor because we might add integrations between them in the future.
We've (mostly) agreed that that being able to centrally handle a class of error conditions is useful, but have been unable thus far to figure out what it could look like. IMO, the solution involves everything having to live in pkg/supervisor, then it's a non-solution. The direction I see that headed is that pkg/supervisor would end up growing a wrapper for every single library we want to use.[^1] If we want to be able to centralize error handling in supervisor, we need to figure out how to make it pluggable.[^2] For that reason, I don't have a problem moving .Cmd out of it; if we want to add integration between .Cmd and supervisor in the future, it shouldn't require having .Cmd be part of supervisor.

A week or two ago, you (@rhs) said something along the lines of "[build-aux] has been allowed to grow complexity, and while it wasn't obvious at the time, I think it's now clear that we've let that go too far, and need to trim it back." I'm not 100% sure I agree with that, but the way you said it (which I've probably done a poor job of capturing) very much describes how I'm feeling about supervisor. Or maybe I should say that: the way it made me think that you feel about build-aux is the way I feel about supervisor?

There's a chunk of functionality I'm ok putting in supervisor: sync/Waitgroup / golang.org/x/sync/errgroup.Group / github.com/oklog/run.Group / github.com/thejerf/suture.Supervisor -type functionality. When I see other things too far removed from that core idea being tacked on to it, I have a viscerally negative reaction (maybe justified, maybe not). If there were code to run an external process as a supervisor.Process that would be one thing, but that's not what supervisor.Command is trying to be. That maybe came out more negative than I wanted.

[^1]: Or maybe the answer is "let's try it as an experiment, and if we go too far down the line of adding a million wrappers, then we re-evaluate the experiment." But like, my sense of design is that that isn't the way to go.

[^2]: Writing that, I just came up with what might be a reasonable approach based on Go 1.13 error wrapping. We should talk about that later.

Regarding this details of PR: I realized that I didn't justify in the docs why it doesn't log stdin/stderr/stdout if it is an *os.File. The short version is that it's impossible to do robustly (at least for the case of stdin, it occurs to me that the proof I gave myself doesn't nescessarily apply to stdout/stderr) even if we're just looking at it from a regular read/write I/O perspective (pretending to a moment that there's no reason we'd ever want ioctls on stdin/out/err to behave normally).

datawire / teleproxy

Add context-based logging (pkg/dlog, pkg/logexec) #160