hashicorp / go-plugin

Golang plugin system over RPC.
Mozilla Public License 2.0
5.25k stars 450 forks source link

gRPC stdout/stderr syncing support #135

Closed mitchellh closed 4 years ago

mitchellh commented 4 years ago

Fixes #93

This is a feature that has existed in go-plugin for a few years by setting SyncStdout, SyncStderr on ClientConfig, but this feature only worked with net/rpc plugins previously.

This feature will sync the os.Stdout/Stderr output from the plugin to the specified writers. If no writers are specified, it drops the data. Note that the os.Stdout/Stderr that is synced is the overridden writers that you can see set in server.go around line 250 at the time of this commit.

The use case for this feature is that a plugin that subprocesses for example can connect stdout/stderr and the output of the subprocess will reach the target synced stream (which may just be the host process stdout/stderr).

Backwards Compatibility

Should be preserved. In grpc_client.go we noop the stdio syncing if we get a codes.Unavailable when trying to connect to the stdio service. This will make existing plugins that don't serve this service work fine.

Implementation Notes

This is implemented via a fairly naive "stdio" gRPC service. We chunk the data at 1KB boundaries currently because in practice stdout/stderr is pretty quiet. We don't expect so much data ever that this will be a problem. If it is, we can change this.

There are various improvements we could make but I don't think matter: right now even if syncing is disabled, all plugins will still transport all stdout/stderr to the host process. Most plugins don't have ANY stdout/stderr so this shouldn't matter. I didn't do this because net/rpc doesn't do this and I mimicked that implementation.

mitchellh commented 4 years ago

And thanks to #113. This is very similar and inspired by the impl., I changed it a bit and added tests.

yurishkuro commented 4 years ago

@mitchellh could you please help my understand (also this https://github.com/hashicorp/go-plugin/pull/113#issuecomment-586667462) why GRPC needs to be involved in shipping stdout/err back to parent process, instead of using Cmd.StdoutPipe/StderrPipe?

mitchellh commented 4 years ago

@mitchellh could you please help my understand (also this #113 (comment)) why GRPC needs to be involved in shipping stdout/err back to parent process, instead of using Cmd.StdoutPipe/StderrPipe?

Yeah sure! We use the real process stdout/stderr (i'll call them FD1 and FD2 so we don't get mixed up with the "synced" stdout/stderr) for internal purposes:

So what we do instead is overwrite os.Stdout/Stderr in plugin.Serve to go to the pipes that get streamed back -- in this case over a gRPC service and in net/rpc's case a Yamux stream.

Is it possible to repurpose FD1/FD2 for stdout/stderr syncing? Probably, but I think it would introduce a lot more complexity and risk destabilizing our plugin system much more than this approach.

yurishkuro commented 4 years ago

Thanks, makes sense now. Agree that it's not worth complicating the handshake protocol at this point (although if plugins log a lot, it probably would've been more efficient to reuse stdio streams).

FD2 (stderr) is used as the log output. Anything on FD2 is re-logged in the host process logger instance.

I guess the issues we've seen some people having with non-Go implementation of plugins for Jaeger were due to them writing logs to stdout, not stderr.

mitchellh commented 4 years ago

Addressed feedback.

mitchellh commented 4 years ago

Whoops sorry @paultyng I re-requested your review but meant @jbardin. :)

marefr commented 4 years ago

@mitchellh regarding backwards compatibility I don't agree that works as expected/explained above. Already merged #139 plus a new one opened #142, but those are more patching the issues. Maybe we need to rethink what error codes to check for identifying old plugin?

Have you been able to reproduce the behavior of getting back an Unavailable error here or here

Let me know if rather would like me to open an issue about it.

mitchellh commented 4 years ago

I did test Unavailable prior to merge. If you remove that check, old tests will fail. They probably won't anymore since we have some additional checks now but that was the first one that went in.