containerd / ttrpc

GRPC for low-memory environments
Apache License 2.0
558 stars 80 forks source link

client:fix ttrpc send hang #173

Closed ningmingxiao closed 3 weeks ago

ningmingxiao commented 2 months ago

fix:https://github.com/containerd/ttrpc/issues/174 @dmcgowan @fuweid @thaJeztah can you review my pr? thank you.

dmcgowan commented 1 month ago

The client context scope should not be considered during send here. Since the context is at the client scope, it would be expected that unblocking sends would be done by closing the underlying connection. The sender could be legitimately waiting on writing to the connection.

What is the caller error that would cause a hang here? The caller has timed out by something is waiting for ttrpc to finish? Seems that should be handled by the caller not ttrpc.

ningmingxiao commented 1 month ago

I use containerd 1.7.3 https://github.com/containerd/containerd/blob/v1.7.3/services/tasks/local.go#L340 containerd try to connect shim to get container status, it hanged 1818 minutes.

ctx timeout is 2s.

func getProcessState(ctx context.Context, p runtime.Process) (*task.Process, error) {
    ctx, cancel := timeout.WithContext(ctx, stateTimeout)
    defer cancel()

    state, err := p.State(ctx)

createStream ctx never timeout. It should check timeout in dispatch func.

func (c *Client) createStream(flags uint8, b []byte) (*stream, error) {
...
    select {
    case <-c.ctx.Done():
// NewClient creates a new ttrpc client using the given connection
func NewClient(conn net.Conn, opts ...ClientOpts) *Client {
    ctx, cancel := context.WithCancel(context.Background())
kevpar commented 3 weeks ago

As I mentioned here, this may be a duplicate of an issue we already fixed.