networkservicemesh / sdk-vpp

Apache License 2.0
2 stars 19 forks source link

`sdk-vpp` leaks VPP resources in case of context timeout #378

Open Bolodya1997 opened 3 years ago

Bolodya1997 commented 3 years ago

Expected Behavior

sdk-vpp shouldn't leak any resources.

Current Behavior

sdk-vpp sometimes leaks VPP resources in case of context canceled/timeout:

func Request(ctx context.Context) (conn, err) {
    if err := requestVPP(ctx); err != nil {
        return nil, err // <-- error can be caused by context canceled/timeout
    }
}

It affects almost every sdk-vpp chain element, so it looks like a critical issue.

Failure Logs

forwarder.log

  1. First Request fails with context deadline exceeded in tap chain element.
  2. All subsequent Requests fail with VPPApiError: netlink error (-145) because of already existing kernel interface with the requested name.

Possible solution

We should probably do something like the following in all sdk-vpp chain elements which allocates some VPP resources:

func Request(ctx context.Context) (conn, err) {
    postponeCtxFunc := postpone.Context(ctx)

    conn, err := next.Request()
    if err != nil {
        return nil, err
    }

    addDel := &vppAddDelSomething{...}
    if err := addDelSomething(ctx, addDel); err != nil {
        if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
            delCtx, cancelDel := postponeCtxFunc()
            defer cancelDel()

            addDel.IsAdd = false
            _ = addDelSomething(delCtx, addDel)
        }
        return nil, err
    }

    return conn, nil
}
Bolodya1997 commented 3 years ago

@edwarnicke @denis-tingaikin It seems to be critical, please take a look.

denis-tingaikin commented 3 years ago

I think we can create an helper function that will call cancel if context has been expired during resource allocation.

@edwarnicke Please share your thoughts on this issue.