projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

IP got leaked when there is an etcd store network connection problem #5203

Open jmf0526 opened 4 years ago

jmf0526 commented 4 years ago

Expected Behavior

Calico IPAM should tolerate network faults when updating an IP block in the etcd store; an assigned IP should never be leaked (it should always be releasable).

Current Behavior

I'm using the annotation cni.projectcalico.org/ipAddrs to assign a static IP to a pod, like:

    Name:               licc2-6755b8ffbd-tpdxl
    Namespace:          base-service
    Priority:           0
    PriorityClassName:
    Node:               ksdlsdn02/172.16.1.2
    Start Time:         Wed, 23 Sep 2020 18:52:53 +0800
    Labels:             app=app-licc2
                        pod-template-hash=2311649968
    Annotations:        cni.projectcalico.org/ipAddrs=["177.177.198.2"]
                        k8s.v1.cni.cncf.io/networks-status=

Everything worked fine. After the environment had been running stably for a long time, the pod got evicted (because of a disk shortage on that node). The replacement pod was stuck in ContainerCreating status, and 'kubectl describe po' showed that the specified IP was already assigned:

error getting IP from IPAM: Address already assigned in block

Possible Solution

I checked the Calico etcd data: IP 177.177.198.2 is occupied by container fd42c6f630ec6d6e9fc6597d981ede59c5df245cebc0a27c676adeb4f9b15556, pod licc2-6755b8ffbd-89v67. The etcd IP block:

/calico/ipam/v2/assignment/ipv4/block/177.177.198.0-26 {"cidr":"177.177.198.0/26","affinity":"","strictAffinity":false,"allocations":[0,1,3,2,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null],"unallocated":[4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63],"attributes":[{"handle_id":null,"secondary":null},{"handle_id":"k8s-pod-network.d3f953922b58df373d12fff02c64ee46a960945eb75bf76404bade008ffc086d","secondary":{"namespace":"base-service","node":"ksdlsdn01","pod":"licc1-7564dffd8f-4x7h8"}},{"handle_id":"k8s-pod-network.60d4c0f645df9e356d38361c8d8cb15ccc0cf26af6e11d61b2cb7316bccd6b39","secondary":{"namespace":"base-service","node":"ksdlsdn03","pod":"licc3-84564d456f-kcvx4"}},{"handle_id":"k8s-pod-network.fd42c6f630ec6d6e9fc6597d981ede59c5df245cebc0a27c676adeb4f9b15556","secondary":{"namespace":"base-service","node":"ksdlsdn02","pod":"licc2-6755b8ffbd-89v67"}}],"deleted":false}

but there is no corresponding IPAM handle for k8s-pod-network.fd42c6f630ec6d6e9fc6597d981ede59c5df245cebc0a27c676adeb4f9b15556 in etcd.

The following log was printed in kubelet:

Jul 24 20:42:00 localhost kubelet: 2020-07-24 20:42:00.210 [INFO][298868] ipam.go 1263: Creating new handle: k8s-pod-network.fd42c6f630ec6d6e9fc6597d981ede59c5df245cebc0a27c676adeb4f9b15556

Jul 24 20:43:10 localhost kubelet: 2020-07-24 20:43:10.207 [WARNING][298868] etcdv3.go 204: Update failed error=rpc error: code = Unavailable desc = transport is closing etcdv3-etcdKey="/calico/ipam/v2/assignment/ipv4/block/177.177.198.0-26" model-etcdKey=BlockKey(cidr=177.177.198.0/26) rev="27359630" ttl=0s value=&model.AllocationBlock{CIDR:net.IPNet{IPNet:net.IPNet{IP:net.IP{0xb1, 0xb1, 0xc6, 0x0}, Mask:net.IPMask{0xff, 0xff, 0xff, 0xc0}}}, Affinity:(string)(0xc00047c0c0), StrictAffinity:false, Allocations:[]int{(int)(0xc000434d90), (int)(0xc000434d98), (int)(0xc000434ed0), (int)(0xc000434da0), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil), (int)(nil)}, Unallocated:[]int{4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63}, Attributes:[]model.AllocationAttribute{model.AllocationAttribute{AttrPrimary:(string)(nil), AttrSecondary:map[string]string(nil)}, model.AllocationAttribute{AttrPrimary:(string)(0xc00047c110), AttrSecondary:map[string]string{"namespace":"base-service", "node":"ksdlsdn01", "pod":"licc1-7564dffd8f-4x7h8"}}, model.AllocationAttribute{AttrPrimary:(string)(0xc00047c1d0), 
AttrSecondary:map[string]string{"namespace":"base-service", "node":"ksdlsdn03", "pod":"licc3-84564d456f-kcvx4"}}, model.AllocationAttribute{AttrPrimary:(string)(0xc0006d4180), AttrSecondary:map[string]string{"node":"ksdlsdn02", "pod":"licc2-6755b8ffbd-89v67", "namespace":"base-service"}}}, Deleted:false, HostAffinity:(*string)(nil)}

Jul 24 20:43:10 localhost kubelet: 2020-07-24 20:43:10.207 [WARNING][298868] ipam.go 654: Update failed on block 177.177.198.0/26 error=rpc error: code = Unavailable desc = transport is closing

"transport is closing" suggests there was a network problem when updating the IP block in the etcdv3 store, but the data actually got saved successfully.

The following log shows that calico-ipam didn't release the IP because it couldn't find the corresponding IPAM handle:

Jul 24 21:00:35 localhost kubelet: 2020-07-24 21:00:35.321 [WARNING][323525] ipam_plugin.go 306: Asked to release address but it doesn't exist. Ignoring ContainerID="fd42c6f630ec6d6e9fc6597d981ede59c5df245cebc0a27c676adeb4f9b15556" HandleID="k8s-pod-network.fd42c6f630ec6d6e9fc6597d981ede59c5df245cebc0a27c676adeb4f9b15556" Workload="ksdlsdn02-k8s-licc2--6755b8ffbd--89v67-eth0"

After checking the AssignIP function in ipam.go, I think the cause of this problem is that when c.blockReaderWriter.updateBlock returns an error, the IPAM handle gets deleted, but in this scenario the IP block actually got updated. So when IPAM later tries to release the IP, it can't find the corresponding handle, and the assigned IP never gets released.

https://github.com/projectcalico/libcalico-go/blob/d414dc7c1c754fd3fabf58d769e4c426df3c106f/lib/ipam/ipam.go#L795

        // Increment handle.
        if args.HandleID != nil {
            c.incrementHandle(ctx, *args.HandleID, blockCIDR, 1, args.Attrs)
        }

        // Update the block using the original KVPair to do a CAS.  No need to
        // update the Value since we have been manipulating the Value pointed to
        // in the KVPair.
        _, err = c.blockReaderWriter.updateBlock(ctx, obj)
        if err != nil {
            if _, ok := err.(cerrors.ErrorResourceUpdateConflict); ok {
                log.WithError(err).Debug("CAS error assigning IP - retry")
                continue
            }

            log.WithError(err).Warningf("Update failed on block %s", block.CIDR.String())
            if args.HandleID != nil {
                if err := c.decrementHandle(ctx, *args.HandleID, blockCIDR, 1); err != nil {
                    log.WithError(err).Warn("Failed to decrement handle")
                }
            }
            return err
        }
        return nil
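The failure sequence above can be sketched with a toy in-memory store (all names here are illustrative, not Calico's actual types): the store persists the block update but still reports a transport error, so the caller's cleanup removes the handle while the allocation survives, leaving exactly the inconsistent state seen in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

var errTransport = errors.New("rpc error: code = Unavailable desc = transport is closing")

// store is a hypothetical in-memory stand-in for the etcd-backed datastore.
type store struct {
	blockAllocations map[string]string // ip -> handleID
	handles          map[string]int    // handleID -> refcount
}

func (s *store) incrementHandle(handleID string) { s.handles[handleID]++ }

func (s *store) decrementHandle(handleID string) {
	s.handles[handleID]--
	if s.handles[handleID] <= 0 {
		delete(s.handles, handleID)
	}
}

// updateBlock commits the allocation, then reports a transport failure
// anyway, mimicking the etcd behaviour described in this issue.
func (s *store) updateBlock(ip, handleID string) error {
	s.blockAllocations[ip] = handleID // write actually lands in the store
	return errTransport               // ...but the caller sees an error
}

// assignIP follows the same shape as the AssignIP error path quoted above:
// on update failure the handle is decremented, even though the block update
// may in fact have been persisted server-side.
func assignIP(s *store, ip, handleID string) error {
	s.incrementHandle(handleID)
	if err := s.updateBlock(ip, handleID); err != nil {
		s.decrementHandle(handleID) // cleanup assumes the update failed
		return err
	}
	return nil
}

func main() {
	s := &store{blockAllocations: map[string]string{}, handles: map[string]int{}}
	err := assignIP(s, "177.177.198.2", "k8s-pod-network.fd42c6f6")
	_, allocated := s.blockAllocations["177.177.198.2"]
	_, handleExists := s.handles["k8s-pod-network.fd42c6f6"]
	fmt.Printf("assign error: %v, ip allocated: %v, handle exists: %v\n",
		err != nil, allocated, handleExists)
	// The IP stays allocated in the block but the handle is gone, so a later
	// release logs "Asked to release address but it doesn't exist" and leaks it.
}
```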

Steps to Reproduce (for bugs)

I don't know how to reproduce this reliably, but IP leakage can also be triggered by deleting the IPAM handle manually.

Context

Your Environment

caseydavenport commented 3 years ago

@jmf0526 this is an interesting one, sorry for missing it for so long!

It sounds like what you're suggesting is that etcd has returned an error to Calico even though it successfully handled the request? As a result, we think it failed and attempt to clean up, leaving ourselves in an unexpected state where the allocation exists but the handle has been removed. Is that right?

If so, I'll need to think about whether there's anything we can do here. I don't know if we can just handle transport errors separately, since we don't know for sure whether the request succeeded or not.

If you do hit this and want to get past it, you can clean up the allocation by hand using calicoctl ipam.
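A manual cleanup along those lines might look like this (commands sketched from calicoctl's ipam subcommands; verify the flags against your calicoctl version, and only release an address once you're sure no live workload is using it):

```shell
# Inspect which block and handle currently own the leaked address.
calicoctl ipam show --ip=177.177.198.2

# Release the leaked allocation so the static IP can be reassigned.
calicoctl ipam release --ip=177.177.198.2
```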

jmf0526 commented 3 years ago

> @jmf0526 this is an interesting one, sorry for missing it for so long!
>
> It sounds like what you're suggesting is that etcd has returned an error to Calico even though it successfully handled the request? As a result, we think it failed and attempt to clean up, leaving ourselves in an unexpected state where the allocation exists but the handle has been removed. Is that right?

Yes, that's exactly what I was trying to say.

> If so, I'll need to think about whether there's anything we can do here. I don't know if we can just handle transport errors separately, since we don't know for sure whether the request succeeded or not.

> If you do hit this and want to get past it, you can clean up the allocation by hand using calicoctl ipam.

I also feel this can't be handled easily. We have already recovered the environment using calicoctl ipam; thanks for the tip.