k8snetworkplumbingwg / ovs-cni

Open vSwitch CNI plugin
Apache License 2.0
218 stars 70 forks source link

Host local IPAM configured with erroneous NAD results in stale ovs ports #258

Closed vinay50muddu closed 1 year ago

vinay50muddu commented 1 year ago

When using ovs-cni with host-local IPAM, if there is any invalid configuration specific to IPAM in the NAD, ovs port is created but IPAM rejects the configuration and throws an error. The ovs port is not cleared. This results in multiple stale ports over a period of time. How to reproduce:

  1. Create a NAD with the following content.

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
    name: ovs-green-bad-net
    spec:
    config: '{
      "cniVersion": "0.3.1",
      "type": "ovs",
      "bridge": "br_vin",
      "mtu": 9000,
      "vlan": 1029,
      "ipam": {
          "type": "host-local",
          "ranges": [
              [ {
                   "subnet": "172.100.29.0/24",
                   "rangeStart": "172.100.29.100",
                   "rangeEnd": "172.100.100.254",
                   "gateway": "172.100.100.1"
              } ]
          ]
      }
    }'
  2. create a sample deployment that uses this NAD.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
    labels:
    app: ovs-hello-app-dep
    name: ovs-hello-app-dep
    spec:
    replicas: 1
    selector:
    matchLabels:
      app: ovs-hello-app
    strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
    template:
    metadata:
      labels:
        app: ovs-hello-app
      annotations:
        k8s.v1.cni.cncf.io/networks: ovs-green-bad-net
    spec:
      containers:
      - image: gcr.io/google-samples/hello-app:1.0
        imagePullPolicy: IfNotPresent
        name: hello-app-container

We see error messages in the pod log and ports are created which are never removed.

phoracek commented 1 year ago

Hello @vinay50muddu, thanks for reporting this. I have drafted a fix under https://github.com/k8snetworkplumbingwg/ovs-cni/pull/259

vinay50muddu commented 1 year ago

@phoracek thanks for fixing this. One query i have w.r.t the fix. If cmdDel of any IPAM fails, will the fix also takes care of deleting the created OVS ports? I think the issue to be fixed is any IPAM failure should be handled gracefully instead of terminating abruptly and returning.

phoracek commented 1 year ago

I think it cleans up after any IPAM problem. Is it not?