redhat-cop / patch-operator

An operator to apply patches to Kubernetes objects in a declarative way.

Bug in creation-time patch injection when trying to patch MachineSets #58

Open callum-stakater opened 10 months ago

callum-stakater commented 10 months ago

Attempting to deploy a MachineSet via openshift-gitops (the issue is also visible with a plain `oc apply`, so it isn't gitops-related), using patch-operator to dynamically inject the infrastructure ID via the following patch:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: test-1
  namespace: openshift-machine-api
  annotations:
    "redhat-cop.redhat.io/patch": |
      spec:
        selector:
          matchLabels:
            machine.openshift.io/cluster-api-cluster: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}

spec:
  replicas: 0
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: REPLACEME
      machine.openshift.io/cluster-api-machineset: REPLACEME-tester-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: REPLACEME
        machine.openshift.io/cluster-api-machine-role: tester
        machine.openshift.io/cluster-api-machine-type: tester
        machine.openshift.io/cluster-api-machineset: REPLACEME-tester-0
    spec:
      lifecycleHooks: {}
      metadata: {}
      providerSpec:
        value:
          cloudName: openstack
          networks:
            - filter: {}
              subnets:
                - filter:
                    name: REPLACEME-nodes
                    tags: openshiftClusterID=REPLACEME
          userDataSecret:
            name: tester-user-data
          serverGroupName: REPLACEME-tester
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          metadata:
            creationTimestamp: null
          serverMetadata:
            Name: REPLACEME-tester
            openshiftClusterID: REPLACEME
          securityGroups:
            - filter: {}
              name: REPLACEME-tester
          kind: OpenstackProviderSpec
          tags:
            - openshiftClusterID=REPLACEME
          image: ''
          rootVolume:
            deviceType: ''
            diskSize: 120
            sourceType: image
            sourceUUID: REPLACEME-rhcos
            volumeType: __DEFAULT__
          apiVersion: openstackproviderconfig.openshift.io/v1alpha1
          flavor: hm.8x24

results in an error: Error "failed calling webhook "patch-operator-inject.redhatcop.redhat.io": failed to call webhook: Post "https://patch-operator-controller-manager-service.patch-operator.svc:443/inject?timeout=10s": EOF" for field "undefined".

and the following stack trace in the pod logs:

2023/08/11 17:12:39 http: panic serving 10.128.0.2:36988: runtime error: invalid memory address or nil pointer dereference
goroutine 438 [running]:
net/http.(*conn).serve.func1()
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:1825 +0xbf
panic({0x15af8e0, 0x24eb550})
    /opt/hostedtoolcache/go/1.18.5/x64/src/runtime/panic.go:844 +0x258
k8s.io/apimachinery/pkg/util/strategicpatch.mergeMap(0xc00165fe90, 0xc00165fe90?, {0x0, 0x0}, {0x80?, 0x64?})
    /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/strategicpatch/patch.go:1355 +0x751
k8s.io/apimachinery/pkg/util/strategicpatch.mergeMapHandler({0x15a1c60?, 0xc00165fe90?}, {0x15a1c60?, 0xc0058b0330?}, {0x0, 0x0}, {0x0, 0x0}, {0x1, 0x1})
    /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/strategicpatch/patch.go:1413 +0x8a
k8s.io/apimachinery/pkg/util/strategicpatch.mergeMap(0xc00165f320, 0xc001652180?, {0x19d9ab0, 0xc00164ca50}, {0xc8?, 0x38?})
    /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/strategicpatch/patch.go:1363 +0x8a5
k8s.io/apimachinery/pkg/util/strategicpatch.StrategicMergeMapPatchUsingLookupPatchMeta(0xc001652180?, 0xc00575b5b5?, {0x19d9ab0?, 0xc00164ca50?})
    /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/strategicpatch/patch.go:880 +0x2b
k8s.io/apimachinery/pkg/util/strategicpatch.StrategicMergePatchUsingLookupPatchMeta({0xc000520400?, 0x19d9f80?, 0xc00575dc20?}, {0xc001652180, 0x5f, 0x60}, {0x19d9ab0, 0xc00164ca50})
    /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/strategicpatch/patch.go:831 +0x85
github.com/redhat-cop/patch-operator/controllers.(*PatchInjector).Handle(_, {_, _}, {{{0xc0056b5290, 0x24}, {{0xc005689e60, 0x14}, {0xc00582aaa8, 0x7}, {0xc00582aac0, ...}}, ...}})
    /home/runner/work/patch-operator/patch-operator/controllers/inject_webhook.go:129 +0xe05
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).Handle(_, {_, _}, {{{0xc0056b5290, 0x24}, {{0xc005689e60, 0x14}, {0xc00582aaa8, 0x7}, {0xc00582aac0, ...}}, ...}})
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/webhook/admission/webhook.go:146 +0xa2
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).ServeHTTP(0xc0056fed40, {0x7f03507f65d0?, 0xc00573e960}, 0xc00570ef00)
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/webhook/admission/http.go:98 +0xe90
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1({0x7f03507f65d0, 0xc00573e960}, 0x19d9300?)
    /home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:40 +0xd4
net/http.HandlerFunc.ServeHTTP(0x19d93c0?, {0x7f03507f65d0?, 0xc00573e960?}, 0xc00574da38?)
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1({0x19d93c0?, 0xc0001948c0?}, 0xc00570ef00)
    /home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:117 +0xaa
net/http.HandlerFunc.ServeHTTP(0x2511680?, {0x19d93c0?, 0xc0001948c0?}, 0xc00574d9c0?)
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2({0x19d93c0, 0xc0001948c0}, 0xc00570ef00)
    /home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:84 +0xbf
net/http.HandlerFunc.ServeHTTP(0x40d0071d0?, {0x19d93c0?, 0xc0001948c0?}, 0xc00576ad30?)
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:2084 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc0056b526d?, {0x19d93c0, 0xc0001948c0}, 0xc00570ef00)
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:2462 +0x149
net/http.serverHandler.ServeHTTP({0x19cce40?}, {0x19d93c0, 0xc0001948c0}, 0xc00570ef00)
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:2916 +0x43b
net/http.(*conn).serve(0xc0057aa280, {0x19d9f80, 0xc0007b03c0})
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:1966 +0x5d7
created by net/http.(*Server).Serve
    /opt/hostedtoolcache/go/1.18.5/x64/src/net/http/server.go:3071 +0x4db
I0811 17:12:43.586998       1 leaderelection.go:258] successfully acquired lease patch-operator/8dcd3a3f.redhat.io

In general patch-operator is functional, and MachineSets can be patched; for example, this works:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: test-1
  namespace: openshift-machine-api
  annotations:
    "redhat-cop.redhat.io/patch": |
      spec:
        replicas: 1

spec:
  replicas: 0
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: REPLACEME
      machine.openshift.io/cluster-api-machineset: REPLACEME-tester-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: REPLACEME
        machine.openshift.io/cluster-api-machine-role: tester
        machine.openshift.io/cluster-api-machine-type: tester
        machine.openshift.io/cluster-api-machineset: REPLACEME-tester-0
    spec:
      lifecycleHooks: {}
      metadata: {}
      providerSpec:
        value:
          cloudName: openstack
          networks:
            - filter: {}
              subnets:
                - filter:
                    name: REPLACEME-nodes
                    tags: openshiftClusterID=REPLACEME
          userDataSecret:
            name: tester-user-data
          serverGroupName: REPLACEME-tester
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          metadata:
            creationTimestamp: null
          serverMetadata:
            Name: REPLACEME-tester
            openshiftClusterID: REPLACEME
          securityGroups:
            - filter: {}
              name: REPLACEME-tester
          kind: OpenstackProviderSpec
          tags:
            - openshiftClusterID=REPLACEME
          image: ''
          rootVolume:
            deviceType: ''
            diskSize: 120
            sourceType: image
            sourceUUID: REPLACEME-rhcos
            volumeType: __DEFAULT__
          apiVersion: openstackproviderconfig.openshift.io/v1alpha1
          flavor: hm.8x24

The example attempts to create a MachineSet on the OpenStack provider, but the same behaviour was seen on vSphere as well, so the issue isn't provider-specific.
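For context on what the failing path is supposed to do: for plain maps, a strategic merge patch behaves like a recursive dictionary merge (list-merge directives and the patch metadata that apimachinery's `strategicpatch.mergeMap` consults, where the trace above panics, are ignored in this sketch). A minimal stdlib sketch of the intended outcome, using a hypothetical infrastructure name `mycluster-abc12`:

```python
def strategic_like_merge(original: dict, patch: dict) -> dict:
    """Recursive map merge -- roughly what a strategic merge patch does for
    plain maps. Real strategic merge also honors per-field patch metadata,
    which this sketch deliberately omits."""
    out = dict(original)
    for key, val in patch.items():
        if isinstance(val, dict) and isinstance(out.get(key), dict):
            out[key] = strategic_like_merge(out[key], val)
        else:
            out[key] = val
    return out

# The manifest's selector before patching, and the rendered annotation patch.
# "mycluster-abc12" is a hypothetical example infrastructure name.
original = {
    "selector": {
        "matchLabels": {
            "machine.openshift.io/cluster-api-cluster": "REPLACEME",
            "machine.openshift.io/cluster-api-machineset": "REPLACEME-tester-0",
        }
    }
}
patch = {
    "selector": {
        "matchLabels": {
            "machine.openshift.io/cluster-api-cluster": "mycluster-abc12",
        }
    }
}

merged = strategic_like_merge(original, patch)
print(merged["selector"]["matchLabels"]["machine.openshift.io/cluster-api-cluster"])
# -> mycluster-abc12
```

Note that the untouched `machine.openshift.io/cluster-api-machineset` label survives the merge; the webhook should produce this merged selector rather than panicking.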

callum-stakater commented 10 months ago

OK, I found a way around it by using a JSON patch instead, which is arguably a bit nicer anyway:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: test-1
  namespace: openshift-machine-api
  annotations:
    "redhat-cop.redhat.io/patch-type": "application/json-patch+json"
    "redhat-cop.redhat.io/patch": |
      - op: replace
        path: /spec/selector/matchLabels/machine.openshift.io~1cluster-api-cluster
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}
      - op: replace
        path: /spec/selector/matchLabels/machine.openshift.io~1cluster-api-machineset
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-tester-0
      - op: replace
        path: /spec/template/metadata/labels/machine.openshift.io~1cluster-api-cluster
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}
      - op: replace
        path: /spec/template/metadata/labels/machine.openshift.io~1cluster-api-machineset
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-tester-0
      - op: replace
        path: /spec/template/spec/providerSpec/value/networks/0/subnets/0/filter/name
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-nodes
      - op: replace
        path: /spec/template/spec/providerSpec/value/networks/0/subnets/0/filter/tags
        value: openshiftClusterID={{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}
      - op: replace
        path: /spec/template/spec/providerSpec/value/userDataSecret/name
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-user-data
      - op: replace
        path: /spec/template/spec/providerSpec/value/serverGroupName
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-tester
      - op: replace
        path: /spec/template/spec/providerSpec/value/metadata/serverMetadata/openshiftClusterID
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}
      - op: replace
        path: /spec/template/spec/providerSpec/value/securityGroups/0/filter/name
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-tester
      - op: replace
        path: /spec/template/spec/providerSpec/value/tags/0
        value: openshiftClusterID={{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}
      - op: replace
        path: /spec/template/spec/providerSpec/value/rootVolume/sourceUUID
        value: {{ (lookup "config.openshift.io/v1" "Infrastructure" "" "cluster").status.infrastructureName }}-rhcos

spec:
  replicas: 0
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: REPLACEME
      machine.openshift.io/cluster-api-machineset: REPLACEME-tester-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: REPLACEME
        machine.openshift.io/cluster-api-machine-role: tester
        machine.openshift.io/cluster-api-machine-type: tester
        machine.openshift.io/cluster-api-machineset: REPLACEME-tester-0
    spec:
      lifecycleHooks: {}
      metadata: {}
      providerSpec:
        value:
          cloudName: openstack
          networks:
            - filter: {}
              subnets:
                - filter:
                    name: REPLACEME-nodes
                    tags: openshiftClusterID=REPLACEME
          userDataSecret:
            name: tester-user-data
          serverGroupName: REPLACEME-tester
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          metadata:
            creationTimestamp: null
          serverMetadata:
            Name: REPLACEME-tester
            openshiftClusterID: REPLACEME
          securityGroups:
            - filter: {}
              name: REPLACEME-tester
          kind: OpenstackProviderSpec
          tags:
            - openshiftClusterID=REPLACEME
          image: ''
          rootVolume:
            deviceType: ''
            diskSize: 120
            sourceType: image
            sourceUUID: REPLACEME-rhcos
            volumeType: __DEFAULT__
          apiVersion: openstackproviderconfig.openshift.io/v1alpha1
          flavor: hm.8x24
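The odd-looking `~1` in the patch paths above is RFC 6901 JSON Pointer escaping: `~1` decodes to `/` and `~0` to `~`, which is how a label key containing a slash (e.g. `machine.openshift.io/cluster-api-cluster`) is addressed. A stdlib-only sketch of how a single `replace` op resolves such a path (the infrastructure name `mycluster-abc12` is a hypothetical example):

```python
def unescape(token: str) -> str:
    # RFC 6901 ordering matters: first "~1" -> "/", then "~0" -> "~".
    return token.replace("~1", "/").replace("~0", "~")

def apply_replace(doc, path: str, value) -> None:
    """Apply a single JSON-patch 'replace' op in place (minimal sketch)."""
    # Split on "/" BEFORE unescaping -- that is the whole point of "~1".
    tokens = [unescape(t) for t in path.removeprefix("/").split("/")]
    target = doc
    for t in tokens[:-1]:
        target = target[int(t)] if isinstance(target, list) else target[t]
    last = tokens[-1]
    if isinstance(target, list):
        target[int(last)] = value
    else:
        target[last] = value

machineset = {
    "spec": {
        "selector": {
            "matchLabels": {
                "machine.openshift.io/cluster-api-cluster": "REPLACEME",
            }
        }
    }
}

apply_replace(
    machineset,
    "/spec/selector/matchLabels/machine.openshift.io~1cluster-api-cluster",
    "mycluster-abc12",
)
print(machineset["spec"]["selector"]["matchLabels"])
# -> {'machine.openshift.io/cluster-api-cluster': 'mycluster-abc12'}
```

Because each op targets one exact pointer, the JSON-patch form sidesteps the strategic-merge code path in the webhook entirely, which is presumably why it avoids the panic.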

pabrahamsson commented 7 months ago

I can confirm the same issue & workaround on Azure.