Kong / kong-operator

Kong Operator for Kubernetes and OpenShift
https://konghq.com
Apache License 2.0

Incorrect image version being deployed via Operator based on values supplied #76

Closed: ahuffman closed this issue 2 years ago

ahuffman commented 2 years ago

Reported by Casey Wylie from Red Hat:

When the helm chart is installed via the Operator, extra values (unifiedRepoTag) are added to the chart, which ultimately leads to the rendered manifests not having the correct image values.

To find out the exact values to use in the operator, I first installed the Kong Gateway (CP) through the helm chart exactly as described in the doc to ensure that I am able to reach the Kong Manager.
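
A minimal form of that install, assuming the standard Kong chart repo and a values file matching the output below (exact flags are an assumption, not quoted from the original report):

helm repo add kong https://charts.konghq.com
helm install kong kong/kong -n kong --create-namespace --values values.yaml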

After successfully installing through the helm chart and verifying that I am able to reach the Kong Manager, I double checked the values being used in the install.

helm get values -n kong kong

USER-SUPPLIED VALUES:
admin:
  enabled: true
  http:
    enabled: true
  type: NodePort
cluster:
  enabled: true
  tls:
    containerPort: 8005
    enabled: true
    servicePort: 8005
clustertelemetry:
  enabled: true
  tls:
    containerPort: 8006
    enabled: true
    servicePort: 8006
enterprise:
  enabled: true
  license_secret: kong-enterprise-license
  portal:
    enabled: false
  rbac:
    enabled: false
  smtp:
    enabled: false
env:
  cluster_cert: /etc/secrets/kong-cluster-cert/tls.crt
  cluster_cert_key: /etc/secrets/kong-cluster-cert/tls.key
  database: postgres
  role: control_plane
image:
  repository: kong/kong-gateway
  tag: 2.8.0.0-alpine
ingressController:
  enabled: true
  image:
    repository: kong/kubernetes-ingress-controller
    tag: 2.2.1
  installCRDs: false
manager:
  enabled: true
  type: NodePort
postgresql:
  enabled: true
  postgresqlDatabase: kong
  postgresqlPassword: kong
  postgresqlUsername: kong
  securityContext:
    fsGroup: ""
    runAsUser: 1000660000
proxy:
  enabled: true
  type: ClusterIP
secretVolumes:
- kong-cluster-cert

After doing a helm uninstall and ensuring that all pods and PVCs are cleaned up, I install Kong through the operator using the same values.
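
For reference, the uninstall and cleanup check amount to roughly the following (release and namespace taken from the helm get values command above):

helm uninstall kong -n kong
kubectl get pods,pvc -n kong   # confirm no pods or PVCs remain

The Kong resource submitted to the operator: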

apiVersion: charts.konghq.com/v1alpha1
kind: Kong
metadata:
  name: kong
  namespace: kong
spec:
  admin:
    enabled: true
    http:
      enabled: true
    type: NodePort
  cluster:
    enabled: true
    tls:
      containerPort: 8005
      enabled: true
      servicePort: 8005
  clustertelemetry:
    enabled: true
    tls:
      containerPort: 8006
      enabled: true
      servicePort: 8006
  enterprise:
    enabled: true
    license_secret: kong-enterprise-license
    portal:
      enabled: false
    rbac:
      enabled: false
    smtp:
      enabled: false
  env:
    cluster_cert: /etc/secrets/kong-cluster-cert/tls.crt
    cluster_cert_key: /etc/secrets/kong-cluster-cert/tls.key
    database: postgres
    role: control_plane
  image:
    repository: kong/kong-gateway
    tag: 2.8.0.0-alpine
  ingressController:
    enabled: true
    image:
      repository: kong/kubernetes-ingress-controller
      tag: 2.2.1
    installCRDs: false
  manager:
    enabled: true
    type: NodePort
  postgresql:
    enabled: true
    postgresqlDatabase: kong
    postgresqlPassword: kong
    postgresqlUsername: kong
    securityContext:
      fsGroup: ""
      runAsUser: 1000660000
  proxy:
    enabled: true
    type: ClusterIP
  secretVolumes:
  - kong-cluster-cert

From here, we can run helm get values kong -n kong to see which values the Operator used when it deployed the helm chart, and we see the problem:

unifiedRepoTag: registry.connect.redhat.com/kong/kong@sha256:95848027a62e13abb7172840b930a1e0bcaf37554fd6c948ea7441f59be7146f
ingressController:
  enabled: true
  image:
    repository: kong/kubernetes-ingress-controller
    tag: 2.2.1
    unifiedRepoTag: registry.connect.redhat.com/kong/kong-ingress-controller@sha256:45230b6671f375bbe9f524e748702b112dcc1f3f5e8716a276e887ad2944bf33
  installCRDs: false
manager:
  enabled: true
  type: NodePort
postgresql:
  enabled: true
  postgresqlDatabase: kong
  postgresqlPassword: kong
  postgresqlUsername: kong
  securityContext:
    fsGroup: ""
    runAsUser: 1000660000
proxy:
  enabled: true
  type: ClusterIP
secretVolumes:
- kong-cluster-cert
waitImage:
  unifiedRepoTag: registry.access.redhat.com/ubi8/ubi@sha256:910f6bc0b5ae9b555eb91b88d28d568099b060088616eba2867b07ab6ea457c7

The unifiedRepoTag is overlaid on the image values, forcing the images to be pulled from the Red Hat registry (as expected from the operator).
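
The precedence is implemented in the chart templates; a minimal sketch of the idea (not the chart's exact helper) is:

{{- if .Values.image.unifiedRepoTag }}
image: {{ .Values.image.unifiedRepoTag }}
{{- else }}
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
{{- end }}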

To look at the images from the kong deployment (RH Registry images):

k get deploy -n kong kong-kong -oyaml | grep image:

image: registry.connect.redhat.com/kong/kong-ingress-controller@sha256:45230b6671f375bbe9f524e748702b112dcc1f3f5e8716a276e887ad2944bf33
image: registry.connect.redhat.com/kong/kong@sha256:95848027a62e13abb7172840b930a1e0bcaf37554fd6c948ea7441f59be7146f
image: registry.connect.redhat.com/kong/kong@sha256:95848027a62e13abb7172840b930a1e0bcaf37554fd6c948ea7441f59be7146f

The problem is not that it is pulling from the Red Hat registry, but that the images do not seem to be correct. We can verify this by checking the Kong version (we specified 2.8.0.0-alpine in both the helm chart and the Kong operator).

bash-3.2$ http $(k get routes -n kong kong-kong-admin -ojsonpath='{.status.ingress[0].host}') | jq -r .version

2.4.0 # this should be 2.8.0.0-enterprise

Now, to double check by uninstalling the operator and installing the helm chart again:

bash-3.2$ http $(k get routes -n kong kong-kong-admin -ojsonpath='{.status.ingress[0].host}') | jq -r .version

2.8.0.0-enterprise-edition
rainest commented 2 years ago

@cmwylie19 this was necessary to handle the requirement listed at https://redhat-connect.gitbook.io/certified-operator-guide/troubleshooting-and-resources/offline-enabled-operators#override-the-image-variable, and the image is being set from https://github.com/Kong/kong-operator/blob/daf53502f16a2531349b97a92ee5ce0b94cf7f9f/watches.yaml#L7-L9
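
For reference, the linked lines amount to an overrideValues block along these lines (paraphrased sketch; the env var names match the RELATED_IMAGE_* variables used later in this thread):

overrideValues:
  image.unifiedRepoTag: $RELATED_IMAGE_KONG
  ingressController.image.unifiedRepoTag: $RELATED_IMAGE_KONG_CONTROLLER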

Per Red Hat, the implementation of that overrideValues feature requires a single value for both the image URL and tag. This is not a Helm requirement, and we'd originally written our values.yaml to use separate values.

Switching to a single value outright would have been a risky breaking change to our chart configuration: existing users' image.repository and image.tag values would no longer be honored, forcibly upgrading their image versions if they did not notice that they needed to change the key. We thus opted not to do this and instead added unifiedRepoTag as an alternative key that overrides repository and tag.

While unifiedRepoTag isn't normally mentioned in the documentation, my understanding is that this is fine, because these values are instead set from the operator Deployment's environment: https://github.com/Kong/kong-operator/blob/daf53502f16a2531349b97a92ee5ce0b94cf7f9f/olm/0.10.0/kong.v0.10.0.clusterserviceversion.yaml#L163-L168
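
Those CSV lines amount to environment entries on the operator container along these lines (a sketch; digests abbreviated from the helm get values output above):

env:
- name: RELATED_IMAGE_KONG
  value: registry.connect.redhat.com/kong/kong@sha256:95848027a62e...
- name: RELATED_IMAGE_KONG_CONTROLLER
  value: registry.connect.redhat.com/kong/kong-ingress-controller@sha256:45230b6671f3...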

Is that not the expected workflow in this case? I also find separating the images out of values.yaml confusing from a UX standpoint, but that is what the Red Hat docs indicate we should do.

Aside from that, do you have recommendations for effectively managing both the community/operatorhub.io and Red Hat Marketplace variants of an operator? Marketplace variants require changes to core operator files rather than supplementary configuration, which has been a barrier to updating the Marketplace version consistently: there doesn't appear to be an obvious way to manage both without maintaining separate git repos or branches and dealing with merge conflicts. Ideally, we'd maintain a single branch, tag once per version, and have build-time configuration handle the necessary differences between the two.

mpaulgreen commented 2 years ago

@rainest which version of the operator will pull the 2.8.0.0-alpine image? v0.10.0 pulls the 2.5.0.0-alpine image.

rainest commented 2 years ago

The chart, and by extension the operator, are not tied to specific Kong versions. The provided defaults include the current version at the time of release; they're configurable by changing the contents of values.yaml, the Kong resource, or the controller environment variables mentioned above, depending on whether you're using the chart, the community operator, or the certified/Marketplace operator.
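
Concretely, pinning the gateway version takes one of these forms depending on the install method (a sketch; kong/kong refers to the standard Kong chart repo):

# Helm chart:
helm upgrade kong kong/kong -n kong --set image.repository=kong/kong-gateway --set image.tag=2.8.0.0-alpine
# community operator: set spec.image.repository and spec.image.tag in the Kong resource
# certified operator: override RELATED_IMAGE_KONG on the operator Deployment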

mpaulgreen commented 2 years ago

Thanks, @rainest. Will give it a try

cmwylie19 commented 2 years ago

Thank you! We were successful after:

Patching the kong-operator Deployment to update RELATED_IMAGE_KONG and RELATED_IMAGE_KONG_CONTROLLER:

kubectl patch deploy/kong-operator -n openshift-operators -p \
  "{\"spec\": {\"template\": {\"spec\": {\"containers\": [{\"name\": \"kong-operator\", \"env\": [{\"name\": \"RELATED_IMAGE_KONG\", \"value\": \"kong/kong-gateway:2.8.0.0-alpine\"}]}]}}}}"
kubectl patch deploy/kong-operator -n openshift-operators -p \
  "{\"spec\": {\"template\": {\"spec\": {\"containers\": [{\"name\": \"kong-operator\", \"env\": [{\"name\": \"RELATED_IMAGE_KONG_CONTROLLER\", \"value\": \"kong/kubernetes-ingress-controller:2.2.1\"}]}]}}}}"

Then we set unifiedRepoTag in the Kong resource:

kubectl apply -f -<<EOF
apiVersion: charts.konghq.com/v1alpha1
kind: Kong
metadata:
  name: kong
  namespace: kong
spec:
  admin:
    enabled: true
    http:
      enabled: true
    type: NodePort
  cluster:
    enabled: true
    tls:
      containerPort: 8005
      enabled: true
      servicePort: 8005
  clustertelemetry:
    enabled: true
    tls:
      containerPort: 8006
      enabled: true
      servicePort: 8006
  enterprise:
    enabled: true
    license_secret: kong-enterprise-license
    portal:
      enabled: true
    rbac:
      admin_gui_auth_conf_secret: admin-gui-session-conf
      enabled: true
      session_conf_secret: kong-session-config
    smtp:
      enabled: false
  env:
    cluster_cert: /etc/secrets/kong-cluster-cert/tls.crt
    cluster_cert_key: /etc/secrets/kong-cluster-cert/tls.key
    database: postgres
    password:
      valueFrom:
        secretKeyRef:
          key: password
          name: kong-enterprise-superuser-password
    portal_gui_protocol: http
    role: control_plane
  image:
    unifiedRepoTag: kong/kong-gateway:2.8.0.0-alpine
    repository: kong/kong-gateway
    tag: 2.8.0.0-alpine
  ingressController:
    enabled: true
    env:
      enable_reverse_sync: true
      kong_admin_token:
        valueFrom:
          secretKeyRef:
            key: password
            name: kong-enterprise-superuser-password
      sync_period: 1m
    image:
      repository: kong/kubernetes-ingress-controller
      tag: 2.2.1
      unifiedRepoTag: kong/kubernetes-ingress-controller:2.2.1
    installCRDs: false
  manager:
    enabled: true
    type: NodePort
  portal:
    enabled: true
    http:
      enabled: true
    type: NodePort
  portalapi:
    enabled: true
    http:
      enabled: true
    type: NodePort
  postgresql:
    enabled: true
    postgresqlDatabase: kong
    postgresqlPassword: kong
    postgresqlUsername: kong
    securityContext:
      fsGroup: ""
      runAsUser: 1000670000
  proxy:
    enabled: true
  secretVolumes:
  - kong-cluster-cert
EOF
rainest commented 2 years ago

:+1: closed per the above.