projectcontour / contour-operator

Experimental repository to explore an operator for deploying Contour
Apache License 2.0
43 stars 34 forks source link

Operator fails to create Contour and Envoy when image has long tag more than 63 characters #363

Closed nak3 closed 3 years ago

nak3 commented 3 years ago

What steps did you take and what happened:

0. Deploy contour-operator
1. Add --contour-image option with long tag image

e.g. image has 19b539fd22cdf70c67df81430e99c9c0c0a301948143f7cfc45471b367bccbf9 tag.

$ kubectl edit deploy -n contour-operator contour-operator
  ... 
      - args:
        - --metrics-addr=127.0.0.1:8080
        - --enable-leader-election
        - --contour-image gcr.io/gcp-compute-engine-223401/contour-ff924afcef2cd0177014d053ee4b167d@sha256:19b539fd22cdf70c67df81430e99c9c0c0a301948143f7cfc45471b367bccbf9
2. Contour and Envoy are not deployed.

contour-operator pod prints the following log.

2021-05-20T03:59:30.491Z    ERROR   controller-runtime.manager.controller.contour_controller    Reconciler error    {"name": "contour-internal", "namespace": "contour-internal", "error": "[failed to ensure deployment for contour contour-internal/contour-internal: failed to create deployment contour-internal/contour: failed to create deployment contour-internal/contour: Deployment.apps \"contour\" is invalid: metadata.labels: Invalid value: \"19b539fd22cdf70c67df81430e99c9c0c0a301948143f7cfc45471b367bccbf9\": must be no more than 63 characters, failed to ensure daemonset for contour contour-internal/contour-internal: failed to create daemonset contour-internal/envoy: DaemonSet.apps \"envoy\" is invalid: metadata.labels: Invalid value: \"19b539fd22cdf70c67df81430e99c9c0c0a301948143f7cfc45471b367bccbf9\": must be no more than 63 characters, failed to sync status for contour contour-internal/contour-internal: [failed to get deployment for contour contour-internal/contour-internal status: Deployment.apps \"contour\" not found, failed to get daemonset for contour contour-internal/contour-internal status: DaemonSet.apps \"envoy\" not found]]", "errorCauses": [{"error": "failed to ensure deployment for contour contour-internal/contour-internal: failed to create deployment contour-internal/contour: failed to create deployment contour-internal/contour: Deployment.apps \"contour\" is invalid: metadata.labels: Invalid value: \"19b539fd22cdf70c67df81430e99c9c0c0a301948143f7cfc45471b367bccbf9\": must be no more than 63 characters"}, {"error": "failed to ensure daemonset for contour contour-internal/contour-internal: failed to create daemonset contour-internal/envoy: DaemonSet.apps \"envoy\" is invalid: metadata.labels: Invalid value: \"19b539fd22cdf70c67df81430e99c9c0c0a301948143f7cfc45471b367bccbf9\": must be no more than 63 characters"}, {"error": "failed to sync status for contour contour-internal/contour-internal: [failed to get deployment for contour contour-internal/contour-internal status: Deployment.apps \"contour\" not found, failed to get daemonset for contour contour-internal/contour-internal status: DaemonSet.apps \"envoy\" not found]"}]}

What did you expect to happen:

Anything else you would like to add:

https://github.com/projectcontour/contour-operator/blob/f7f7e61fda0085853eb9f17571138a9243629e5d/internal/objects/daemonset/daemonset.go#L119 https://github.com/projectcontour/contour-operator/blob/f7f7e61fda0085853eb9f17571138a9243629e5d/internal/objects/deployment/deployment.go#L228

danehans commented 3 years ago

@stevesloka @skriss @sunjayBhatia should we simply trim the tag at 63 characters?

sunjayBhatia commented 3 years ago

@stevesloka @skriss @sunjayBhatia should we simply trim the tag at 63 characters?

I think TagFromImage assumes you are using <imagename>:<tag> rather than the image digest format: <imagename>@sha256:<digest>

it makes sense to me to be able to specify a specific image digest as a valid thing a user would do

sunjayBhatia commented 3 years ago

it isn't valid to use a shortened reference for an image digest or even tag when trying to create a container, and the lost info from trimming it would make it so you cant use that string to reference an actual image

might need to reconsider the pattern for the labels, does it need to be the actual version of the image in use or can it be something else

danehans commented 3 years ago

it isn't valid to use a shortened reference for an image digest or even tag when trying to create a container, and the lost info from trimming it would make it so you cant use that string to reference an actual image

@sunjayBhatia the issue is using the image as the value for the "app.kubernetes.io/version" label key per k8s recommendations.. The spec requires label values to be less than 63 characters.

sunjayBhatia commented 3 years ago

it isn't valid to use a shortened reference for an image digest or even tag when trying to create a container, and the lost info from trimming it would make it so you cant use that string to reference an actual image

@sunjayBhatia the issue is using the image as the value for the "app.kubernetes.io/version" label key per k8s recommendations.. The spec requires label values to be less than 63 characters.

right, the thing is just trimming a long image tag or image digest makes it so that string is no longer usable as a reference to an actual image, so you could maybe guess what the version of Contour is based on that label value but you could still technically be wrong

since this is potential API surface that people/tools can rely on, we should define what the goal of the label is, is it to be able to reference a Contour image explicitly? or can it be something made up that people shouldn't expect to reference an image with?

youngnick commented 3 years ago

In the maintainer meeting today, we agreed that we will remove the use of this label for now, so that it doesn't block creation of anything.

danehans commented 3 years ago

@youngnick I updated https://github.com/projectcontour/contour-operator/pull/369 to remove the "app.kubernetes.io/version" label, PTAL.