kptdev / kpt

Automate Kubernetes Configuration Editing
https://kpt.dev
Apache License 2.0

use kustomize replacement-style path splitting in `set-image`'s `additionalImageFields` #3580

Open zevisert opened 2 years ago

zevisert commented 2 years ago

Describe your problem

A lot of the motivation behind replacements in kustomize applies equally to set-image's `additionalImageFields.path`.
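For context, kustomize replacements let a target field path filter a sequence by a field value instead of addressing every element. Roughly like this (a sketch; the `image-config`, `my-app`, and `nginx` names are just for illustration):

```yaml
# kustomization.yaml (sketch)
replacements:
  - source:
      kind: ConfigMap
      name: image-config
      fieldPath: data.image
    targets:
      - select:
          kind: Deployment
          name: my-app
        fieldPaths:
          # "[name=nginx]" picks out only the container named nginx
          - spec.template.spec.containers.[name=nginx].image
```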

We have `additionalImageFields` to handle CRDs, but it doesn't support filtering within sequences. Take this made-up resource, for example:

```yaml
apiVersion: example.dev/unreleased
kind: InstanceList
metadata:
  name: my-instances
spec:
  instances:
    - name: instance1
      image: example.dev/image:alpha
    - name: instance2
      image: example.dev/image:alpha
```

Say I want to use kpt to change `example.dev/image:alpha` to `docker.io/library/hello-world:latest`. With this resource, my options in the kpt toolchain are apply-setters or search-replace, but really this should be done with set-image.
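For example, a search-replace workaround is a blanket value substitution, roughly like this (a sketch, from memory of that function's `by-value`/`put-value` keys):

```yaml
# Kptfile pipeline entry (sketch): replaces the value wherever it appears,
# with no way to limit the change to a single list element
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/search-replace:v0.2.0
      configMap:
        by-value: example.dev/image:alpha
        put-value: docker.io/library/hello-world:latest
```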

Using set-image, it seems like it's only possible to process every item in the sequence, because set-image uses `sigs.k8s.io/kustomize/api@v0.11.0/filters/imagetag.Filter`, which only supports syntax like `spec/instances[]/image`. In cases like this I actually need the kustomize replacement-style filters, like `spec.instances.[name=instance1].image`, implemented in `sigs.k8s.io/kustomize/api@v0.11.0/internal/utils#SmarterPathSplitter`.
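Concretely, I'd like to be able to write something like this (a sketch; the `SetImage` functionConfig shape is from memory of the catalog docs, and the filtered path is the part that doesn't work today):

```yaml
apiVersion: fn.kpt.dev/v1alpha1
kind: SetImage
metadata:
  name: set-image-fn-config
image:
  name: example.dev/image:alpha
  newName: docker.io/library/hello-world
  newTag: latest
additionalImageFields:
  - group: example.dev
    version: unreleased
    kind: InstanceList
    # supported today (updates every element): spec/instances[]/image
    # proposed (updates only instance1):
    path: spec.instances.[name=instance1].image
```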

droot commented 2 years ago

One way to make set-image universal would be to change its behavior so that it can act on a nested object at any arbitrary level of a KRM object, as long as that object has a shape containing `name` and `image` fields, and one can filter on the `name` or `image` field to set the new values. Something like:

```yaml
....
....
....
  - name: <...>
    image: <...>
```
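Sketched against the example resource above, a hypothetical function config for that behavior (purely illustrative, not an existing API) could look like:

```yaml
# purely illustrative: match any nested object that has both
# `name` and `image`, optionally filtered by one of those fields
image:
  name: example.dev/image:alpha
  newName: docker.io/library/hello-world
  newTag: latest
match:
  name: instance1   # only update objects whose sibling `name` is instance1
```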

/cc @yuwenma @natasha41575

zevisert commented 2 years ago

Thanks for the triage! I thought about making a feature request for the recursive image approach like that first, but went for what I submitted here instead.

In my example, recursively finding image would change the image at name=instance2, when I only wanted to change the one at instance1. That said, I think that's also better than what set-image does right now.

Looking at your CronJob, and its `.spec.jobTemplate.spec.template.spec.containers.*.image`, which doesn't appear in the defaults...

But, as a user I'd appreciate the control of the replacement-style filters the most.

droot commented 2 years ago

> In my example, recursively finding image would change the image at name=instance2, when I only wanted to change the one at instance1. That said, I think that's also better than what set-image does right now.

Yes, being able to target a specific instance will be needed.

> But, as a user I'd appreciate the control of the replacement-style filters the most.

Good to know.

We are actively collecting feedback on set-image as we plan to work on version 2 of this function.

What does the workflow for updating images look like in your setup? (Feel free to share as much context as possible, we love the details :))

zevisert commented 2 years ago

I'm going a little beyond my actual needs just to illustrate how I'd like / expect `additionalImageFields` to work (even though in my specific use case the recursive suggestion would work fine, I just think it would lead to confusion in cases like the one I brought up here).

My workflow, heh well... 😅 We use kpt (along with set-image) in our gitlab-ci pipelines to deploy our dynamic review environments. Our KRM lives in our repository ready to deploy for developers, then gitlab-ci runs kpt to make changes for our merge (aka pull) request review environments as well as our staging and prod environments. A new image is built in each pipeline, and set-image is used to substitute `:latest` for `:tag@sha256:digest` (tag not needed I know, it's just nice for readability). Our application is partly a k8s controller which can deploy cronjobs, which is why we use `additionalImageFields` -- to tell our application which image it should use when creating new cronjobs, but also in #3573 to configure a `Job` to migrate the existing cronjobs to the new image.

We have some other oddities around our usage of kpt in gitlab-ci that are not part of the standard workflow, but are interesting solutions nonetheless:

#### 1. Our Kptfile is a template.

Since we use kpt to deploy dynamic environments, we use `apply-setters` pretty extensively. But to configure each environment, we need to set some dynamic things like an environment name, namespace, ingress url, etc. To do that, our Kptfile is an `envsubst` template with all function configs as configmaps inline in the Kptfile, and we populate it in two passes to make sure nothing was left undefined. That happens like this:

```shell
# Apply envsubst, but only allow replacement of defined variables.
envsubst "$(env | cut -d= -f1 | sed -e 's/^/$/')" < kubernetes/Kptfile > kubernetes/Kptfile.prep

# Envsubst again, now allowing leftover variables to be set as empty.
envsubst < kubernetes/Kptfile.prep > kubernetes/Kptfile.check

# If any variables changed in the second pass, they were undefined, and we should fail the job.
# If we fail the job, still rename the rendered Kptfile to upload as an artifact for inspection / reproducing.
if cmp --silent -- kubernetes/Kptfile.prep kubernetes/Kptfile.check; then
  rm kubernetes/Kptfile.check
  mv kubernetes/Kptfile.prep kubernetes/Kptfile
else
  echo "envsubst operation performed with some undefined variables. See diff:"
  diff kubernetes/Kptfile.prep kubernetes/Kptfile.check
  rm kubernetes/Kptfile.check
  mv kubernetes/Kptfile.prep kubernetes/Kptfile
  exit 1
fi
```

We go through all of this work since we get synchronous apply and reconcile, as well as resource pruning. We only get 1/3 of those with `kubectl apply`. Also, as noted in the snippet above, we save the hydrated Kptfile as a pipeline artifact so that anyone can reproduce later; to that end we also use the dynamic environment name to set the inventory id and name in a per-environment-reproducible way.
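To illustrate, an inline function config in that Kptfile template looks roughly like this (the variable names here are illustrative gitlab-ci ones, not our exact setup):

```yaml
# Kptfile excerpt, before envsubst (illustrative variable names)
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/apply-setters:v0.2.0
      configMap:
        environment: ${CI_ENVIRONMENT_NAME}
        namespace: review-${CI_MERGE_REQUEST_IID}
        ingress-host: ${CI_ENVIRONMENT_SLUG}.example.com
```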
#### 2. We really only use go functions and `--allow-exec`

Since we (are probably one of the few teams who) run kpt in our continuous integration pipelines, which are themselves containers running on a kube cluster, we wanted to figure out if we could run kpt without needing docker-in-docker and therefore privileged mode. This is pretty hacky, but we had two options: a) somehow run each kpt function as a separate CI job, passing the intermediate KRM along between them, or b) bundle all the function binaries into one image and use `--allow-exec` to run them without invoking docker. We took the second approach, since we didn't want to pay the cost of provisioning 10-ish CI containers to do what is essentially one command.

So, one of the stages of our CI tooling Dockerfile looks roughly like this:

```dockerfile
FROM setup as kpt

ARG GET_KPT
WORKDIR /opt/tooling/kpt/bin
ADD ${GET_KPT} kpt
RUN chmod +x kpt

# Curated Functions
COPY --from=gcr.io/kpt-fn/apply-replacements:v0.1.1 /usr/local/bin/function /opt/tooling/kpt/bin/apply-replacements:v0.1.1
COPY --from=gcr.io/kpt-fn/apply-setters:v0.2.0 /usr/local/bin/function /opt/tooling/kpt/bin/apply-setters:v0.2.0
COPY --from=gcr.io/kpt-fn/create-setters:v0.1.0 /usr/local/bin/function /opt/tooling/kpt/bin/create-setters:v0.1.0
COPY --from=gcr.io/kpt-fn/ensure-name-substring:v0.2.0 /usr/local/bin/function /opt/tooling/kpt/bin/ensure-name-substring:v0.2.0
COPY --from=gcr.io/kpt-fn/fix:v0.2.1 /usr/local/bin/function /opt/tooling/kpt/bin/fix:v0.2.1
COPY --from=gcr.io/kpt-fn/gatekeeper:v0.2.1 /usr/local/bin/validate /opt/tooling/kpt/bin/gatekeeper:v0.2.1
COPY --from=gcr.io/kpt-fn/list-setters:v0.1.0 /usr/local/bin/function /opt/tooling/kpt/bin/list-setters:v0.1.0
COPY --from=gcr.io/kpt-fn/render-helm-chart:v0.2.0 /usr/local/bin/function /opt/tooling/kpt/bin/render-helm-chart:v0.2.0
COPY --from=gcr.io/kpt-fn/search-replace:v0.2.0 /usr/local/bin/function /opt/tooling/kpt/bin/search-replace:v0.2.0
COPY --from=gcr.io/kpt-fn/set-annotations:v0.1.4 /usr/local/bin/function /opt/tooling/kpt/bin/set-annotations:v0.1.4
COPY --from=gcr.io/kpt-fn/set-enforcement-action:v0.1.0 /usr/local/bin/function /opt/tooling/kpt/bin/set-enforcement-action:v0.1.0
COPY --from=gcr.io/kpt-fn/set-image:v0.1.1 /usr/local/bin/function /opt/tooling/kpt/bin/set-image:v0.1.1
COPY --from=gcr.io/kpt-fn/set-labels:v0.1.5 /usr/local/bin/function /opt/tooling/kpt/bin/set-labels:v0.1.5
COPY --from=gcr.io/kpt-fn/set-namespace:v0.4.1 /usr/local/bin/function /opt/tooling/kpt/bin/set-namespace:v0.4.1
COPY --from=gcr.io/kpt-fn/starlark:v0.4.3 /usr/local/bin/function /opt/tooling/kpt/bin/starlark:v0.4.3
COPY --from=gcr.io/kpt-fn/upsert-resource:v0.2.0 /usr/local/bin/function /opt/tooling/kpt/bin/upsert-resource:v0.2.0

# GCP Functions
COPY --from=gcr.io/kpt-fn/enable-gcp-services:v0.1.0 /usr/local/bin/function /opt/tooling/kpt/bin/enable-gcp-services:v0.1.0
COPY --from=gcr.io/kpt-fn/export-terraform:v0.1.0 /usr/local/bin/function /opt/tooling/kpt/bin/export-terraform:v0.1.0
COPY --from=gcr.io/kpt-fn/remove-local-config-resources:v0.1.0 /usr/local/bin/function /opt/tooling/kpt/bin/remove-local-config-resources:v0.1.0
COPY --from=gcr.io/kpt-fn/set-project-id:v0.2.0 /usr/local/bin/function /opt/tooling/kpt/bin/set-project-id:v0.2.0

# TODO: Typescript/Node based functions
# - These are tough to import here, since we need all the node_modules, etc.
# - Deno can probably solve this nicely now with npm package support and `deno compile`
#   generate_folders
#   kubeval
RUN echo '#!/usr/bin/env bash\necho "$(basename $0) is not implemented in this environment" > /dev/stderr; exit 1;' \
    > /opt/tooling/kpt/bin/not-implemented \
    && chmod +x /opt/tooling/kpt/bin/not-implemented \
    && { for NA in 'kubeval:v0.3.0' 'generate-folders:v0.1.1'; do ln -s /opt/tooling/kpt/bin/not-implemented "/opt/tooling/kpt/bin/$NA"; done }
```

The workdir from this stage is later copied into the image we actually run our CI pipelines in. We keep this as a bit of an implementation detail though, and we write and version control our Kptfile "template" using `pipeline.*.image` instead of `pipeline.*.exec` - that just gets changed with `sed`.
So, in CI we actually invoke kpt roughly like this:

```shell
# Here in CI: use exec mode for pipeline stages instead of the function images, to circumvent docker-in-docker
sed --regexp-extended --expression 's|image: (["'\'']?)gcr.io/kpt-fn/(.+?)\1|exec: \1\2\1|g' --in-place kubernetes/Kptfile

# Our "exec mode" CI image doesn't support node-based functions, which for us includes kubeval
yq --inplace eval 'del(.pipeline.validators[] | select(.exec | contains("kubeval")))' kubernetes/Kptfile

# Run the pipeline in the Kptfile in place
kpt fn render kubernetes --truncate-output=false --allow-exec
```

No docker-in-docker required now, but we have limited ourselves to specific versions of functions from the function catalog only.

---

Big write-up, I know; I've been meaning to blog about this, so I guess thanks for being a guinea pig for that :) If you want to know more about any of this, feel free to sync up with me [out-of-band](https://calendarhero.to/meetingzev)
droot commented 2 years ago

Thank you so much @zevisert for the detailed post.

  1. I didn't know that we can use `:tag@sha256:digest`, that's pretty neat. I also have a use case where I want to preserve the tag while replacing it with the digest, so thank you for this detail.
  2. Being able to surgically target a specific container image to be replaced in the package manifests is definitely needed. So thanks for the detailed explanation about your setup.
  3. Yes, running functions in nested containers is a big pain. We have done a ton of exploration in this space. The most recent, using wasm as the function runtime, has been very promising for avoiding the docker dependency. We recently got alpha support merged in kpt, and some functions now support a wasm target as well. Don't know if you noticed, but we have also been working on a new component called porch (package orchestrator), which has a function-runner component that runs in a kubernetes cluster and can also be leveraged to run functions in-cluster.

Thanks for the offer of a call. I am definitely interested and would like some feedback on some of the things that are cooking, so I will reach out when I have a few things more concrete.

/cc @yuwenma some interesting insights related to set-image and functions in general.

zevisert commented 2 years ago

No worries!

> I didn't know that we can use `:tag@sha256:digest`, that's pretty neat. I also have a use case where I want to preserve the tag while replacing it with the digest, so thank you for this detail.

FWIW: set-image rejects the configMap with the red (`-`) lines, but accepts the one with the green (`+`) lines:

```diff
  pipeline:
    mutators:
      - image: gcr.io/kpt-fn/set-image:v0.1.1
        configMap:
          name: hello-world:latest
-         newName: hello-world
-         newTag: linux
+         newName: hello-world:linux
          digest: sha256:f54a58bc1aac5ea1a25d796ae155dc228b3f0e11d046ae276b39c4bf2f13d8c4
```

> The most recent, using wasm as the function runtime, has been very promising for avoiding the docker dependency. We recently got alpha support merged in kpt, and some functions now support a wasm target as well.

That sounds like a great use-case. Let me know where I can follow along.

> Don't know if you noticed, but we have also been working on a new component called porch (package orchestrator), which has a function-runner component that runs in a kubernetes cluster and can also be leveraged to run functions in-cluster.

I noticed porch, but we haven't tried it, or backstage either. I'm curious about it and hope to find some time to fiddle with it soon-ish.