kubernetes-sigs / kustomize

Customization of kubernetes YAML configurations

Add 'ignoreMissing' Flag to replacement options to allow opting for pre-5.0.0 behavior #5440

renaudguerin opened this issue 1 year ago (status: Open)

renaudguerin commented 1 year ago


What would you like to have added?

Add an ignoreMissing flag in the ReplacementTransformer's options field. It would allow users to opt for the pre-5.0.0 (#4789) behavior, where missing fields in a target resource were ignored instead of resulting in errors.
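
To make the proposal concrete, here is a sketch of what a replacement using the new option might look like. The ignoreMissing field is the proposed addition and does not exist in current Kustomize; everything else is standard replacement syntax:

# Hypothetical: ignoreMissing is the proposed flag, not an existing option
source:
  kind: ConfigMap
  name: replacements
  fieldPath: data.GCP_PROJECT_ID
targets:
  - select:
      kind: PubSubSchema
    fieldPaths:
      - spec.projectRef.external
    options:
      ignoreMissing: true  # proposed: silently skip this target if the field is absent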

Why is this needed?

This feature is needed to address a significant change in Kustomize 5.0.0 where replacements now fail if a targeted field is missing from a resource and options.create isn't set.

The previous behavior was to ignore invalid targets, which allowed users to package commonly used replacements (such as GCP project ID replacements) as widely reusable components that could be imported from various service directories with similar-but-not-quite-identical resources, modifying only the relevant fields and silently skipping missing ones. In the absence of parameterized components, this allowed for much-needed flexibility in broadly applying replacements to resources that are 90% similar.

The new behavior, where replacements fail if a target field is missing, significantly disrupts workflows that previously depended on silent skipping of non-matching replacements. This is especially problematic in scenarios like ours, where environment-specific GCP project IDs appear in various formats across Kubernetes manifests, necessitating a universal replacement approach.

Detailed Use Case

In our Kustomize codebase, we deal with environment-specific GCP project IDs in otherwise identical resources. These IDs can appear in multiple formats: as a standalone string, as projects/GCP_PROJECT_ID, or as part of a service account ID (service-account@GCP_PROJECT_ID.iam.gserviceaccount.com). A shared component performing generic replacements of these IDs in our Config Connector resource types is crucial for reducing repetition in our repository.

Here is an example of a generic replacement I intended to use, which lists all possible targets for GCP_PROJECT_ID. From 5.0.0 onwards, it throws an error if any of the fields in the selected targets are missing.

# _components/replacements/common/gcp_project_id.yaml
# This replacement fills in the GCP project ID (e.g. development-1234567) in places where it can be easily delimited.

source:
  kind: ConfigMap
  name: replacements
  fieldPath: data.GCP_PROJECT_ID
targets:
  # IAMPolicyMember external resourceRefs (projects/PROJECT_ID)
  - select:
      kind: IAMPolicyMember
    fieldPaths:
      - spec.resourceRef.external
    options:
      delimiter: "/"
      index: 1
  # PubSubSchema external projectRef (PROJECT_ID)
  - select:
      kind: PubSubSchema
    fieldPaths:
      - spec.projectRef.external

Here's another generic replacement, covering a different format for the GCP project ID (GCP_PROJECT_ID.iam.gserviceaccount.com). (As a side note: breaking this down into several replacements is only necessary because options.delimiter is quite limited and doesn't support regexes. And of course, this would be a trivial task with unstructured edits ;) )

# _components/replacements/common/gcp_sa_domain.yaml
# This replacement fills in the GCP service account domain (e.g. development-1234567.iam.gserviceaccount.com).
# In most cases it only replaces the part after the @ sign with the GCP_SA_DOMAIN value from the ConfigMap, and keeps the service account name intact.
# The GCP_PROJECT_ID replacement is too generic for this purpose, because there can be only one delimiter and index per replacement target.
source:
  kind: ConfigMap
  name: replacements
  fieldPath: data.GCP_SA_DOMAIN
targets:
  # iam.gke.io/gcp-service-account annotations (service-account@domain)
  - select:
      kind: ServiceAccount
    fieldPaths:
      - metadata.annotations.[iam.gke.io/gcp-service-account]
    options:
      delimiter: "@"
      index: 1
  # IAMPolicyMember member field (service-account@domain)
  - select:
      kind: IAMPolicyMember
    fieldPaths:
      - spec.member
    options:
      delimiter: "@"
      index: 1
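
For context, these replacement files would be wired into the common component via the replacements field of its kustomization. That file isn't shown in this issue, but it would presumably look something like this (layout assumed):

# _components/replacements/common/kustomization.yaml (assumed, not shown above)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

replacements:
  - path: gcp_project_id.yaml
  - path: gcp_sa_domain.yaml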

These generic replacement "recipes" are part of a common component, used by our service definitions through environment-specific wrapper components that add a ConfigMap with the relevant source values for that environment, as referenced in the replacements:

# _components/replacements/development/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

configMapGenerator:
  - name: replacements
    literals:
      - "GCP_PROJECT_ID=development-1234567"
      - "GCP_SA_DOMAIN=development-1234567.iam.gserviceaccount.com"

components:
  - ../common

# Clean up the ConfigMap after applying replacements
patches:
  - patch: |-
      $patch: delete
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: replacements

This environment-specific component is then imported into the corresponding service overlays, like this:

# services/myservice/overlays/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base

components:
  - ../../../_components/replacements/development

# [More definitions here]

Can you accomplish the motivating task without this feature, and if so, how?

Only very inelegantly or with much repetition, AFAICT.
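
For illustration (my sketch, not from the original report): the main workaround is to duplicate each target and narrow its select clause, e.g. by resource name, so that every selected resource is guaranteed to carry the field. This has to be repeated per service, which is exactly the repetition the shared component was meant to avoid:

# Hypothetical repetitive workaround: select each resource by name so that
# no selected target can be missing the field. The name is illustrative and
# must be duplicated for every matching resource in every service.
source:
  kind: ConfigMap
  name: replacements
  fieldPath: data.GCP_PROJECT_ID
targets:
  - select:
      kind: IAMPolicyMember
      name: myservice-policy-member
    fieldPaths:
      - spec.resourceRef.external
    options:
      delimiter: "/"
      index: 1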

What other solutions have you considered?

We haven't made a firm decision yet, but none of the options we've considered are appealing, and I'd really like to give native Kustomize features a chance before we give up.

Anything else we should know?

My comment on #4789 explains the above perhaps more succinctly (apologies for the slightly passive-aggressive tone: it was written just as I had discovered this change, which makes our solution fall apart).

Both the consequences of this change and a flag that would allow restoring the previous behavior have already been discussed by several other users in the comments of #4789 and #5128.


k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
k8s-triage-robot commented 9 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

bagel-dawg commented 9 months ago

Very sad to see basically our exact use-case for such an option go unheard.

Did you come up with a solution or workaround that you were happy with? This is holding up our upgrade of ArgoCD because they've moved on to kustomize >5.0.

renaudguerin commented 9 months ago

> Did you come up with a solution or workaround that you were happy with? This is holding up our upgrade of ArgoCD because they've moved on to kustomize >5.0.

Unfortunately, no. We are slowly coming to the conclusion that the Kustomize maintainers seem more interested in making a work of art and a paragon of software purity than a tool powerful enough to address moderately complex real-world scenarios on its own.

Version after version, they unabashedly plug loopholes or "unintended behaviors" that users relied on for some much-needed flexibility, and provide no credible alternative.

I've just read this issue again: I can't believe I had to jump through so many hoops in the first place (getting creative with replacements, components, a configMapGenerator that creates an ephemeral resource, then a patch that deletes it), all for the modest goal of replacing a friggin' GCP project ID across manifests that are otherwise identical between overlays. And... they managed to break even that in 5.0.

Look, I know complexity often comes from stubbornly using a tool against its design philosophy. I'd love to be told how I'm "holding it wrong" and how to fulfill the extremely common real-world need described in this issue (patching a value across many resources wherever it is found, without having to explicitly list each location) the "Kustomize way", without extra tooling.

Because what I'm not going to do is write a custom ArgoCD Config Management plugin to add a "non-structured search & replace" step before Kustomize (suddenly our manifests are no longer valid YAML), or a Kustomize Go plugin that I'll need to maintain and distribute across our systems, just so I can end up with a friggin' different GCP Project ID per environment in a DRY manner.

Such basic stuff needs to be native if Kustomize is to be used as a self-sufficient solution in any kind of non-trivial GitOps setup.

I'm genuinely open to the idea that I'm missing something: in search of answers, I watched one of @KnVerey's presentations. I came to the conclusion that Kustomize is suitable for either trivial setups or very large ones like the one she describes at Shopify, where it's one composable part of a pipeline, together with automation generating the actual Kustomize manifests from higher-level app definitions.

But the use case of relying solely on Kustomize with developer-maintained DRY resource manifests in a moderately complex GitOps setup is not well catered for, and seems to be a blind spot for the maintainers. I'd rather not go back to Helm, but it seems to be the pragmatic choice in this situation. Any other suggestions very welcome...

bagel-dawg commented 9 months ago

@renaudguerin Yeah, we're on the same page here.

Kustomize does have a massive gap in how last-mile cluster configuration is supposed to be achieved. Replacements were the closest we came to it, and even then they had quirks that made them difficult to work with.

Right now my workaround is to continue using Kustomize 4.5.7 for applications that require replacements in ArgoCD.
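
For anyone taking the same route: ArgoCD can register additional Kustomize binaries and select one per Application. A minimal sketch, assuming a kustomize 4.5.7 binary has been added to the repo-server image at the path shown (paths and names are illustrative):

# argocd-cm: register the extra binary (the binary itself must exist
# in the repo-server container at this path)
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  kustomize.path.v4.5.7: /custom-tools/kustomize_4.5.7
---
# Application: pin the version for apps that rely on the old replacement behavior
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myservice  # illustrative
  namespace: argocd
spec:
  # project, source.repoURL/path, destination, etc. omitted
  source:
    kustomize:
      version: v4.5.7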

bagel-dawg commented 8 months ago

@renaudguerin I've done some tinkering in this area over the past couple of weeks. I have now opted to use the ArgoCD Vault Plugin as a pseudo-templating engine for last-mile configuration.

It allows you to put well-known placeholder values in your manifests, which the plugin uses to fetch the actual values from a secret store. In my case I'm using the Kubernetes Secret store alongside External Secrets to fetch from AWS SSM.

The best part: you don't even need to be using ArgoCD. You can pipe any kind of input to it and receive output that is ready to apply.
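
Roughly, the pattern looks like this (my illustration, not from the comment; placeholder and path syntax vary by backend, so check the argocd-vault-plugin docs for your secret store):

# A manifest with an AVP placeholder: the avp.kubernetes.io/path annotation
# points at the secret to read, and <gcp-project-id> is replaced with the
# value of that key. Names and the path format are illustrative.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myservice
  annotations:
    avp.kubernetes.io/path: "argocd/replacements"
    iam.gke.io/gcp-service-account: "myservice@<gcp-project-id>.iam.gserviceaccount.com"

Outside of ArgoCD, something like kustomize build . | argocd-vault-plugin generate - then emits the substituted, ready-to-apply manifests.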

k8s-triage-robot commented 7 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 6 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/kustomize/issues/5440#issuecomment-2111642201):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
renaudguerin commented 6 months ago

/reopen

renaudguerin commented 6 months ago

/remove-lifecycle rotten

k8s-ci-robot commented 6 months ago

@renaudguerin: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/kustomize/issues/5440#issuecomment-2112930937):

> /reopen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
boscard commented 4 months ago

@renaudguerin I've been able to achieve this using exec KRM functions (https://kubectl.docs.kubernetes.io/guides/extending_kustomize/exec_krm_functions/). It works even better, as one can include custom logic for rendering specs.
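
For reference, the wiring for an exec KRM function looks roughly like this; the function config kind and the script name are my assumptions, while the annotation and flags come from the linked docs. The build then needs kustomize build --enable-alpha-plugins --enable-exec:

# kustomization.yaml: run a local executable as a transformer
transformers:
  - replace-fn.yaml

# replace-fn.yaml: the annotation points Kustomize at the executable, which
# receives a ResourceList on stdin and must write the modified list to stdout.
apiVersion: example.com/v1  # hypothetical group/version for the function config
kind: ProjectIdReplacer     # hypothetical kind
metadata:
  name: replace-project-id
  annotations:
    config.kubernetes.io/function: |
      exec:
        path: ./replace-project-id.sh  # hypothetical script doing the search & replace
spec:
  projectId: development-1234567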

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 days ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten