Open · renaudguerin opened this issue 1 year ago
This issue is currently awaiting triage.

SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted label. The triage/accepted label can be added by org members by writing /triage accepted in a comment.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
Very sad to see basically our exact use-case for such an option go unheard.
Did you come up with a solution or workaround that you were happy with? This is holding up our upgrade of ArgoCD because they've moved on to kustomize >5.0.
Unfortunately, no. We are slowly coming to the conclusion that Kustomize maintainers seem more interested in making a work of art and paragon of software purity, rather than a tool that is powerful enough to address moderately complex real world scenarios on its own.
Version after version, they unabashedly plug loopholes or "unintended behaviors" that users relied on for some much needed flexibility, and provide no credible alternative.
I've just read this issue again: I can't believe I had to jump through so many hoops in the first place (getting creative with replacements, components, a ConfigMapGenerator that creates an ephemeral resource and then a patch that deletes it), all for the modest goal of replacing a friggin' GCP Project ID across manifests that are otherwise identical between overlays. And... they managed to break even that in 5.0.
Look, I know complexity often comes from stubbornly using a tool against its design philosophy. I'd love to be told how I'm "holding it wrong" and how to fulfill the extremely common real-world need described in this issue (patch a value across many resources wherever it is found, without having to explicitly list each location) the "Kustomize way", without extra tooling.
Because what I'm not going to do is write a custom ArgoCD Config Management plugin to add a "non-structured search & replace" step before Kustomize (suddenly our manifests are no longer valid YAML), or a Kustomize Go plugin that I'll need to maintain and distribute across our systems, just so I can end up with a friggin' different GCP Project ID per environment in a DRY manner.
Such basic stuff needs to be native if Kustomize is to be used as a self-sufficient solution in any kind of non-trivial GitOps setup.
I'm genuinely open to the idea that I'm missing something: in search of answers I watched one of @KnVerey's presentations. I came to the conclusion that Kustomize is suitable for either trivial setups, or very large ones like the one she describes at Shopify, where it's one composable part of a pipeline, together with automation that generates the actual Kustomize manifests from higher-level app definitions.
But the use case of relying solely on Kustomize with developer-maintained DRY resource manifests in a moderately complex GitOps setup is not well catered for, and seems to be a blind spot for the maintainers. I'd rather not go back to Helm, but it seems to be the pragmatic choice in this situation. Any other suggestions very welcome...
@renaudguerin Yeah, we're on the same page here.
Kustomize does have a massive gap in how last-mile cluster configuration is supposed to be achieved. Replacements were the closest we came to filling it, and even then they had quirks that made them difficult to work with.
Right now my workaround is to continue using Kustomize 4.5.7 for applications that require replacements in ArgoCD.
@renaudguerin I've done some tinkering in this area over the past couple of weeks. I have now opted to use ArgoCD Vault Plugin as a pseudo-templating engine for last-mile configuration.
It allows you to put well-known placeholder values in your manifests, which the plugin resolves by fetching the values from a secret store. In my case I'm using the Kubernetes Secret store alongside External Secrets to fetch from AWS SSM.
The best part: you don't even need to be using ArgoCD; you can pipe any kind of input to it and receive output that is ready to apply.
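As an illustration, a minimal sketch of the placeholder style argocd-vault-plugin resolves (the annotation path, key names, and project value here are made up, and the exact path format depends on the configured secret backend):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-env
  annotations:
    # argocd-vault-plugin looks up <key> placeholders at this path;
    # the path format depends on the secret backend in use
    avp.kubernetes.io/path: "my-namespace/gcp-environment"
data:
  GCP_PROJECT_ID: <gcp-project-id>
  WORKLOAD_SA: my-service@<gcp-project-id>.iam.gserviceaccount.com
```

Outside of ArgoCD, piping rendered manifests through the plugin (for example `kustomize build . | argocd-vault-plugin generate -`) produces output with the placeholders filled in.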
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
/remove-lifecycle rotten
@renaudguerin: Reopened this issue.
@renaudguerin I've been able to achieve this using https://kubectl.docs.kubernetes.io/guides/extending_kustomize/exec_krm_functions/. It works even better, as one can include custom logic for rendering specs.
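For context, a rough sketch of how an exec KRM function gets wired into a kustomization (the function kind, config fields, and script path are illustrative; the executable must read a KRM ResourceList from stdin and write the transformed ResourceList to stdout, and the build needs `kustomize build --enable-alpha-plugins --enable-exec`):

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
transformers:
- replace-project-id.yaml
```

```yaml
# replace-project-id.yaml: function config passed to the exec plugin
apiVersion: example.com/v1
kind: ReplaceProjectId
metadata:
  name: replace-project-id
  annotations:
    config.kubernetes.io/function: |
      exec:
        path: ./replace-project-id.sh
projectId: my-prod-project
```

The referenced executable receives every resource in the build, so arbitrary logic (including regex-style rewrites) is possible, at the cost of maintaining and distributing that executable.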
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
Eschewed features
What would you like to have added?
Add an `ignoreMissing` flag in the ReplacementTransformer's options field. It would allow users to opt for the pre-5.0.0 / #4789 behavior, where missing fields in a target resource were ignored instead of resulting in errors.
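A minimal sketch of how the flag could look on a target's options (the flag name is the one proposed here; the source ConfigMap, resource kind, and field path are illustrative):

```yaml
replacements:
- source:
    kind: ConfigMap
    name: gcp-environment
    fieldPath: data.GCP_PROJECT_ID
  targets:
  - select:
      kind: IAMServiceAccount
    fieldPaths:
    - metadata.annotations.[cnrm.cloud.google.com/project-id]
    options:
      # Proposed: skip resources where this field path does not exist,
      # instead of failing the whole build (the pre-5.0.0 behavior).
      ignoreMissing: true
```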
Why is this needed?
This feature is needed to address a significant change in Kustomize 5.0.0, where replacements now fail if a targeted field is missing from a resource and `options.create` isn't set.
The previous behavior was to ignore invalid targets, which allowed users to package commonly used replacements (such as GCP Project ID replacements) as widely reusable components that could be imported from various service directories with similar-but-not-quite-identical resources, modifying only the relevant fields and skipping missing ones without errors. In the absence of parameterized components, this allowed for much-needed flexibility in broadly applying replacements to 90%-similar resources.
The new behavior, where replacements fail if a target field is missing, significantly disrupts workflows that previously depended on silent skipping of non-matching replacements. This is especially problematic in scenarios like ours, where environment-specific GCP project IDs appear in various formats across Kubernetes manifests, necessitating a universal replacement approach.
Detailed Use Case
In our Kustomize codebase, we deal with environment-specific GCP project IDs in otherwise identical resources. These IDs can appear in multiple formats: as a standalone string, as `projects/GCP_PROJECT_ID`, or as part of a service account ID (`service-account@GCP_PROJECT_ID.iam.gserviceaccount.com`), etc. A shared component performing generic replacements of these IDs in our Config Connector resource types is crucial for reducing repetition in our repository.

Here is an example generic replacement I intended to use, which lists all possible targets for `GCP_PROJECT_ID`. From 5.0.0 it will throw an error if any of the fields in the selected targets are missing.
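A representative sketch of such a replacement, assuming a `gcp-environment` ConfigMap as the source and a couple of Config Connector kinds (the kinds, names, and field paths are illustrative):

```yaml
# components/common/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

replacements:
# Fields holding the bare project ID
- source:
    kind: ConfigMap
    name: gcp-environment           # provided by an environment-specific wrapper component
    fieldPath: data.GCP_PROJECT_ID
  targets:
  - select:
      group: iam.cnrm.cloud.google.com
      kind: IAMServiceAccount
    fieldPaths:
    - metadata.annotations.[cnrm.cloud.google.com/project-id]
  - select:
      group: iam.cnrm.cloud.google.com
      kind: IAMPolicyMember
    fieldPaths:
    - metadata.annotations.[cnrm.cloud.google.com/project-id]
# Fields of the form "projects/GCP_PROJECT_ID"
- source:
    kind: ConfigMap
    name: gcp-environment
    fieldPath: data.GCP_PROJECT_ID
  targets:
  - select:
      group: iam.cnrm.cloud.google.com
      kind: IAMPolicyMember
    fieldPaths:
    - spec.resourceRef.external     # e.g. "projects/<project-id>"
    options:
      delimiter: "/"
      index: 1
```

Selecting by group and kind rather than by name is what keeps the component generic across services.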
Here's another generic replacement, covering a different format for the GCP project ID (`GCP_PROJECT_ID.iam.gserviceaccount.com`). (As a side note: breaking this down into several replacements is only necessary because `options.delimiter` is quite limited and doesn't support regexes. And of course, this would be a trivial task with unstructured edits ;) )
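A sketch of that second replacement, assuming the source ConfigMap also carries the value pre-formatted as `<project-id>.iam.gserviceaccount.com` under an illustrative key:

```yaml
# Fields of the form "<name>@<project-id>.iam.gserviceaccount.com"
- source:
    kind: ConfigMap
    name: gcp-environment
    fieldPath: data.GCP_PROJECT_IAM_DOMAIN   # e.g. "my-prod-project.iam.gserviceaccount.com"
  targets:
  - select:
      group: iam.cnrm.cloud.google.com
      kind: IAMPolicyMember
    fieldPaths:
    - spec.member                            # e.g. "serviceAccount:my-sa@my-prod-project.iam.gserviceaccount.com"
    options:
      delimiter: "@"
      index: 1
```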
These generic replacement "recipes" are part of a `common` component, used by our service definitions through environment-specific wrapper components that add a ConfigMap with the relevant source values for that environment, as referenced in the replacements. This environment-specific component is then imported in the corresponding service overlays. Sketches of both pieces follow:
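Illustrative sketches of that wiring, with hypothetical paths and values (the name suffix hash is disabled here so the replacements can refer to the ConfigMap by a stable name):

```yaml
# components/environments/production/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

# Source values referenced by the shared replacements
configMapGenerator:
- name: gcp-environment
  literals:
  - GCP_PROJECT_ID=my-prod-project
  - GCP_PROJECT_IAM_DOMAIN=my-prod-project.iam.gserviceaccount.com
  options:
    disableNameSuffixHash: true

# Pull in the shared, generic replacement recipes
components:
- ../../common
```

```yaml
# services/my-service/overlays/production/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- ../../base

components:
- ../../../../components/environments/production
```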
Can you accomplish the motivating task without this feature, and if so, how?
Only very inelegantly or with much repetition, AFAICT:
- I could keep the generic replacements component idea but target the resources more precisely using `select`/`reject` (realistically, it would have to be done by name, but regex support is broken by the same change, making that even harder). Also, adding specifics about the callers' resources in a component is an ugly case of leaky abstraction (see the sketch after this list).
- I could stick to a generic replacement component that only applies the absolute lowest-common-denominator list of replacements that will work with every service that uses it. But that means more service-specific replacements have to be implemented, with a repetition of the source ConfigMap (the GCP Project ID in its different string incarnations) in the service overlays themselves, which is what we were trying to get away from in the first place. Also, the surface of lowest common denominators will shrink dramatically with each service we add, requiring replacements previously used by other services to move into the services themselves. Sounds hellish.
- I could give up on the idea of a generic "replacements" component and apply replacements in each service. 90% of them would be repeated.
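To illustrate the first alternative, a sketch of what select-by-name would force into the shared component (resource names are hypothetical; each consuming service's resources would have to be enumerated here):

```yaml
# components/common/kustomization.yaml (sketch of the select-by-name variant)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

replacements:
- source:
    kind: ConfigMap
    name: gcp-environment
    fieldPath: data.GCP_PROJECT_ID
  targets:
  # Service-specific names leak into the supposedly generic component
  - select:
      kind: IAMPolicyMember
      name: my-service-workload-identity
    fieldPaths:
    - metadata.annotations.[cnrm.cloud.google.com/project-id]
  - select:
      kind: IAMPolicyMember
      name: other-service-pubsub-publisher
    fieldPaths:
    - metadata.annotations.[cnrm.cloud.google.com/project-id]
```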
What other solutions have you considered?
- Sticking to Kustomize <5.0, but our codebase requires this 5.1 behavior anyway.
- Maintaining our own version of Kustomize with an `ignoreMissing` flag added in.
- Pushing for the introduction of parameterized components, or some other similar flexibility-enhancing Kustomize feature.
- Living with duplication in our codebase, as a result of not being able to implement flexible enough reusable Kustomize components.
- Giving up on a 100% Kustomize-native solution and introducing something like `envsubst` as a pre-processing step.
- Moving to Helm.
We haven't made a firm decision yet, but none of these options are appealing and I'd really like to give native Kustomize features a chance before we give up.
Anything else we should know?
My comment on #4789 explains the above perhaps more succinctly (apologies for the slightly passive-aggressive tone: it was written just as I discovered this change, which makes our solution fall apart).
Both the consequences of this change and a flag that would restore the previous behavior have already been discussed by several other users in the comments of #4789 and #5128.
Feature ownership