kptdev / kpt

Automate Kubernetes Configuration Editing
https://kpt.dev
Apache License 2.0

Flesh out the input data model and patterns #3396

Open bgrant0607 opened 2 years ago

bgrant0607 commented 2 years ago

Topic that needs more work.

We've figured out some aspects and requirements of package / function inputs:

But we don't have a fully fleshed out model or recommended patterns yet.

kpt isn't the first config tool to encounter these issues. We should look at data-oriented, non-package-parameter-based models for inspiration.

Some examples:

Additional thoughts or findings should be posted back here.

cc @justinsb @johnbelamaric @droot @yuwenma

bgrant0607 commented 2 years ago

Related: When gathering inputs, we may need to allow network access: #2450. And probably a way to provide credentials.

bgrant0607 commented 2 years ago

It's also worth mentioning kustomize components:
https://github.com/kubernetes-sigs/kustomize/blob/master/examples/components.md
https://github.com/kubernetes/enhancements/blob/master/keps/sig-cli/1802-kustomize-components/README.md

johnbelamaric commented 2 years ago

> Related: When gathering inputs, we may need to allow network access: #2450. And probably a way to provide credentials.

Do we need to solve this in the CLI case / with kpt functions? That is, could more complex cases like this instead be handled only in the Porch incarnation of CaD, where we can build controllers that interact with other systems in any way we want? For example, an interactive CLI-based session that requires network access can fail more easily. There are also interactions we will never be able to handle that way. Imagine that getting an input requires filing a ticket, which a human then responds to. In the controller case, we can handle this sort of arbitrary time delay without any trouble, but it won't work at all in the interactive kpt fn render case.

bgrant0607 commented 2 years ago

@johnbelamaric In general, I don't expect inputs to be generated during the kpt fn render pipeline, though the pipeline may consume them. Input generation / gathering likely needs to be decoupled from rendering. Interactive forms or prompts are one such example.
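A minimal sketch of that decoupling, assuming the gathered inputs are stored as a plain data resource that the render step only reads. The ConfigMap name `package-inputs`, the `replicas` key, and both helper functions are hypothetical illustrations, not kpt APIs.

```python
import copy


def gather_inputs(answers):
    """Runs outside the render pipeline (for example from a form or prompt).

    Returns a ConfigMap-shaped dict holding the gathered values.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "package-inputs"},  # hypothetical name
        "data": dict(answers),
    }


def render(resources):
    """Offline step: consumes inputs that are already among the resources."""
    inputs = next(
        r for r in resources
        if r["kind"] == "ConfigMap" and r["metadata"]["name"] == "package-inputs"
    )["data"]
    out = copy.deepcopy(resources)  # leave the caller's resources untouched
    for r in out:
        if r["kind"] == "Deployment":
            r["spec"]["replicas"] = int(inputs["replicas"])
    return out


inputs = gather_inputs({"replicas": "3"})
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "app"},
    "spec": {"replicas": 1},
}
rendered = render([inputs, deployment])
```

Because `render` never reaches out to another system, it can be rerun at any time; only `gather_inputs` would need network access, credentials, or a human in the loop.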

Your ticket example is a good one, thanks. If you think of others, post them here.

johnbelamaric commented 2 years ago

A few quick thoughts, all slight variations on "fetch from external system":

johnbelamaric commented 2 years ago

Not all of these are necessarily only "function inputs". They could simply be ways of setting field values. For the IPAM case, I can imagine a couple of different approaches (this probably applies to the others too).

Reading that over, the second approach is probably more maintainable and flexible.

bgrant0607 commented 2 years ago

CMDB is an example use case for dynamic inventory in Ansible, such as via inventory plugins and inventory scripts.

In addition to querying inputs dynamically, adapting input data locations / schemas to expected function input locations / schemas (or, in the case of IaC, to parameters of off-the-shelf packages) appears to be one of the other core / common issues.
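As a rough illustration of the adaptation problem, the sketch below copies values from one schema into the locations another consumer expects, driven by a declarative path mapping. The record shape, the dotted-path convention, and the field names are all assumptions made for the example.

```python
def get_path(obj, path):
    """Read a value from nested dicts using a dotted path like 'a.b.c'."""
    for key in path.split("."):
        obj = obj[key]
    return obj


def set_path(obj, path, value):
    """Write a value into nested dicts, creating intermediate dicts as needed."""
    keys = path.split(".")
    for key in keys[:-1]:
        obj = obj.setdefault(key, {})
    obj[keys[-1]] = value


def adapt(source, mapping):
    """Build a new dict by copying each source path to its target path."""
    target = {}
    for src, dst in mapping.items():
        set_path(target, dst, get_path(source, src))
    return target


# Hypothetical record as an external system might return it.
cmdb_record = {"host": {"fqdn": "db1.example.com", "env": "prod"}}

# Hypothetical mapping from the source schema to the expected input schema.
mapping = {"host.fqdn": "data.hostname", "host.env": "data.environment"}

function_config = adapt(cmdb_record, mapping)
```

Keeping the mapping as data rather than code means it could itself be stored and reviewed alongside the package.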

bgrant0607 commented 2 years ago

Example from slack: https://kubernetes.slack.com/archives/C0155NSPJSZ/p1658760504705309

How to provide information to packages automatically.

bgrant0607 commented 2 years ago

The idea of "decorations" was discussed in the app config issue:
https://github.com/GoogleContainerTools/kpt/issues/3351#issuecomment-1190399974
https://github.com/GoogleContainerTools/kpt/issues/3351#issuecomment-1192052502

kubectl expose and autoscale are examples of this.

Resource creation might be imperative, but this does raise the issue of using information from resources themselves as function inputs.

In the ghost package, we're experimenting with that approach as a way to propagate the host name: https://github.com/GoogleContainerTools/kpt/pull/3403/files

We could also use the approach to read resource requests and set application resource-dependent settings accordingly: https://github.com/GoogleContainerTools/kpt/issues/3210#issuecomment-1194882090
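A sketch of what reading resource requests could look like, assuming a Deployment-shaped dict and a hypothetical rule that sets an application heap size to 75% of the first container's memory request. The parser handles only the `Ki`, `Mi`, and `Gi` suffixes and plain byte counts, which is a small subset of Kubernetes quantity syntax.

```python
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(quantity):
    """Convert a quantity such as '512Mi' to bytes (binary suffixes only)."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # assume a plain byte count


def heap_mb_from_request(deployment, fraction=0.75):
    """Derive a heap size in MiB from the first container's memory request."""
    container = deployment["spec"]["template"]["spec"]["containers"][0]
    request = container["resources"]["requests"]["memory"]
    return int(parse_memory(request) * fraction) // 2**20


deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "app", "resources": {"requests": {"memory": "1Gi"}}},
    ]}}},
}
heap_mb = heap_mb_from_request(deployment)
```

Here the resource request is the source of truth, and the derived setting is recomputed whenever the request changes.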

In order to be understandable there probably needs to be an intuitive source of truth. A potential advantage of the approach is that the source of truth could be well known, as opposed to an input to an arbitrary function. However, if multiple locations disagreed and the source of truth were ambiguous, then the user would need to be asked to resolve the inconsistency, as when providing multiple values in an undiscriminated union.
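The disagreement case can be sketched as follows, assuming each candidate location either holds a value or is unset. The location names and the `resolve` helper are hypothetical.

```python
def resolve(candidates):
    """Pick the single value agreed on by all locations that set one.

    candidates maps a location name to its value, or None when unset.
    Raises ValueError when the set locations disagree, so that the user
    can be asked to choose.
    """
    values = {loc: v for loc, v in candidates.items() if v is not None}
    distinct = set(values.values())
    if len(distinct) > 1:
        raise ValueError(f"conflicting values, user must choose: {values}")
    return distinct.pop() if distinct else None


# Only one location sets the host name, so it is unambiguous.
host = resolve({"ingress": "blog.example.com", "configmap": None})

# Two locations disagree, so resolution fails and reports both.
try:
    resolve({"ingress": "blog.example.com", "configmap": "www.example.com"})
    conflict = None
except ValueError as err:
    conflict = str(err)
```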

This approach could have implications for update strategies.

yuwenma commented 2 years ago

A Tekton example from Slack: https://github.com/marniks7/chaos-catalog

Slack message: https://kubernetes.slack.com/archives/C0155NSPJSZ/p1661457969525029?thread_ts=1661311193.053569&cid=C0155NSPJSZ

Another example, for non-KRM files: https://github.com/GoogleContainerTools/kpt/issues/2350#issuecomment-1228000792

bgrant0607 commented 6 months ago

Example from another domain: https://support.microsoft.com/en-us/office/use-mail-merge-for-bulk-email-letters-labels-and-envelopes-f488ed5b-b849-4c11-9cff-932c49474705