Closed baijum closed 3 years ago
@baijum have you looked at https://pkg.go.dev/k8s.io/client-go/util/jsonpath#JSONPath.FindResults
> @baijum have you looked at https://pkg.go.dev/k8s.io/client-go/util/jsonpath#JSONPath.FindResults
That function returns a slice of results. Should the implementation set the value for each of the results?
> Should the implementation set value for each of the results?
JSONPath can return multiple matching nodes for a query. The `containers`, `envs`, and `volumeMounts` keys all accept multiple JSONPaths, so I'd lean towards treating each matching node as a target to be injected. This behavior is desirable because it allows us to bind into arrays that contain these structures (imagine something like containers that are not quite `corev1.Container`).
This raises an interesting question as to what to do when there is an asymmetry between `env` and `volumeMounts`. The path of the volume mount needs to be based on the value of the `SERVICE_BINDING_ROOT` env var. We have this issue regardless of whether the JSONPath returns one value or multiple.
IMHO, JSONPath is not the right tool for specifying locations in a data structure: it does a great job of filtering values, but it does not provide a link between a matched value and its location in the data structure. Hence, updating the value and reflecting that update on the original structure is not something JSONPath speaks about; it is left to an implementation to support it or not.

When there is no match, the result is an empty list. Again, this gives us no clue where the relevant values should be injected if the data structure does not already hold them.
In the current spec, Application Resource Mapping defines `.envs` and `.volumes`, and usually these are optional in application resources. Hence, the provided JSONPath is going to return an empty list, and we are not able to figure out where the `volume`/`env` entries should be injected.
I would propose that we replace JSONPaths with JSON Pointers: a well-known standard for locating data in JSON structures (among others, `kustomize` uses it).
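To illustrate why a JSON Pointer pins down a location: it can be resolved token by token through maps and slices, so the same pointer that finds a value also tells you where to write one back. A stdlib-only sketch (the `resolve` helper is hypothetical, not from any implementation):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolve walks an RFC 6901 JSON Pointer through nested maps and slices.
// It returns the located value and whether the pointer matched.
func resolve(doc interface{}, pointer string) (interface{}, bool) {
	if pointer == "" {
		return doc, true
	}
	for _, token := range strings.Split(strings.TrimPrefix(pointer, "/"), "/") {
		// Unescape per RFC 6901: ~1 -> "/", ~0 -> "~".
		token = strings.ReplaceAll(strings.ReplaceAll(token, "~1", "/"), "~0", "~")
		switch node := doc.(type) {
		case map[string]interface{}:
			v, ok := node[token]
			if !ok {
				return nil, false
			}
			doc = v
		case []interface{}:
			i, err := strconv.Atoi(token)
			if err != nil || i < 0 || i >= len(node) {
				return nil, false
			}
			doc = node[i]
		default:
			return nil, false
		}
	}
	return doc, true
}

func main() {
	spec := map[string]interface{}{
		"spec": map[string]interface{}{
			"containers": []interface{}{
				map[string]interface{}{"name": "app"},
			},
		},
	}
	name, _ := resolve(spec, "/spec/containers/0/name")
	fmt.Println(name) // app
}
```

Because every step except the last yields the parent container of the target, an update is a single map or slice assignment at that parent, which is exactly the link JSONPath does not give us.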
> to bind into arrays that contain these structures (imagine something like containers that are not quite corev1.Container).
As I understand, there are two primary use cases for Application Resource Mapping, the second being intermediate resources (e.g. a `CronJob`'s `jobTemplate`). In both of these cases, I don't see a requirement to support an array. BTW, the second example given for `CronJob` in the spec is not very realistic, because the previous example has a better solution. In the intermediate resource scenario, repeating the same data inside a nested data structure is not required. I think dot-separated strings would be sufficient to handle the current use cases.
In order to support something like `RuntimeComponentSpec`, we should introduce the notion of a duck-typed PodSpecable resource, something @baijum pointed out in https://github.com/kubepreset/kubepreset/wiki/Application-Resource-Mapping:
```go
type PodSpecable struct {
	InitContainers []corev1.Container `json:"initContainers,omitempty"`
	Containers     []corev1.Container `json:"containers,omitempty"`
	Volumes        []corev1.Volume    `json:"volumes,omitempty"`
}
```
The structure can have any other fields, but if it has these, we know what to inject and where. The location of that structure can be specified via a new field:
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata:
  name: # string
  generation: # int64, defined by the Kubernetes control plane
...
spec:
  versions: # []Version
  - version: # string
    podSpecable: # jsonpointer or dot separated path
```
The expression pointing to the location of the PodSpecable structure does not need to contain any wildcards, and with that we already cover a huge number of cases. IMHO, it is very unlikely to have `containers` and `volumeMounts` living in two very different parts of a resource, i.e. they have a common parent. Hence, we can move away from the need to support JSONPath and use simpler standards, e.g. JSON Pointer.
@pedjak Can you please add a comment with the CR that you'd write for `RuntimeComponentSpec`? A quick review of the CRD seems to indicate that it's not PodSpecable by this definition.
@nebhale I see that the term `podSpecable` here is misleading: we are not looking for resources that are exactly of type `PodSpec`, we are looking for duck-typed structures that have some of the interesting fields:
```go
type PodSpecable struct {
	InitContainers []corev1.Container `json:"initContainers,omitempty"`
	Containers     []corev1.Container `json:"containers,omitempty"`
	Volumes        []corev1.Volume    `json:"volumes,omitempty"`
}
```
Everything is optional. The rest of the fields in the found structure are not of interest to the operator.
For a RuntimeComponent CR like:
```yaml
apiVersion: app.stacks/v1beta1
kind: RuntimeComponent
metadata:
  name: my-app
spec:
  applicationImage: quay.io/my-repo/my-app:1.0
  service:
    type: ClusterIP
    port: 9080
  expose: true
  storage:
    size: 2Gi
    mountPath: "/logs"
```
a `ClusterApplicationResourceMapping` could look like:
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata:
  name: runtimecomponents.app.stacks
...
spec:
  versions: # []Version
  - version: '*'
    podSpecable: .spec
```
And then it is easy to create the `volumes` collection and an entry in it. If the `RuntimeComponent` CR declares `initContainers`, then it is easy to inject a proper `volumeMount` into them.
This works well for the `CronJob` mentioned in the spec; we could have the following unique `ClusterApplicationResourceMapping`:
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata:
  name: cronjobs.batch
spec:
  versions:
  - version: "*"
    podSpecable: .spec.jobTemplate.spec.template.spec
```
> For a RuntimeComponent CR like:
>
> ```yaml
> apiVersion: app.stacks/v1beta1
> kind: RuntimeComponent
> metadata:
>   name: my-app
> spec:
>   applicationImage: quay.io/my-repo/my-app:1.0
>   service:
>     type: ClusterIP
>     port: 9080
>   expose: true
>   storage:
>     size: 2Gi
>     mountPath: "/logs"
> ```
>
> `ClusterApplicationResourceMapping` could look like:
>
> ```yaml
> apiVersion: service.binding/v1alpha2
> kind: ClusterApplicationResourceMapping
> metadata:
>   name: runtimecomponents.app.stacks
> ...
> spec:
>   versions: # []Version
>   - version: '*'
>     podSpecable: .spec
> ```
How do you bind a service to the container with the image `quay.io/my-repo/my-app:1.0`? Since it's not within a `[]corev1.Container` it would seem to be inaccessible.
> How do you bind a service to the container with the image `quay.io/my-repo/my-app:1.0`?
So, it seems that the `.spec` of `RuntimeComponent` looks like a `corev1.Container` with a twist: it contains `initContainers` and `sidecarContainers` as well. We can add an arbitrary number of fields that could be interesting for the operator:
```go
type PodSpecable struct {
	InitContainers []corev1.Container `json:"initContainers,omitempty"`
	Containers     []corev1.Container `json:"containers,omitempty"`
	Volumes        []corev1.Volume    `json:"volumes,omitempty"`
	VolumeMounts   []corev1.VolumeMount
	Env            []corev1.EnvVar
	// ...
}
```
So, in order to bind, the operator locates the structure that should be modified, looks at which fields are available, and based on that adds an entry, in this case, under `volumes` and `volumeMounts`.
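That locate-then-inject flow could be sketched as follows; this is a stdlib-only illustration with hypothetical helper names, operating on the unstructured `map[string]interface{}` form of the located duck-typed structure:

```go
package main

import "fmt"

// injectVolume appends a volume entry to the "volumes" field of the
// located duck-typed structure, creating the collection if it is absent.
func injectVolume(podSpecable map[string]interface{}, volume map[string]interface{}) {
	volumes, _ := podSpecable["volumes"].([]interface{})
	podSpecable["volumes"] = append(volumes, volume)
}

// injectMount appends a volumeMount to every container in each container
// list that the structure actually declares; absent fields are skipped.
func injectMount(podSpecable map[string]interface{}, mount map[string]interface{}) {
	for _, key := range []string{"initContainers", "containers"} {
		containers, ok := podSpecable[key].([]interface{})
		if !ok {
			continue // field absent: nothing to inject here
		}
		for _, c := range containers {
			container := c.(map[string]interface{})
			mounts, _ := container["volumeMounts"].([]interface{})
			container["volumeMounts"] = append(mounts, mount)
		}
	}
}

func main() {
	spec := map[string]interface{}{
		"containers": []interface{}{map[string]interface{}{"name": "app"}},
	}
	injectVolume(spec, map[string]interface{}{"name": "binding"})
	injectMount(spec, map[string]interface{}{"name": "binding", "mountPath": "/bindings/db"})
	fmt.Println(len(spec["volumes"].([]interface{}))) // 1
}
```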
I agree that the currently spec'd behavior is underdefined and has issues, but I'm not convinced we need to take steps as dramatic as discussed here. Since the `ClusterApplicationResourceMapping` is a power-user feature that is focused on adding support for resource shapes we don't know about, I think we can put more emphasis on flexibility over ease of use. Restricting support to resources that have a PodSpec is easy to implement but, ultimately, not as useful.
The essential capabilities we need to bind a service into an application workload are:

- a single `[]corev1.Volume`
- for each container:
  - a single `[]corev1.EnvVar`
  - a single `[]corev1.VolumeMount`

If we have access to a `corev1.PodTemplateSpec` we can infer everything else and have a richer experience. But if we require a `PodTemplateSpec`, then flexibility is lost.
We need the flexibility to find container-esque shapes wherever they may be in the spec; JSONPath is a good fit. We also need a reliable, firm reference within a found container-esque object to its `env` and `volumeMounts` fields; that is JSON Pointer territory. While I'm not crazy about requiring two distinct grammars, we can explore a limited subset of JSONPath that gives us a JSON Pointer-like experience.
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata: # metav1.ObjectMeta
...
spec:
  versions:
  - version: # string
    containers:
    - path: # string(JSONPath)
      name: # string(JSON Pointer), optional
      env: # string(JSON Pointer), defaults to "/env"
      volumeMounts: # string(JSON Pointer), defaults to "/volumeMounts"
    volumes: # string(JSON Pointer)
```
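A sketch of how the two grammars could compose: a limited JSONPath subset (dot-separated fields plus a `[*]` fan-out) finds the container-esque shapes, and within each match a JSON Pointer such as `/env` addresses a fixed field. The `findContainers` helper below is hypothetical, stdlib-only Go, not any implementation's API:

```go
package main

import (
	"fmt"
	"strings"
)

// findContainers expands a limited JSONPath subset: dot-separated field
// names, where a trailing "[*]" fans out over every element of a slice.
func findContainers(doc interface{}, path string) []map[string]interface{} {
	nodes := []interface{}{doc}
	for _, step := range strings.Split(strings.TrimPrefix(path, "."), ".") {
		wildcard := strings.HasSuffix(step, "[*]")
		field := strings.TrimSuffix(step, "[*]")
		var next []interface{}
		for _, n := range nodes {
			m, ok := n.(map[string]interface{})
			if !ok {
				continue
			}
			v, ok := m[field]
			if !ok {
				continue
			}
			if wildcard {
				if items, ok := v.([]interface{}); ok {
					next = append(next, items...)
				}
			} else {
				next = append(next, v)
			}
		}
		nodes = next
	}
	var out []map[string]interface{}
	for _, n := range nodes {
		if m, ok := n.(map[string]interface{}); ok {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	deployment := map[string]interface{}{
		"spec": map[string]interface{}{
			"template": map[string]interface{}{
				"spec": map[string]interface{}{
					"containers": []interface{}{
						map[string]interface{}{"name": "app"},
						map[string]interface{}{"name": "sidecar"},
					},
				},
			},
		},
	}
	matches := findContainers(deployment, ".spec.template.spec.containers[*]")
	for _, c := range matches {
		// Within each match, the pointer "/env" is just a single map key.
		c["env"] = []interface{}{map[string]interface{}{"name": "SERVICE_BINDING_ROOT", "value": "/bindings"}}
	}
	fmt.Println(len(matches)) // 2
}
```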
For the `RuntimeComponent` resource, it would look like:
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata:
  name: runtimecomponents.app.stacks
spec:
  versions:
  - version: '*'
    containers:
    - path: .spec
    - path: .spec.initContainers[*]
      name: /name
    - path: .spec.sidecarContainers[*]
      name: /name
    volumes: /spec/volumes
```
What's interesting about RuntimeComponent is that there are two obvious arrays of containers, but there's also a single implicit container in the root. This approach is flexible enough to capture all three.
A traditional PodSpecable resource like `Deployment` would look like:
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata:
  name: deployments.apps
spec:
  versions:
  - version: '*'
    containers:
    - path: .spec.template.spec.containers[*]
      name: /name
    - path: .spec.template.spec.initContainers[*]
      name: /name
    volumes: /spec/template/spec/volumes
```
A non-PodSpecable resource like `CronJob` would look like:
```yaml
apiVersion: service.binding/v1alpha2
kind: ClusterApplicationResourceMapping
metadata:
  name: cronjobs.batch
spec:
  versions:
  - version: '*'
    containers:
    - path: .spec.jobTemplate.spec.template.spec.containers[*]
      name: /name
    - path: .spec.jobTemplate.spec.template.spec.initContainers[*]
      name: /name
    volumes: /spec/jobTemplate/spec/template/spec/volumes
```
> Restricting support to resources that have a PodSpec is easy to implement, but ultimately, not as useful.
This is not a real `PodSpec`; it is just a duck-typed structure that can have some fields of `PodSpec`, the fields that we know how to handle. IMHO, `containers`, `volumes`, and the others have the same parent, i.e. they belong to the same structure; they are not scattered across the application CR.
> The essential capabilities we need to bind a service into an application workload are: a single `[]corev1.Volume`; for each container: a single `[]corev1.EnvVar`, a single `[]corev1.VolumeMount`
With that, we assume that every application CR's semantics are the same as `PodSpec`'s, just allowing that the location and names of these fields might be different. What if an application CR does not follow those semantics? What if the application CR holds just a reference to the binding secret and we want to inject only that? The current abstraction does not allow us to do that.
> We need the flexibility to find container-esque shapes wherever they may be in the spec; JSONPath is a good fit.
JSONPath is for querying data, not for specifying a location. IMHO, containers/pods are all to be found in the same part of the resource, so there is no need to use the wildcard operator, and thus we really do not need JSONPath. For example, `.spec.containers` returns the list of all containers; the same can be achieved with the JSON Pointer `/spec/containers`.
@pedjak What concerns me about
> We can add an arbitrary number of fields that could be interesting for the operator:
is that it requires the specification to change to support these use-cases. Especially if we expect the spec and implementations to be stable, we shouldn't require a change to them whenever a new unexpected CR comes into existence months or years from now.
> is that it requires the specification to change to support these use-cases. Especially if we expect the spec and implementations to be stable, we shouldn't require a change to them whenever a new unexpected CR comes into existence months or years from now.
Sure, I agree. What I meant is that we can come up with a set of fields we know how to handle and put them in the spec.
I pulled together a proof of concept for my proposal above to use JSONPath to discover containers and JSON Pointer to address pod and container level fields.
The implementation works by converting any runtime object into an unstructured form. A mapping definition specifies the location of key elements of the resource necessary for the binding. Using the mapping definition, the unstructured resource is converted into a normalized structured form. That normalized form can then be manipulated to apply the service bindings. Finally, the inverse conversion is applied using the same mapping definition to map the mutated state on to the original object.
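The convert → normalize → mutate → invert flow can be illustrated with nothing but `encoding/json`; this is a sketch of the idea, not the PoC's actual code, and the `Workload`/`Container` types and helper names here are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Container is a hypothetical typed stand-in for part of a workload resource.
type Container struct {
	Name string              `json:"name"`
	Env  []map[string]string `json:"env,omitempty"`
}

type Workload struct {
	Containers []Container `json:"containers"`
}

// toUnstructured and fromUnstructured round-trip through JSON, the same
// trick apimachinery uses for its unstructured conversion.
func toUnstructured(obj interface{}) (map[string]interface{}, error) {
	raw, err := json.Marshal(obj)
	if err != nil {
		return nil, err
	}
	var u map[string]interface{}
	return u, json.Unmarshal(raw, &u)
}

func fromUnstructured(u map[string]interface{}, obj interface{}) error {
	raw, err := json.Marshal(u)
	if err != nil {
		return err
	}
	return json.Unmarshal(raw, obj)
}

func main() {
	original := Workload{Containers: []Container{{Name: "app"}}}

	u, _ := toUnstructured(original)
	// Mutate the normalized form: add an env var to every container.
	for _, c := range u["containers"].([]interface{}) {
		container := c.(map[string]interface{})
		container["env"] = []interface{}{
			map[string]interface{}{"name": "SERVICE_BINDING_ROOT", "value": "/bindings"},
		}
	}

	// Apply the inverse conversion to map the mutation back onto a typed object.
	var mutated Workload
	_ = fromUnstructured(u, &mutated)
	fmt.Println(mutated.Containers[0].Env[0]["value"]) // /bindings
}
```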
For the moment, the only aspect of the service binding behavior I have implemented is the detection and defaulting of the `SERVICE_BINDING_ROOT` env var. Full spec-compliant behavior can be added here.
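That detection-and-defaulting step can be sketched in stdlib-only Go; the helper name and the default value `/bindings` are this sketch's assumptions, not necessarily what the PoC or the spec uses:

```go
package main

import "fmt"

// ensureServiceBindingRoot returns the env list (as unstructured maps) with
// SERVICE_BINDING_ROOT guaranteed to be present, plus its effective value.
// An existing entry wins; otherwise a default of "/bindings" is appended.
func ensureServiceBindingRoot(env []interface{}) ([]interface{}, string) {
	for _, e := range env {
		if m, ok := e.(map[string]interface{}); ok && m["name"] == "SERVICE_BINDING_ROOT" {
			root, _ := m["value"].(string)
			return env, root
		}
	}
	def := map[string]interface{}{"name": "SERVICE_BINDING_ROOT", "value": "/bindings"}
	return append(env, def), "/bindings"
}

func main() {
	env, root := ensureServiceBindingRoot(nil)
	fmt.Println(root, len(env)) // /bindings 1
}
```

The returned root is what the volume-mount paths are then based on, which is the env/volumeMounts asymmetry discussed earlier in this thread.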
There are tests covering applying the binding, mapping an object to the meta-model, and mapping from the meta-model back to an object, on both a PodSpecable resource (`Deployment`) and a non-PodSpecable resource (`CronJob`). Knowledge of these types is limited to the test suite.
There are zero special dependencies. This behavior is implemented using `k8s.io/api`, `k8s.io/apimachinery`, and `k8s.io/client-go`. `github.com/google/go-cmp` is only used by the test suite to diff the expected and actual objects.
Binding a Deployment with a default mapping:
Binding a CronJob with a custom mapping:
tl;dr it works
Application Resource Mapping specifies JSONPath expressions to identify `containers`, `envs`, `volumeMounts`, and `volumes`. The unstructured API of the apimachinery package expects a list of fields to set values on an object. For example, to set volumes:

I couldn't figure out a way to deterministically resolve the fields from a JSONPath expression. Is that possible? If not, we should reconsider using JSONPath expressions to specify fields.
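One observation on the question above: for wildcard-free expressions, the mapping from a JSONPath to the field list expected by `SetNested*`-style unstructured helpers is mechanical; determinism only breaks once wildcards or filters appear. A hypothetical stdlib-only sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// fieldsFromPath converts a wildcard-free JSONPath such as
// "{.spec.template.spec.volumes}" into the field chain expected by
// SetNested*-style helpers. It refuses expressions containing wildcards
// or filters, since those do not name a single location.
func fieldsFromPath(expr string) ([]string, error) {
	expr = strings.TrimSuffix(strings.TrimPrefix(expr, "{"), "}")
	expr = strings.TrimPrefix(expr, ".")
	if strings.ContainsAny(expr, "*?@[") {
		return nil, fmt.Errorf("%q does not identify a unique location", expr)
	}
	return strings.Split(expr, "."), nil
}

func main() {
	fields, err := fieldsFromPath("{.spec.template.spec.volumes}")
	fmt.Println(fields, err) // [spec template spec volumes] <nil>

	// A wildcard expression cannot be resolved to one field chain.
	_, err = fieldsFromPath("{.spec.containers[*].env}")
	fmt.Println(err != nil) // true
}
```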