Open justenstall opened 10 months ago
Thanks for raising this @justenstall
This caused me to think about https://twitter.com/stefanprodan/status/1710632837362147563
On that basis I'll copy in @stefanprodan and @sdboyer
Some of the features described here are currently implemented in timoni mod vendor crd. We can currently generate CUE schemas from CRD manifests made with controller-gen and kubebuilder markers, which covers most CNCF projects. We handle x-kubernetes-preserve-unknown-fields, which was quite challenging: the cue openapi package makes everything open by default, whereas only fields with this property should be open. Sam found a way to deal with this using kin-openapi. For CRDs that have anyOf and oneOf (luckily these are not supported by controller-gen, so not many CRDs have them), the schema generated by cue is invalid (tracked in #2686). Next we'll have to deal with x-kubernetes-int-or-string.
As for CEL, I agree that it is a major undertaking, but I suspect most CRDs will switch to CEL in 2024. Kubernetes Gateway API already uses CEL, and we're also considering using CEL for FluxCD; I guess many more projects will follow. Given this, without a CEL to CUE translator, I think the value prop of using CUE for Kubernetes object validation will be less attractive. To overcome this, I'm inclined to run CEL inside Timoni after CUE generates the final values, in the same way the Kubernetes API does, but it would be ideal for CUE to provide a CEL translator, at least for the expressions that are compatible.
Linking the following in case they could be of use:
I put together a rough demo for this functionality using the Timoni CRD importer as a starting point. By switching from kin-openapi to Kubernetes' internal representation of a CRD, it should hopefully be easier to work with their API extensions. The only additional functionality in the demo is that all Kubernetes extensions to OpenAPI (x-kubernetes-* fields) are preserved in the resulting CUE as attributes.
I think this sort of implementation would be the best way to make CRD imports lossless and would allow the CRD importing functionality to be made available independent of a CEL to CUE translator, which could theoretically be added at a later date.
https://github.com/justenstall/cue/blob/get-crd/encoding/crd/decode.go
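As a rough illustration of the idea (this is not the linked implementation), the sketch below walks a decoded structural schema and records which x-kubernetes-* keys appear at each field path, which is the information an importer would need to carry over as attributes. The function names and the sample schema are hypothetical, and it only descends into properties and items.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// collectExtensions records, per dotted field path, the x-kubernetes-* keys
// found on each schema node. It follows "properties" and "items" only.
func collectExtensions(schema map[string]any, path string, out map[string][]string) {
	for key := range schema {
		if strings.HasPrefix(key, "x-kubernetes-") {
			out[path] = append(out[path], key)
		}
	}
	sort.Strings(out[path]) // map iteration order is random, so sort for stable output

	if props, ok := schema["properties"].(map[string]any); ok {
		for name, raw := range props {
			if child, ok := raw.(map[string]any); ok {
				collectExtensions(child, joinPath(path, name), out)
			}
		}
	}
	if items, ok := schema["items"].(map[string]any); ok {
		collectExtensions(items, path+"[]", out)
	}
}

// joinPath appends a field name to a dotted path, handling the root case.
func joinPath(parent, name string) string {
	if parent == "" {
		return name
	}
	return parent + "." + name
}

// extensionsFromJSON decodes a JSON-encoded schema and collects its extensions.
func extensionsFromJSON(data []byte) (map[string][]string, error) {
	var schema map[string]any
	if err := json.Unmarshal(data, &schema); err != nil {
		return nil, fmt.Errorf("decoding schema: %w", err)
	}
	out := make(map[string][]string)
	collectExtensions(schema, "", out)
	return out, nil
}

func main() {
	// Hypothetical schema fragment, for illustration only.
	sample := []byte(`{
	  "type": "object",
	  "properties": {
	    "spec": {
	      "type": "object",
	      "x-kubernetes-preserve-unknown-fields": true,
	      "properties": {
	        "port": {"x-kubernetes-int-or-string": true}
	      }
	    }
	  }
	}`)
	exts, err := extensionsFromJSON(sample)
	if err != nil {
		panic(err)
	}
	for path, keys := range exts {
		fmt.Println(path, keys)
	}
}
```

A real importer would also need to follow additionalProperties, allOf and similar keywords; those are left out here to keep the sketch short.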
Example
For extensions.istio.io/v1alpha1.WasmPlugin, the field spec.url uses x-kubernetes-validations:
url:
  description: URL of a Wasm module or OCI container.
  minLength: 1
  type: string
  x-kubernetes-validations:
  - message: url must have schema one of [http, https, file, oci]
    rule: 'isURL(self) ? (url(self).getScheme() in ['''', ''http'',
      ''https'', ''oci'', ''file'']) : (isURL(''http://'' + self) &&
      url(''http://'' +self).getScheme() in ['''', ''http'', ''https'',
      ''oci'', ''file''])'
This is the resulting CUE in cue.mod/gen/extensions.istio.io/wasmplugin/v1alpha1/types_gen.cue:
url: strings.MinRunes(1) @crd(validations="""
[{"rule":"isURL(self) ? (url(self).getScheme() in ['', 'http', 'https', 'oci', 'file']) : (isURL('http://' + self) \u0026\u0026 url('http://' +self).getScheme() in ['', 'http', 'https', 'oci', 'file'])","message":"url must have schema one of [http, https, file, oci]"}]
""")
@justenstall this looks promising, I would like to give this a try in Timoni and run the CEL rules using k8s.io/apiserver/pkg/apis/cel.
PS. It would be nice to preserve the original license headers when you copy/paste code from other projects.
@stefanprodan
@justenstall this looks promising, I would like to give this a try in Timoni and run the CEL rules using k8s.io/apiserver/pkg/apis/cel.
Is the thought that Timoni would run the CEL rules during timoni mod vet/timoni bundle vet?
PS. It would be nice to preserve the original license headers when you copy/paste code from other projects.
My bad on the license headers, I'll fix that.
Is the thought that Timoni would run the CEL rules during timoni mod vet/timoni bundle vet?
I think it is the only viable alternative to a CEL to CUE translator.
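If a tool were to run the preserved rules at vet time, a first step would be reading them back out of the @crd(validations=...) attribute body shown in the example earlier. A minimal sketch, assuming the attribute body is a JSON array of objects with rule and message keys as in that example; the type and function names are hypothetical, and actually evaluating the rules with a CEL library is out of scope here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validationRule mirrors the two keys visible in the attribute body.
type validationRule struct {
	Rule    string `json:"rule"`
	Message string `json:"message"`
}

// parseValidations decodes the JSON body of a validations attribute.
// JSON escapes such as \u0026 are turned back into plain characters (&).
func parseValidations(body string) ([]validationRule, error) {
	var rules []validationRule
	if err := json.Unmarshal([]byte(body), &rules); err != nil {
		return nil, fmt.Errorf("decoding validations: %w", err)
	}
	return rules, nil
}

func main() {
	// Shortened, hypothetical rule used for illustration only.
	body := `[{"rule":"isURL(self) \u0026\u0026 url(self).getScheme() == 'https'","message":"url must use https"}]`

	rules, err := parseValidations(body)
	if err != nil {
		panic(err)
	}
	for _, r := range rules {
		fmt.Printf("rule: %s\nmessage: %s\n", r.Rule, r.Message)
	}
}
```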
By switching from kin-openapi to Kubernetes' internal representation of a CRD it should hopefully be easier to work with their API extensions.
I can't speak for the CUE team, personally I would not want Kubernetes as a dependency in CUE lang. This means the Kubernetes project can't use CUE as it would create a cyclic dependency, it also means that all the Kubernetes tools like Timoni would need to wait for CUE to bump the Kubernetes packages before they can upgrade. For example, Kubernetes 1.18 comes with a breaking change to their OpenAPI package, so all tools require a step upgrade. If your tool imports some package that depends on Kubernetes 1.17, you are stuck on 1.17 until all deps move to 1.18. This is the case with Flux, we are stuck on 1.17 for months now.
Dropping links for reference:
This is great and something I've wanted myself for a long time. A lot of these CRDs are generated from Go structs, so it would be great if it were possible to convert CUE (from cue get go) to a Kubernetes CRD.
@justenstall thanks for creating #2701 as a strawman for discussion.
I can't speak for the CUE team, personally I would not want Kubernetes as a dependency in CUE lang.
As a guiding rule, we want to keep the dependencies for cuelang.org/go (and by extension cmd/cue, based on the current structure) small. Hence we would almost certainly not add Kubernetes as a dependency.
However, I think #2701 is a good strawman to act as a forcing function to ask the question: so where should this kind of adapter live? How could/should cmd/cue (and more generally cue/load and the various encoding/X adapters) work with such a setup?
@myitcv
I agree with you and @stefanprodan that CUE shouldn't have a dependency on Kubernetes. I can vendor the data types from Kubernetes for the PoC. If there are any other ideas for how that should be done, let me know.
I do think there's a larger conversation to be had around the setup of where adapters/converters should live. If the dependencies for the CRD --> CUE conversion were isolated to the converter, I think it's logical that the converter would use Kubernetes' internal parser and data type for CRDs.
If converters could be isolated, they could rely on tailor-made parsers for JSON Schema/OpenAPI/CRDs/etc, which could have a lot of benefits:
The CRD example is somewhat unique in comparison to other importable formats (for instance, a dependency on an OpenAPI library would not be as controversial), but in my opinion it's still a worthwhile conversation.
IMO, it would be amazing to see a more consistent set of encoders/adapters for other representations to/from CUE. I've spent considerable time writing such things over the past couple of years, and can attest that:
cue.Value
Kubernetes Gateway API already uses CEL, and we're also considering using CEL for FluxCD, guess many more project will follow. Given this, without a CEL to CUE translator, I think the value prop of using CUE for Kubernetes objects validation will be less attractive.
I didn't know this was where things were headed (:hat-tip:), but to me, this is all the more reason that a CUE->CEL translator would be valuable. Otherwise, CUE's probably just out of the game here, regardless of its other potential benefits.
I'd add to this that CUE is a non-starter as the engine for executing validations within an apiserver as long as a single CUE runtime can't be safely used for validation from multiple goroutines.
@sdboyer I started a Slack thread for encoding discussion, would love to get your input: https://cuelang.slack.com/archives/CMY132JKY/p1701367946843919
I wonder whether, with the upcoming CUE modules and OCI registries feature, a central CUE registry like nixpkgs would make sense, hosting all the packages, such as CRDs. Wouldn't that more or less eliminate the need for commands that are specifically tailored to certain types of files? Over the last year I also had a lot of trouble importing CRDs into CUE and always went with cue get go, but I think having CRDs in a registry and just using the dependency management would be more "platform agnostic" than having commands for special cases.
@kharf
It could help the case but someone would still have to manually produce a CUE module for the CRD, which won't happen for every CRD, and won't be of consistent quality. If CUE could comprehend a CRD's schemas then every CRD becomes usable in CUE at a guaranteed level of quality. Anyone wanting an even better representation of the CRD can still publish a CUE module either starting from scratch or starting from generated CUE.
Is your feature request related to a problem? Please describe.
CUE cannot import Kubernetes CustomResourceDefinitions (CRDs). Importing CRDs into CUE would be a huge value add for Kubernetes users.
Describe the solution you'd like
I would like a cue import crd command that is able to import a YAML- or JSON-encoded CustomResourceDefinition to CUE.
Describe alternatives you've considered
cue import openapi
Since CRDs use a variation of OpenAPIv3 to define their schema, cue import openapi can be used to import the schema with some extra help from a query tool like yq. Here is an example script to extract a valid OpenAPI definition from the CRD with yq and then import it as CUE:
This is not a viable solution because of Kubernetes' OpenAPIv3 extensions explained in the Additional Context section below.
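For readers who want a feel for the extraction step, here is a rough stand-in written in Go rather than yq; it is not the script referred to above. It assumes the CRD has already been converted to JSON and follows the apiextensions.k8s.io/v1 layout (spec.names.kind, spec.versions[].name, spec.versions[].schema.openAPIV3Schema). The wrapper document shape, the type and function names, and the Widget sample are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// crdDoc declares only the CRD fields this sketch reads.
type crdDoc struct {
	Spec struct {
		Names struct {
			Kind string `json:"kind"`
		} `json:"names"`
		Versions []struct {
			Name   string `json:"name"`
			Schema struct {
				OpenAPIV3Schema json.RawMessage `json:"openAPIV3Schema"`
			} `json:"schema"`
		} `json:"versions"`
	} `json:"spec"`
}

// wrapSchemas returns one minimal OpenAPI document per CRD version,
// keyed by version name, with the schema placed under components.schemas.
func wrapSchemas(crdJSON []byte) (map[string]string, error) {
	var crd crdDoc
	if err := json.Unmarshal(crdJSON, &crd); err != nil {
		return nil, fmt.Errorf("decoding CRD: %w", err)
	}
	docs := make(map[string]string)
	for _, v := range crd.Spec.Versions {
		if len(v.Schema.OpenAPIV3Schema) == 0 {
			continue // this version declares no schema
		}
		doc := map[string]any{
			"openapi": "3.0.0",
			"info":    map[string]any{"title": crd.Spec.Names.Kind, "version": v.Name},
			"components": map[string]any{
				"schemas": map[string]any{crd.Spec.Names.Kind: v.Schema.OpenAPIV3Schema},
			},
		}
		out, err := json.Marshal(doc)
		if err != nil {
			return nil, fmt.Errorf("encoding OpenAPI document: %w", err)
		}
		docs[v.Name] = string(out)
	}
	return docs, nil
}

func main() {
	// Hypothetical CRD, trimmed to the fields the sketch needs.
	crd := []byte(`{
	  "spec": {
	    "names": {"kind": "Widget"},
	    "versions": [
	      {"name": "v1", "schema": {"openAPIV3Schema": {"type": "object"}}}
	    ]
	  }
	}`)
	docs, err := wrapSchemas(crd)
	if err != nil {
		panic(err)
	}
	for version, doc := range docs {
		fmt.Println(version, doc)
	}
}
```

Any x-kubernetes-* keys inside the schema pass through untouched, so the limitation described above still applies to whatever consumes the result.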
cue get go
Many CRDs are created with a code generation tool such as controller-gen (from the kubebuilder SDK), which generates the CRD from annotated Go code. I have used cue get go to import the CRD from the Go types, but it is not a 100% match for the controller-gen-created schema. Adding direct support for CRD --> CUE is ideal.
Additional context
OpenAPIv3 vs Structural Schema
Kubernetes calls its flavor of OpenAPIv3 "structural schema", which is defined as follows:
Source: Specifying a structural schema
CEL Validation
As seen in the snippet above, Kubernetes extends the OpenAPIv3 syntax with fields prefixed by x-kubernetes-. The most challenging to incorporate will be the newer x-kubernetes-validations field, used to define validation rules with the Common Expression Language (CEL). Translating CEL to CUE will be a big lift, but CEL is being cemented as the desired way for k8s developers to define validation rules in CRDs, k8s admission controller webhooks, and eventually even the native Kubernetes APIs. CEL validation will be incompatible with CUE in some cases, because it allows validating an updated Kubernetes object by comparing it to its existing values. Ideally, validation rules that have no CUE equivalent can be preserved so the CRD can round-trip CRD --> CUE --> CRD without losing information.
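To make the incompatible case concrete: in Kubernetes, rules that compare the updated object with its existing value refer to the old value as oldSelf. The sketch below separates such rules from the rest with a purely textual check. It is deliberately naive (it would also match oldSelf inside a string literal), the function names are hypothetical, and the sample rules are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// oldSelfRef matches the identifier oldSelf as a whole word.
var oldSelfRef = regexp.MustCompile(`\boldSelf\b`)

// isTransitionRule reports whether a CEL rule mentions oldSelf, meaning it
// needs the previous object state and cannot be checked on one value alone.
func isTransitionRule(rule string) bool {
	return oldSelfRef.MatchString(rule)
}

// splitRules partitions rules into those that only look at the current value
// and those that also need the old value.
func splitRules(rules []string) (stateless, transition []string) {
	for _, r := range rules {
		if isTransitionRule(r) {
			transition = append(transition, r)
		} else {
			stateless = append(stateless, r)
		}
	}
	return stateless, transition
}

func main() {
	// Hypothetical rules for illustration only.
	rules := []string{
		"self.minReplicas <= self.maxReplicas",
		"self.replicas >= oldSelf.replicas",
	}
	stateless, transition := splitRules(rules)
	fmt.Println("candidates for translation:", stateless)
	fmt.Println("preserve as-is:", transition)
}
```

An importer following this approach could attempt translation only for the first group and keep the second group verbatim so the round trip stays lossless.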
Possible existing implementation?
The file encoding/openapi/crd.go contains a seemingly unused implementation of CRD decoding. If anyone is familiar with this code and knows what state it is in, I would appreciate the help as a new contributor.