tracking: CRD versioning macro

fhennig commented 9 months ago

This tracks the implementation of a first working version of the #[versioned] macro. Further improvements will be done as part of our normal development workflow as well as alongside #642.

General

Currently, all our CRDs are defined by Rust structs. This works really well for multiple reasons:

Structs can directly be used in the operator code
Can be serialized in YAML for for use in Helm or other deployments mechanisms
Deserialize directly from Kubernetes API calls

This strict type safety ensures we build code which will never crash during runtime due to invalid types. The future introduction of CRD versions warrants the same strict typing for different version of the CRD. Additionally, we want to be able to upgrade (and downgrade) between versions in a controlled and automated manner. Rust already provides traits for conversion of types: From<T>, TryFrom<T> and friends. A goal of this implementation should be support for this native mechanism.

Requirements

Common mechanisms for CRD versioning must live in a common repository
Different versions of the same struct without much overhead. This overhead includes:
- The need to copy and paste the same struct for multiple versions
- Manually keeping track of changes between versions and how to properly convert between them
- Conversion should be a simple one method call, no need to manually assign fields from the old struct to the new one
Support for versioned field changes:
- Add a new field
- Rename a field
- Deprecate a field
Field changes can be annotated with notes, hints, etc...
The derive macro must surface meaningful error messages
Handle recursive structs and enums

Vision

Initial Sketch

### Adding the Derive Macro The derive macro uses a simple and fitting name. A few examples are: `Versioned`, `Version` and `Update`. The macro additionally uses an attribute to customize the code generation. The name of that attribute will be the same as the derive macro name, just all lowercase letters. ```rust // The derive macro is used just like any other derive macro like Debug, // Deserialize or JsonSchema #[derive(Debug, Versioned)] pub struct MyCrd { foo: String, bar: usize, } // The macro can also be used on enums #[derive(Debug, Versioned)] pub enum MyEnum { Foo, Bar, } ``` The macro provides a mechanism to provide traits which should be derived on the generated structs/enums. Attributes like `#[serde]` or `#[schemars]` will be passed through as well. ```rust #[derive(Debug, Versioned)] #[versioned(derives(Debug, PartialEq))] #[serde(rename_all = "camelCase")] pub struct MyCrd { foo: String, } ``` ### Adding Versions Each distinct version of a struct/enum needs to be declared via the accommodating attribute. The `version` field can be used as often as one would like. Each definition creates a new versioned version of the base struct/enum. Each version name must be unique, enforced by the macro and surfaced with easy to understand error message. *(Optional)* Because we plan to publish this crate publicly for the Rust community, we don't mandate the use of Kubernetes versions as version names. Instead, the validation should be as generic as possible to allow usages outside of the context of Kubernetes. Initially we might want to only support SemVer and Kubernetes version formats. Each version must provide an implementation for `PartialEq` and `PartialOrd`. *(Optional)* Users can customize the visibility of the generated structs. ```rust #[derive(Debug, Versioned)] // Default visibility is `pub` #[versioned( version(name = "v1alpha1"), version(name = "v1beta1", visibility = "pub(crate)") )] pub struct MyCrd { foo: String, bar: usize, } #[versioned( version(name = "v1alpha1"), version(name = "v1beta1", visibility = "pub(crate)") )] #[derive(Debug, Versioned)] pub enum MyEnum { Foo, Bar, } ``` Each added version will generate a new struct or enum specific for that version. A basic example without any fields/variants (and semantics around them) could look like this: ```rust pub mod v1alpha1 { pub struct MyCrd { // Fields will be discussed in sections below } } pub(crate) mod v1beta1 { pub struct MyCrd { // Fields will be discussed in sections below } } // ################################################ pub mod v1alpha1 { pub enum MyEnum { // Variants will be discussed in sections below } } pub(crate) mod v1beta1 { pub enum MyEnum { // Variants will be discussed in sections below } } ``` The naming scheme uses Rust modules. Different version of the struct would then be separated by modules, like `v1alpha1::MyCrd` and `v1beta1::MyCrd`. Additionally, the macro provides stable modules, which for example always point to the latest version of the struct/enum. ```rust pub mod v1alpha1 { pub struct MyCrd; } pub mod v1beta1 { pub struct MyCrd; } // The operator can always use latest::MyCrd pub mod latest { pub use v1beta1::*; } ``` #### Deprecation of Versions CRDs can mark particular versions as [deprecated](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#version-deprecation). This can be achieved by providing the `deprecated` flag. Versioned structs will additionally include Rust's `#[deprecated]` attribute. ```rust #[derive(Debug, Versioned)] #[versioned( version(name = "v1alpha1", deprecated), version(name = "v1beta1") )] pub struct MyCrd { foo: String, bar: usize, } ``` The resulting generated code would look similar to this: ```rust pub mod v1alpha1 { #[deprecated] pub struct MyCrd { // Fields will be discussed in sections below } } pub mod v1beta1 { pub struct MyCrd { // Fields will be discussed in sections below } } ``` ### Field Attributes Each field can be attached with additional (but optional) attributes. These attributes indicate when the fields were added, renamed, deprecated and removed. Any field which doesn't have a attribute attached is presumed to exist since the earliest version (version `v1alpha1` in the example below). This feature is one reason why versions need to implement `PartialEq` and `PartialOrd`. Any attribute field cannot be combined with any other field if the same version is used. #### Add Fields This attribute marks fields as added in a specific version. This field will not appear in generated structs of earlier versions. ```rust #[derive(Debug, Versioned)] #[versioned( version(name = "v1alpha1"), version(name = "v1beta1") )] pub struct MyCrd { #[versioned(added(since = "v1beta1"))] foo: String, bar: usize, } ``` --- The resulting generated code would look similar to this: ```rust pub mod v1alpha1 { pub struct MyCrd { bar: usize, } } pub mod v1beta1 { pub struct MyCrdV1Beta1 { foo: String, bar: usize, } } ``` Adding required fields in CRDs is against the Kubernetes guidelines. Instead, CRD developers should use optional fields with a sane default value. There will be multiple pieces of Rust code to make this work: - Each field type needs to implement `Default`. - A custom default value can be provided using `#[versioned(added(default = default_fn))]`. This value will be used when using the `From` trait for conversion between versions. --- The `From for New` implementation will look similar to this: ##### Without Custom Default Value ```rust impl From for v1beta1::MyCrd { fn from(my_crd: v1alpha1::MyCrd) -> Self { Self { foo: Default::default(), bar: my_crd.bar, } } } ``` ##### With Custom Default Value ```rust impl From for v1beta1::MyCrd { fn from(my_crd: v1alpha1::MyCrd) -> Self { Self { foo: default_fn(), bar: my_crd.bar, } } } ``` #### Rename Fields This attribute marks fields as renamed in a specific version. This field will use the original name up to the version it was renamed in. The `From` trait implementation will move the value from the old field to the new field. ```rust #[derive(Debug, Versioned)] #[versioned( version(name = "v1alpha1"), version(name = "v1beta1") )] pub struct MyCrd { #[versioned(renamed(since = "v1beta1", to = "baz"))] foo: String, bar: usize, } ``` --- The resulting generated code would look similar to this: ```rust pub mod v1alpha1 { pub struct MyCrd { foo: String, bar: usize, } } pub mod v1beta1 { pub struct MyCrd { baz: String, bar: usize, } } ``` --- The `From for New` implementation will look similar to this: ```rust impl From for v1beta1::MyCrd { fn from(my_crd: v1alpha1::MyCrd) -> Self { Self { baz: my_crd.foo, bar: my_crd.bar, } } } ``` #### Deprecate Fields This attribute marks fields as deprecated in a specific version. Each deprecated field will be included in later versions but renamed to `deprecated_`. They will additionally include Rust's `#[deprecated]` attribute to indicate the deprecation in the Rust code. Conversions from earlier versions will move the value of the field to the deprecated field automatically. ```rust #[derive(Debug, Versioned)] #[versioned( version(name = "v1alpha1"), version(name = "v1beta1") )] pub struct MyCrd { #[versioned(deprecated(since = "v1beta1"))] foo: String, bar: usize, } ``` --- The resulting generated code would look similar to this: ```rust pub mod v1alpha1 { pub struct MyCrd { foo: String, bar: usize, } } pub mod v1beta1 { pub struct MyCrd { #[deprecated] deprecated_foo: String, bar: usize, } } ``` --- The `From for New` implementation will look similar to this: ```rust impl From for v1beta1::MyCrd { fn from(my_crd: v1alpha1::MyCrd) -> Self { #[allow(deprecated)] Self { deprecated_foo: my_crd.foo, bar: my_crd.bar, } } } ``` ## Full Example ```rust #[derive(Debug, Versioned)] #[versioned( version(name = "v1alpha1"), version(name = "v1beta1") )] pub struct MyCrd { #[versioned(added(since = "v1beta1"))] foo: String, #[versioned(renamed(since = "v1beta1", to = "new_bar"))] bar: usize, #[versioned(deprecated(since = "v1beta1"))] baz: bool, gau: usize, } pub mod v1alpha1 { pub struct MyCrd { bar: usize, baz: bool, gau: usize, } } pub mod v1beta1 { pub struct MyCrd { foo: Option, new_bar: usize, deprecated_baz: bool, gau: usize, } } impl From for v1beta1::MyCrd { fn from(my_crd: v1alpha1::MyCrd) -> Self { #[allow(deprecated)] Self { foo: Default::default(), new_bar: my_crd.bar, deprecated_baz: my_crd.baz, gau: my_crd.gau, } } } impl From for v1alpha1::MyCrd { fn from(my_crd: v1beta1::MyCrd) -> Self { #[allow(deprecated)] Self { bar: my_crd.new_bar, baz: my_crd.deprecated_baz, gau: my_crd.gau, } } } ```

Kubernetes API Version Crate

Initial Sketch

We want to parse raw Kubernetes (API) version strings into Rust structs. This enables us to more easily work with these versions. It enables sorting, equality checking and ordering. It additionally enables us to provide better error messages in case of invalid versioning. ```rust // (/) pub struct ApiVersion { pub group: Option, pub version: Version, } // v() pub struct Version { pub major: u64, pub level: Option, } // beta/alpha pub enum Level { Beta(u64), Alpha(u64), } ``` A PoC implementation for this already exists.

This is implemented. See https://github.com/stackabletech/operator-rs/tree/main/crates/k8s-version

Open Questions / Ideas

We may even want to limit the visibility of deprecated fields using pub(crate).
We might want to make the conversion mechanism configurable for added fields via the builder pattern. This can be done by implementing the From<Old> for New manually instead of using the automatically derived one. Users can opt out of the auto-generation by providing the #[versioned(version(name = "v1beta1", skip("from")))]

References

Selection of Crates

Research

## Tasks
- [x] Our CRDs are already very big, how do we handle multiple versions (which basically double or triple the size) in terms of space requirements?

Implementation

## Tasks
- [ ] https://github.com/stackabletech/operator-rs/pull/764
- [ ] https://github.com/stackabletech/operator-rs/pull/784
- [ ] https://github.com/stackabletech/operator-rs/pull/793
- [ ] https://github.com/stackabletech/operator-rs/pull/790
- [ ] https://github.com/stackabletech/operator-rs/pull/804
- [ ] https://github.com/stackabletech/operator-rs/pull/813
- [ ] https://github.com/stackabletech/operator-rs/pull/820
- [ ] https://github.com/stackabletech/operator-rs/pull/844
- [x] Add support for nested versioned data structures
- [ ] https://github.com/stackabletech/operator-rs/pull/847
- [ ] https://github.com/stackabletech/operator-rs/pull/857
- [ ] https://github.com/stackabletech/operator-rs/pull/849
- [ ] https://github.com/stackabletech/operator-rs/pull/850
- [ ] https://github.com/stackabletech/operator-rs/pull/865
- [x] Add support for `serde` integration
- [ ] https://github.com/stackabletech/operator-rs/pull/876

Test Rounds

2024-09-06

https://github.com/stackabletech/issues/issues/507#issuecomment-2333839551

### Fixes
- [ ] https://github.com/stackabletech/operator-rs/pull/859
- [ ] https://github.com/stackabletech/operator-rs/pull/860
- [ ] https://github.com/stackabletech/operator-rs/pull/864

2024-09-18

### Fixes
- [ ] https://github.com/stackabletech/operator-rs/pull/872
- [ ] https://github.com/stackabletech/operator-rs/pull/875
- [ ] https://github.com/stackabletech/operator-rs/pull/873
- [x] ~Handle URL replacements in CRDs (currently done)~ (Moved)
- [x] ~Utility methods for printing crd to stdout, etc... (see that trait impl)~ (Moved)

Acceptance criteria:

We know how to version our CRDs in Rust
We know how to write our conversion code between these versions
We have a PoC implementation/spike for a conversion webhook
We have a full-featured derive macro to handle CRD versioning
We have a general-purpose crate which can be used by others in the open-source Rust ecosystem

fhennig commented 9 months ago

related: this ticket (https://github.com/stackabletech/issues/issues/504) tracks a bunch of breaking changes that we want to do when we have CRD versioning in place

soenkeliebau commented 8 months ago

Love the idea expressed in the description!!!

Just one question, will there be a way to express transformation logic? For example consider this scenario (not saying this specific one makes sense, but for arguments sake..):

v1 has a field timeout_seconds and v2 deprecates this field and introduces a new field timeout_milliseconds with a default value of timeout_seconds * 1000. Can we somehow allow specifying a conversion function to be called in the derive macro (can be at a much later stage, just wanted to consider it).

Techassi commented 8 months ago

Can we somehow allow specifying a conversion function to be called in the derive macro (can be at a much later stage, just wanted to consider it).

I have some ideas for that in mind indeed. I will add a few more details on that! But I agree, this will probably be added at a later point in time.

Also see the second point in the Questions / Ideas section.

v1 has a field timeout_seconds and v2 deprecates this field and introduces a new field timeout_milliseconds with a default value of timeout_seconds * 1000.

Ideally we want to use Duration which supports human-readable duration strings. When using this type, fields are just named timeout. The field name itself doesn't carry any information about time granularity.

Techassi commented 8 months ago

I updated the ticket and added "Idea 2" which dives into the customization of the conversion process.

soenkeliebau commented 8 months ago

Disclaimer: Just to write it down somewhere, no need to respond, we can discuss next week.

How do we want to handle actual code that is currently in the crd module, haven't gone and checked, but I think there is a not insignificant amount of fn's in the crd modules that is needed for the operator to do its work.

Will the operators internally always work with the latest version and all actual code is moved to that module any time a new version is added?

adwk67 commented 8 months ago

This is a very thorough overview - great work! Two things occurred to me:

in view of

Adding required fields in CRDs is against the Kubernetes guidelines. Instead, CRD developers should use optional fields with a sane default value

will we plan to move new optional fields out into being non-optional in a subsequent update, so that foo: Option<String> can eventually become foo: String? (if it implements Default that shouldn't present a problem?)

how will we deal with versioning of fields in the operator framework? Will we use a similar approach in operator-rs?

maltesander commented 8 months ago

How do we want to handle actual code that is currently in the crd module, haven't gone and checked, but I think there is a not insignificant amount of fn's in the crd modules that is needed for the operator to do its work.

Unfortunately yes, i think we have to traitify the structs and each version of the struct will implement the trait in its own module or something.

Will the operators internally always work with the latest version and all actual code is moved to that module any time a new version is added?

I think the operators will work with the latest version, just keeping the old stuff for crd generation i guess.

how will we deal with versioning of fields in the operator framework? Will we use a similar approach in operator-rs?

Yeah everything that appears in the CRD must be versioned somehow. Not a big deal for standalone CRDs like AuthClass with their own version etc. but e.g. a change in the Role struct will result in a CRD version bump of the implementing operators.

soenkeliebau commented 8 months ago

I think the operators will work with the latest version, just keeping the old stuff for crd generation i guess.

If the operator always gets the latest version, couldn't we.in theory noop all operations in older versions, as they shouldn't ever be called?

maltesander commented 8 months ago

If the operator always gets the latest version, couldn't we.in theory noop all operations in older versions, as they shouldn't ever be called?

Actually yeah, we will only need the "implementation" for the version (whatever it will be) the operator is requesting. Then its purely kept for crd generation purposes.

stefanigel commented 8 months ago

Hi, I read a lot of details, HOW CRD Versioning can be implemented, but nod description WHAT value the feature delivers. What are the requirements? Which issues can be solved? (Ok, it versions CRDs, but it seems to be more than adding a version number ...) Probably you all know this, but others may not. How would you explain an interested user, what this feature does for her?

NickLarsenNZ commented 5 months ago

Just regarding the modules being the "version", and containing the originally named structs inside it, eg:

pub mod v1alpha1 {
    pub struct MyCrd {
        bar: usize,
        baz: bool,
        gau: usize,
    }
}

I think that should be inverted, like:

pub mod my_crd {
    pub struct V1Alpha1 {
        bar: usize,
        baz: bool,
        gau: usize,
    }
}

I can't think of good wording, but here it is raw:

A v1alpha1 of a Foo has no relation to a v1alpha1 of a Bar.
A container Foo has a set of versions. A version does not have a set of containers (eg: Foo and Bar).

Techassi commented 2 months ago

Results of a first (and quick) test round:

Version modules don't have access to imported (at the file root) items.
There are two version: v1 and v2. If a field changes the type in v2 and there is no action in v1, the inferred type in v1 is wrong.
The #[versioned()] macro needs to appear before other attributes / macros. Mention this in the docs.

Fixes for these issues are tracked in the tasklists above.

Techassi commented 1 month ago

Closing this tracking issue as the initial implementation is finished. Tracking of CRD versioning moved to #642.

stackabletech / issues