stackabletech / documentation

Stackable's central documentation repository built on Antora
https://docs.stackable.tech
Apache License 2.0

ADR: CRD Versioning #450

Open fhennig opened 10 months ago

fhennig commented 10 months ago

Below are some preliminary thoughts on CRD versioning, mostly taken from the on-site meeting on this topic. For an ADR, my suggestion is to gather requirements about how it should work from this Kubernetes docs page: Versions in CustomResourceDefinitions - Kubernetes Documentation. It looks like a comprehensive guide on how versioning works.

Conversion webhooks & up- and downgrading

Mutating webhooks are the core mechanism required for versioning CRDs. If an old resource is applied by the user, the webhook will convert it into the current version. Likewise, if the user requests an older version (arbitrary versions can be requested), the webhook is also used for conversion. (Note from Felix: Does that mean that two-way conversion absolutely needs to be implemented?)

When you read an object, you specify the version as part of the path. You can request an object at any version that is currently served. If you specify a version that is different from the object's stored version, Kubernetes returns the object to you at the version you requested, but the stored object is not changed on disk.
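To make "the version as part of the path" concrete, here is a minimal sketch of how the REST path for a namespaced custom resource is put together. The group, plural and object names in the usage note are made up for illustration; only the general `/apis/<group>/<version>/namespaces/<namespace>/<plural>/<name>` layout comes from Kubernetes itself.

```rust
/// Builds the API path for a namespaced custom resource at a given version.
/// All arguments are caller-supplied; nothing here is specific to Stackable.
pub fn resource_path(
    group: &str,
    version: &str,
    namespace: &str,
    plural: &str,
    name: &str,
) -> String {
    format!(
        "/apis/{}/{}/namespaces/{}/{}/{}",
        group, version, namespace, plural, name
    )
}
```

Calling `resource_path("example.com", "v1", "default", "widgets", "my-widget")` and then the same with `"v2"` yields two paths for the same stored object; the API server converts on the fly to whichever version is asked for.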

We cannot remove (mandatory) fields, because their content will be required when downgrading. This means we have to rename the fields (e.g. deprecated_oldField).
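A rough sketch of what the rename could look like on the Rust side, assuming two hand-written structs per version. `SpecV1`, `SpecV2` and their fields are hypothetical names for illustration, not our actual CRDs.

```rust
/// Hypothetical v1 spec: `old_field` is a regular, mandatory field.
#[derive(Clone, Debug, PartialEq)]
pub struct SpecV1 {
    pub name: String,
    pub old_field: String,
}

/// Hypothetical v2 spec: the field is no longer part of the "real" API,
/// but its value is kept under a deprecated name so v1 can be rebuilt.
#[derive(Clone, Debug, PartialEq)]
pub struct SpecV2 {
    pub name: String,
    pub deprecated_old_field: String,
}

impl From<SpecV1> for SpecV2 {
    fn from(v1: SpecV1) -> Self {
        SpecV2 {
            name: v1.name,
            // Carry the value along instead of dropping it.
            deprecated_old_field: v1.old_field,
        }
    }
}

impl From<SpecV2> for SpecV1 {
    fn from(v2: SpecV2) -> Self {
        SpecV1 {
            name: v2.name,
            // The mandatory v1 field can be restored from the deprecated one.
            old_field: v2.deprecated_old_field,
        }
    }
}
```

With this shape, converting a `SpecV1` to `SpecV2` and back gives the original value, which is the property the downgrade path depends on.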

Only support upgrades for now (no downgrading). Do not skip releases: always upgrade one version at a time (v1 -> v2 -> v3, not v1 -> v3).
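As an illustration of the "one step at a time" rule, here is a small sketch that expands an upgrade into its single-version steps and rejects downgrades. Representing versions as plain integers and the function name `upgrade_steps` are assumptions made for this example only.

```rust
/// Returns the single-step upgrades needed to get from `from` to `to`,
/// e.g. 1 -> 3 becomes [(1, 2), (2, 3)].
/// Returns `None` for downgrades, which this sketch does not support.
pub fn upgrade_steps(from: u32, to: u32) -> Option<Vec<(u32, u32)>> {
    if from > to {
        return None;
    }
    // Each step moves exactly one version up; `from == to` yields no steps.
    Some((from..to).map(|v| (v, v + 1)).collect())
}
```

An upgrade from v1 to v3 then means running the v1 -> v2 step followed by the v2 -> v3 step, never a direct v1 -> v3 jump.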

CRD size

CRD size is a problem: etcd and the kube API both have limits on how large objects are allowed to be.

Copy & paste: to have multiple versions in our CRD, we need to keep the old Rust struct around. This means copy-pasting the Rust struct for each version. Not ideal ...

Stabilize CRDs first to reduce conversion efforts? Would be nice because it saves a lot of work. But CRD versioning is important now.

ADR thoughts

nightkr commented 10 months ago

Mutating webhooks are the core mechanism required for versioning CRDs.

Conversion webhooks are technically a distinct thing, but yes the idea still applies.

If an old resource is applied by the user, the webhook will convert it into the current version.

It's also used if the user requests a different version than what is currently stored in etcd.

We cannot remove (mandatory) fields, because their content will be required when downgrading. This means we have to rename the fields (e.g. deprecated_oldField).

Not just mandatory fields: all fields (that are still used) must survive arbitrary round-tripping. For example: the object is currently v2, the user GETs v1 (conversion v2->v1), changes something, then REPLACEs it (conversion v1->v2). No data must be lost here.

Depending on the use case, we could also inject bogus placeholders when downgrading.
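A minimal sketch of the placeholder idea, assuming a field that v1 requires but v2 no longer uses at all. The types, the field and the placeholder value are all hypothetical.

```rust
/// Hypothetical v1 spec: `legacy_port` is mandatory here.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterV1 {
    pub name: String,
    pub legacy_port: u16,
}

/// Hypothetical v2 spec: the port is no longer used and was dropped.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterV2 {
    pub name: String,
}

/// Bogus value injected when a v2 object is served as v1 (assumed value).
pub const PLACEHOLDER_PORT: u16 = 0;

/// v2 -> v1: fill the mandatory v1 field with the placeholder.
pub fn downgrade(v2: &ClusterV2) -> ClusterV1 {
    ClusterV1 {
        name: v2.name.clone(),
        legacy_port: PLACEHOLDER_PORT,
    }
}

/// v1 -> v2: the port is ignored because v2 has no use for it.
pub fn upgrade(v1: &ClusterV1) -> ClusterV2 {
    ClusterV2 {
        name: v1.name.clone(),
    }
}
```

This keeps the v2 -> v1 -> v2 round trip lossless, but only because the dropped field is genuinely unused in v2. A real port set on a v1 object would not survive v1 -> v2 -> v1, so this does not work for fields that are still in use.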

CRD size - it's a problem ... why exactly?

etcd and the kube API both have limits on how large objects are allowed to be.

Only support upgrades for now (no downgrading) Do not skip releases - always upgrade only one version up (v1 -> v2 -> v3 not v1 -> v3)

AFAIK this decision was made about the operators themselves, not the API versioning?

fhennig commented 10 months ago

Thanks for the comments, I updated the ticket