kube-rs / kopium

Kubernetes OPenapI UnMangler
Apache License 2.0
113 stars 21 forks source link

Optional array properties should be `Option<Vec<T>>`, not `Vec<T>` with `skip_serializing_if = "Vec::is_empty"` #40

Closed alex-hunt-materialize closed 2 years ago

alex-hunt-materialize commented 2 years ago

Currently, optional array properties are required in the generated struct, but skipped if serializing. This is incorrect behavior, since an empty Vec is not the same as not supplying a Vec.

This breaks server-side apply-ing new resources to remove things from the Vec, as well as cases where a property is required in a one-of, but you actually want to send the empty Vec as one of those properties.

We should instead make it an Option<Vec<T>> with #[serde(default, skip_serializing_if = "Option::is_none")].

Example:

              podSelector:
                description: Selects pods in the same namespace.
                oneOf:
                - required:
                  - matchExpressions
                - required:
                  - matchLabels
                properties:
                  matchExpressions:
                    items:
                      properties:
                        key:
                          type: string
                        operator:
                          enum:
                          - In
                          - NotIn
                          - Exists
                          - DoesNotExist
                          type: string
                        values:
                          items:
                            type: string
                          type: array
                      required:
                      - key
                      - operator
                      type: object
                    type: array
                  matchLabels:
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                type: object

This generates:

/// Selects pods in the same namespace.
#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelector {
    #[serde(default, skip_serializing_if = "Vec::is_empty")]
    pub matchExpressions: Vec<ServerPodSelectorMatchExpressions>,
    pub matchLabels: Option<ServerPodSelectorMatchLabels>,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelectorMatchExpressions {
    pub key: String,
    pub operator: String,
    #[serde(default, skip_serializing_if = "Vec::is_empty")]
    pub values: Vec<String>,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelectorMatchLabels {
}

Complete CRD https://gist.github.com/alex-hunt-materialize/2743b1e2e58a49c4df0a11ecb39f46ab

What we should get:

/// Selects pods in the same namespace.
#[derive(Serialize, Deserialize, Clone, Debug, Default)]
pub struct ServerPodSelector {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub matchExpressions: Option<Vec<ServerPodSelectorMatchExpressions>>,
    pub matchLabels: Option<ServerPodSelectorMatchLabels>,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelectorMatchExpressions {
    pub key: String,
    pub operator: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub values: Option<Vec<String>>,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelectorMatchLabels {
}
clux commented 2 years ago

Thanks a lot for all of these. Great bugs.

This is probably the most annoying one of the bunch. I must have hoped we could rely on some kind of omitempty equivalent to look for here to avoid option wrapping :(

I will be a bit busy to look at these atm, but can help guide if you have time to do a PR. If not, dw, thanks for raising these, they will eventually get fixed.

alex-hunt-materialize commented 2 years ago

Wow, thanks for the quick response and for writing this awesome tool! Even though I have to hand-edit the generated code, it's still much easier than starting from scratch.

I unfortunately don't see any way around wrapping with Option, as sending the empty object is required for correctness. In my example, I have to have one of those properties, and even if matchLabels had been handled as a BTreeMap<String, String>, that would still have this same skipped serializing behavior.

I'm going to take a look, but I've not done much of this kind of code gen before, so no idea how far I'll get. I guess this is just removing the first two cases of https://github.com/kube-rs/kopium/blob/2e8bef3b88040b356714fc15173b1b477816df4e/src/analyzer.rs#L215-L238 ? I think we can also add #[serde(default, skip_serializing_if = "Option::is_none")] to the third case, which should prevent us from taking ownership of the fields when doing server-side apply.

alex-hunt-materialize commented 2 years ago

That... actually seems to work!

use kube::CustomResource;
use serde::{Serialize, Deserialize};

#[derive(CustomResource, Serialize, Deserialize, Clone, Debug, Default)]
#[kube(group = "policy.linkerd.io", version = "v1beta1", kind = "Server", plural = "servers")]
#[kube(namespaced)]
#[kube(schema = "disabled")]
pub struct ServerSpec {
    /// Selects pods in the same namespace.
    pub podSelector: ServerPodSelector,
    /// A port name or number. Must exist in a pod spec.
    pub port: String,
    /// Configures protocol discovery for inbound connections.
    /// Supersedes the `config.linkerd.io/opaque-ports` annotation.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub proxyProtocol: Option<String>,
}

/// Selects pods in the same namespace.
#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelector {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub matchExpressions: Option<Vec<ServerPodSelectorMatchExpressions>>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub matchLabels: Option<ServerPodSelectorMatchLabels>,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelectorMatchExpressions {
    pub key: String,
    pub operator: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub values: Option<Vec<String>>,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ServerPodSelectorMatchLabels {
}

I'm not sure how to run the tests, though, as I don't have that prometheusrules.monitoring.coreos.com CRD in my cluster. Do you happen to have a link to the json or yaml for that CRD?

alex-hunt-materialize commented 2 years ago

https://github.com/kube-rs/kopium/pull/41

~I've got it as a draft, since the README.md should be updated and I should run the tests, but I don't have that CRD to do so.~ Found it, it's in the prometheus-community/kube-prometheus-stack helm chart.

clux commented 2 years ago

Thanks again. Forgot we had already special cased logic to remove the optionalisation there, so that made this an easy change :-)

Btw for tests we also provide better commands for running tests (and grabbing crds) in the justfile.

clux commented 2 years ago

One point here that's worth re-considering here actually; Not being able to use server-side apply without options is only an apply-params concern; https://github.com/kube-rs/kube-rs/issues/649 .

I.e. it's actually fine to not option wrap the types if users don't use need to use apply on the structs:

There is some value in forcing people to use mutating operations with small subset bodies; thinking about merge patches, or using json! is likely to lead to less confusing serialization behaviour. Particularly as this stuff happens with k8s-openapi structs (there are fields you cannot avoid serializing like replica counts which are non-optional ints). The non-optional values are why the go world made a whole parallel struct setup to deal with this api wart.

But on the other hand it's also annoying that we cannot use the full structs with apply in more general settings.

I'm wondering if it's worth adding a flag here to get the non-option wrapped behaviour back. Until we have full coverage for apply-configurations in rust, I'm not sure I want to fully advocate for the "always use apply" without some caveats.

alex-hunt-materialize commented 2 years ago

I expect the primary use case for these generated types is in custom controllers which run unattended. In almost all those cases, I want server-side apply. After having written three operators, I have not found a single case where I preferred json patches or merge patches over server-side apply. In fact, I started out trying both of those first, and hit many unexpected behaviors.

If the concern is that it's onerous for the user to have to wrap things in Some(), perhaps generating builder pattern methods would alleviate the pain.

users might can still use merge patches or json patches for unsetting things

This introduces additional logic to decide which patch type to use, and for generating these less-safe patch types, since in the majority of cases, the user wants server-side apply.

users can still use apply with data as serde_json::json!({"spec": { "thevec": [] } }) to null out empty vectors

Does this actually delete the vector? I would expect it to set it to an empty vector (ie: empty the vector, but it still be there).

clux commented 2 years ago

I would expect it to set it to an empty vector (ie: empty the vector, but it still be there)

Yeah, it would only clear it. You would need to do serde_json::json!({"spec": { "thevec": None } } to force serialize a null to delete it. The problem with option-less structs is that you lose the ability to do that.

This introduces additional logic to decide which patch type to use, and for generating these less-safe patch types, since in the majority of cases, the user wants server-side apply.

Yeah, the patches are more cumbersome, but they can do things that requires more finesse than an apply can without a GET, eval, UPDATE cycle (such as deleting a specific element from a map).

I think you are right though. The optionals aren't that cumbersome, we aren't preventing any smart patch uses cases by generating them, and controllers are the main use case. Was considering adding a kopium flag to not option-wrap for non-controllers, but I think that's probably a mistake upon reflection; it'll make it more likely people run into the serialization problems if they do start to need it, and that's a much worse UX.