Arnavion / k8s-openapi

Rust definitions of the resource types in the Kubernetes client API
Apache License 2.0
379 stars 41 forks source link

Relax the requirement of the `Resource` trait implementation #81

Closed MOZGIII closed 3 years ago

MOZGIII commented 3 years ago

A few words about the use case first: we are building a kubernetes_metadata transform at Vector, and we realized we want to support operation on arbitrary Kubernetes resources.

We want to build an AnyResource type, that would fit in with the rest of our system to interop with Kubernetes, which is currently built using k8s-openapi crate. This type would take an arbitrary serde_json::Value::Object, and implement some functionality we need to help up wire it through our reflector.

One of the issues we've encountered is that the k8s_openapi::WatchResponse requires the wrapped type to implement a k8s_openapi::Resource, which, in turn, requires us to know the kind of resource we're working with at compile time - as it has a number of const values in it.

Our goal is to allow operating on any resource - even the ones we are not aware of at the compile time. This will allow our users to configure Vector to work with things that potentially only exist on their cluster, like custom resources and etc.


I've thought of a couple of solutions to this problem:

  1. Remove the requirement to implement k8s_openapi::Resource from the type that k8s_openapi::WatchResponse and k8s_openapi::apimachinery::pkg::apis::meta::v1::WatchEvent operate on. This seems like a fairly trivial thing to do - nowhere in the implementation the consts are actually used.

  2. Remove the consts from the k8s_openapi::Resource, allowing for it to be implemented by something like our AnyResource. If this path is picked, another trait (let's call it k8s_openapi::ResourceMetadata for now) can be added. It would contain the consts that previously the k8s_openapi::Resource previously contained, and an implementation would be generated for the types just like with k8s_openapi::Resource right now. This new trait won't be added as a required type anywhere where it's not necessary for the underlying code. It is possible to also require trait ResourceMetadata: Resource (ResourceMetadata implementor to also implement Resource), but it doesn't seem like a really necessary thing - so it's a kind of an open question.

Technically, both solutions can even be implemented at once, as each of them gives some extra flexibility for the cases like we're facing. The (2) changes the semantics of the k8s_openapi::Resource into a kind of a market-trait that simply helps to guide the users on what types they're supposed to use where. We could benefit from using this notion in our code too. The (1) solution simply eliminates the requirements, allowing users to pass anything anywhere. This allows for maximum flexibility.

clux commented 3 years ago

Is 2 even possible? You can only have associated constants in a trait, not dynamic data members. AnyResource seems unimplementable.

Note that kube gets around this setup with DynamicResource, and having two ways to create its wrapper Resource, one from the associated consts, another via dynamic data.

MOZGIII commented 3 years ago

In the (2), we won't be implementing k8s_openapi::ResourceMetadata for AnyResource, but we will be implementing k8s_openapi::Resource - so no consts are necessary.

We aren't really using the consts from k8s_openapi::Resource yet really, so we have no need for it to be a type.

We just want the k8s_openapi::WatchResponse<AnyResource> to work.

Arnavion commented 3 years ago
  1. Remove the requirement to implement k8s_openapi::Resource from the type that k8s_openapi::WatchResponse and k8s_openapi::apimachinery::pkg::apis::meta::v1::WatchEvent operate on. This seems like a fairly trivial thing to do - nowhere in the implementation the consts are actually used.

Yes they are.

MOZGIII commented 3 years ago
  1. Remove the requirement to implement k8s_openapi::Resource from the type that k8s_openapi::WatchResponse and k8s_openapi::apimachinery::pkg::apis::meta::v1::WatchEvent operate on. This seems like a fairly trivial thing to do - nowhere in the implementation the consts are actually used.

Yes they are.

Oh, I must've missed it. I've looked only along the code path I was about to use.

By the way, to give a more concrete idea, here's what I've implemented so far: https://github.com/timberio/vector/blob/8ea0cef30cefa2961acad05be43be9809ac9a9ac/src/kubernetes/any_resource.rs

Arnavion commented 3 years ago

https://github.com/Arnavion/k8s-openapi/commit/f506c01134b5280faaa1d003c7445178df7fa36e

MOZGIII commented 3 years ago

f506c01

Ok, that makes sense!

I wonder why this validation is necessary though. Are there cases when the data sent to us from the API does not match the requested resource URL?

In our case, we'll also have a user-configurable request builder, from which we can take the resource API group, version and plural kind. If the Resource trait had getters rather than consts - I think I could find a way to pass the parameters from the request builder to the resource through to the AnyResource somehow.

So, without knowing the answer to the question above, I've figured the following solutions:

  1. Switch Resource trait from using consts to something else.
  2. Do not validate the bookmarks at all, remove the Resource trait requirement.
  3. Introduce another trait specific to bookmark validation, allow a custom implementation of it, and swap the Resource trait requirement with that trait requirement.
  4. Allow generalizing the bookmark type itself - similar to (3), but on a different level, but should work just as good.
Arnavion commented 3 years ago

It's to prevent accidentally parsing a response with the wrong Rust type.

The golang client solves this with its typemapper stuff, and other openapi-generated clients don't solve it at all and just have api_version and group fields and leave it to you to figure out it's either the wrong type. Of course it's unlikely to have a type mismatch with those clients in the first place, because they abstract the HTTP operation.

But with this library the request function and response type are only loosely-related, because of the sans-io approach; the request function suggests the response type via the ResponseBody return, but it's not mandatory to use that particular ResponseBody, or even any ResponseBody at all (such as if you have your own buffers). If the API version and group weren't checked, parsing with the wrong Rust type would usually succeed, because so many resource types have all their fields as optional.

Arnavion commented 3 years ago

In our case, we'll also have a user-configurable request builder, from which we can take the resource API group, version and plural kind. If the Resource trait had getters rather than consts - I think I could find a way to pass the parameters from the request builder to the resource through to the AnyResource somehow.

[...]

  1. Switch Resource trait from using consts to something else.

I don't understand your proposal. Making them non-const fns would amount to making them fn() -> &'static str which is still fine, but what you seem to want is fn(&'a self) -> &'a str (explicit lifetimes for clarity) because the api_version and group are being read from your AnyResource's fields before anything is deserialized, ie your AnyResource is acting like a DeserializeSeed.

Please start with this template:

// k8s-openapi

struct Pod {
    foo: i32,
}

enum WatchEvent<T> {
    Added(T),
    Bookmark { resource_version: String },
}

// ---

// Your crate

struct AnyResource {
    api_version: String,
    group: String,
    #[serde(flatten)]
    data: BTreeMap<String, serde_json::Value>,
}

and fill out how the three serde::Deserialize impls will look like with your proposal. Note that, just like the code is today, Pod does not have api_version and group fields of its own because they are evident from the type, and its Deserialize impl must validate them.

MOZGIII commented 3 years ago

Ideally I wouldn’t want to validate the group or kind of AnyResource at all, and blindly trust that whatever arrived is a resource of some kind. The whole point is removing the knowledge of what particular resource kind is used from the compile time - so it’s entirely driven by runtime (i.e. user) configuration. We know that what all resources have in common is that they are resresented with JSON, and that we permit users to fetch arbitrary fields from them. This is how we plan to use AnyResource - to simply provide an unified interface for a resource of any kind without any kind of compile-time specialization.

I’ll fill in the templates as requested tomorrow (or try at least), just wanted to make my use case clearer in the meantime.

I can tell already though that if I were to validate the version/group in the bookmarks, I’d have to wire the data differently. This might also dictate the differences it traits.

This is a good place to start!

MOZGIII commented 3 years ago

How about something like this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b8e4b64fa82e2b99b2c808b770b09f78 ?

The plan is to add the following to the k8s-openapi crate:

/// A trait that captures the functionality of a bookmark resource.
trait BookmarkResource: Sized {
    fn resource_version(&self) -> &str;
    fn into_resource_version(self) -> String;
}

/// A trait that binds an associated bookmark resource for any type it's
/// implemented for.
trait Bookmarkable {
    type BookmarkType: BookmarkResource;
}

/// A type that implements a `BookmarkResource` trait that will ensure that
/// the `T`'s `Resource` `API_VERSION` and `API_GROUP` match the provided data
/// at deserialization.
///
/// Since this type provides validation, we do need the `T: Resource` here.
struct CheckedResourceBookmark<T: Resource> {
    resource_type: std::marker::PhantomData<T>,
    resource_version: String,
}

impl<T: Resource> BookmarkResource for CheckedResourceBookmark<T> {
    fn resource_version(&self) -> &str {
        &self.resource_version
    }

    fn into_resource_version(self) -> String {
        self.resource_version
    }
}

impl<'de, T: Resource> Deserialize<'de> for CheckedResourceBookmark<T> {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>
    {
        // This works as currently implemented - validates the `API_VERSION`
        // and `API_GROUP`, and etc.
        let value: BookmarkObject<'static, T> = serde::Deserialize::deserialize(deserializer)?;
        let resource_version = value.metadata.resource_version.into_owned();
        Ok(Self {
            resource_type: std::marker::PhantomData,
            resource_version,
        })
    }
}

// Make the `WatchEvent` take any bookmarkable type.
enum WatchEvent<T: Bookmarkable> {
    Added(T),
    Bookmark(<T as Bookmarkable>::BookmarkType),
}

In our crate:

struct AnyResource {
    #[serde(flatten)]
    data: serde_json::Map<String, serde_json::Value>,
}

struct AnyResourceBookmark {
    resource_version: String,
}

impl BookmarkResource for AnyResourceBookmark {
    fn resource_version(&self) -> &str {
        &self.resource_version
    }

    fn into_resource_version(self) -> String {
        self.resource_version
    }
}

impl<'de> Deserialize<'de> for AnyResourceBookmark {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>
    {
        // Do not check the API_VERSION in here.
        // We'll implement it better, but this will do for now.
        let value: serde_json::Value = serde::Deserialize::deserialize(deserializer)?;
        let resource_version = value.get("metadata").and_then(|val| val.get("resource_version")).and_then(|val| val.as_str()).expect("oops").to_owned();
        Ok(Self { resource_version })
    }
}

impl Bookmarkable for AnyResource {
    type BookmarkType = AnyResourceBookmark;
}

With this, we'll gain the ability to pass arbitrary resources through the deserialization mechanism, while not losing any functionality that k8s-openpi provides for checking the bookmark resources.

The Deserialize impl for WatchEvent is omitted, but I think it's pretty obvious how that would work - we just delegate to the respective types that perform the deserialization.

So, more specifically, we'd need some trait bounds for Deserialize of WatchEvent:

impl<'de, T> Deserialize<'de> for WatchEvent<T>
where
    T:  Bookmarkable + serde::Deserialize<'de>,
    <T as Bookmarkable>:: BookmarkType: serde::Deserialize<'de>
{
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>
    {
        // ...
    }
}
Arnavion commented 3 years ago

Your playground appears to have left Resource unchanged and only allowed the API version + group validation of WatchEvent::Bookmark to be weakened. Was that your intention? Earlier you were saying you wanted to change the Resource trait, so I'm confused.

If all you really wanted was the validation of WatchEvent::Bookmark to be weakened and nothing else, for that I'm okay with removing the validation altogether. That is, Bookmark can just look for .meta.resourceVersion and nothing else.

MOZGIII commented 3 years ago

Your playground appears to have left Resource unchanged and only allowed the API version + group validation of WatchEvent::Bookmark to be weakened. Was that your intention? Earlier you were saying you wanted to change the Resource trait, so I'm confused.

All I want is for k8s_openapi::WatchResponse and k8s_openapi::apimachinery::pkg::apis::meta::v1::WatchEvent to not require a type with version-specific consts associated with it - so we can use our custom type like AnyResource there that doesn't imply that the API version of the resource are known at compile time.

So, I've figured a way to abstract the WatchEvent from direct notion of the Resource - and allow other type, not implementing the Resource to be used with it.

Resource trait itself does not have to changed with the approach I propose at the playground. It merely adds another indirection layer, specifically for the bookmark type (as it has to be special).

If all you really wanted was the validation of WatchEvent::Bookmark to be weakened and nothing else, for that I'm okay with removing the validation altogether. That is, Bookmark can just look for .meta.resourceVersion and nothing else.

That would work for us too! This way the Resource won't be required anymore, and we'll be able to drop it from the WatchEvent, and thus, we'll be able to pass our AnyResource type there.

Arnavion commented 3 years ago

Try https://github.com/Arnavion/k8s-openapi/compare/fix-81

MOZGIII commented 3 years ago

Yep, this seems to work. We don't have a full implementation/tests yet, but the type checks pass now.

MOZGIII commented 3 years ago

Sorry I didn't discover this earlier, but I've encountered some more issues:

https://github.com/Arnavion/k8s-openapi/blob/9cb4e570a3552faf503d64249ab0791c9251dfce/src/v1_19/watch_response.rs#L6 https://github.com/Arnavion/k8s-openapi/blob/9cb4e570a3552faf503d64249ab0791c9251dfce/src/v1_19/watch_response.rs#L12

These do require Resource still.

MOZGIII commented 3 years ago

Submitted a PR with a fix: https://github.com/Arnavion/k8s-openapi/pull/82