Open danrspencer opened 1 year ago
First, is there any resource type for which generateName
cannot be used? ie do we need the first two versions?
Second, the third and fourth versions still have name: Option<String>
and thus still has the same need for .unwrap()
ping as now. It seems to me that most common resource types support generateName
and thus are in these categories.
The real problem is that creating a resource might have null
for name
and reading a resource will never have a null
name, but the same spec type is used for both operations and thus the same ObjectMeta
type needs to support both possibilities.
For example, a more accurate representation would be to replace ObjectMeta::name
and ObjectMeta::generated_name
with a ObjectMeta::name
that is an enum:
enum ObjectMetaName {
Name { inner: String },
Generated { prefix: String },
}
struct ObjectMeta {
name: ObjectMetaName,
...
}
But this still has the issue as name: Option<String>
, namely the typesystem would require you to handle the case where the name
is ::Generated
when you read a resource, even though you know it can only be ::Name
.
Instead of an enum we could have separate types determined by a type parameter:
enum ObjectMetaNameCreate {
Name { inner: String },
Generated { prefix: String },
}
struct ObjectMetaNameRead {
inner: String,
}
struct PodSpec<N> {
metadata: ObjectMeta<N>,
}
struct ObjectMeta<N> {
name: N,
}
Then Pod::read
would only be implemented on Pod<ObjectMetaNameRead>
and Pod::create
would be implemented on both Pod<ObjectMetaNameCreate>
and Pod<ObjectMetaNameRead>
. But the downsides of this approach are that every resource type now needs to be parameterized by N
, which is also infectious to callers like kube
that want to work with resources generically.
I feel splitting ObjectMeta
based on what the "model should be" is going to be a dangerous venture. Sure, namespaces on cluster scoped resources sounds illogical, but there's also nothing stopping the go api from doing that internally based on how they have made ObjectMeta
internally (with all +optional
marked fields) - and when we have made assumptions like this in the past it has not gone well (e.g. .metadata
is generally set for all objects everywhere, but we had to stop assuming that for all endpoints when some (a subresource endpoint afaicr) decided not to pass it). Having variants would also make the ergonomics for kube
a lot worse (as pointed out).
On the kube
side of things, we have a trait to help avoid these unwraps; you can use ResourceExt::name_unchecked
if you know the name is always going to be there (like all returns from the apiserver except admission controllers where generateName can be issued), or if you just want to print no matter where the value came from (creation or response) we have ResourceExt::name_any
. I'd be happy to help think of some better analogues of that for namespace
getters as well if that is helpful (we can maybe take the new Scope
trait into account).
Some background maybe; we're currently having a debate in our team because I want to add:
pub trait NamespaceUnchecked {
fn namespace_unchecked(&self) -> String;
}
impl<K: kube::Resource<Scope = k8s_openapi::NamespaceResourceScope>> NamespaceUnchecked for K {
fn namespace_unchecked(&self) -> String {
#[allow(clippy::expect_used)]
self.meta()
.namespace
.clone()
.expect(".metadata.namespace missing")
}
}
While other team members want to go the other way and remove all references to name_unchecked
and replace it with option unwrapping and conversion into a Result
.
Obviously I'm largely on the side of "just use the unchecked stuff, it will never error"; but it is slightly nasty to have code that potentially could (it won't, be it technically could) throw an exception littered around the code base. It feels like it sort of breaks that contract of safety that Rust usually offers.
It looks like name_any
may be a better fit than name_unchecked
throughout our codebase, but we still have the similar problem of namespace_unchecked
. 🤔
It feels like it sort of breaks that contract of safety that Rust usually offers.
yeah, for sure this is awkward.
We could maybe have a Namespace
newtype somewhere with a TryFrom
impl from K: Resource<Scope...>
(with an explicit error type that indicates that this is probably an apiserver failure when it's from a remote object) and then you can just put a ?
in there in the hopes of catching an apiserver bug in the future.
Ideally I'd prefer not to have to handle a result once I have an object from the API, I know it has a name
and (if applicable) and namespace
. Having to handle a case that will never happen, but the type system says can happen, definitely adds a complexity tax to the entire codebase.
As @Arnavion states the real problem is that the types you put into Kubernetes is subtly different to the type you get out of Kubernetes ... so maybe they really are just different types? As an alternative to making every type take a generic, would it be horrible to simply have two versions of every type?
e.g.
// Not convinced by the name `create` maybe `pending` or `pre`?
mod create {
struct Pod {
metadata: ObjectMetaCreate,
...
}
mod read {
struct Pod {
metadata: ObjectMetaRead,
...
}
}
impl TryFrom<crate::Pod> for read::Pod {
...
}
This adds a complexity to code generation, and to importing, but then accurately represents the type model from Kubernetes. This could also potentially not export both types by default and hide the create::
structs behind a feature flag. I imagine (though I could be completely wrong 😅) the use case of generated_name
is far rarer than just name
and most people would be happy to just have name
be mandatory.
Or on the idea of an enum type for name
that better represents it, this feels very like an EitherOrBoth type. It can have a name
, generated_name
or both; but not nothing. name()
would then always be able to return a valid value.
I must admit on the namespace
side I'm not fully following the motivation to keep that optional on namespaced resources; especially as these resources now have the Scope
so there's already some handling of namespaced vs cluster going on?
motivation to keep that optional on namespaced resources
well, it's really just apiserver bugs you are catering for. you can unwrap, or you can propagate the error to whatever upstream thing you have via ?
(if that's a reconciler, then your reconcile will be failing a lot, and you'll have to catch it via metrics, but in theory the controller will keep running - and is probably better than a panic for some use-cases (EDIT: while not really polluting user code much)).
feels very like an EitherOrBoth type. It can have a name, generated_name or both
yes, it does feel like that, but we have no guarantee on what upstream wants to evolve their naming conveniences to, and it's architecting ourselves into a corner by assuming invariants that upstream is not beholden to (and if you've listened in on sig apimachinery they do still propose generic metadata changes). generate_name
is also not that uncommon in my experience.
the real problem is that the types you put into Kubernetes is subtly different to the type you get out of Kubernetes
yeah, you are kind of describing ApplyConfigurations (variants where everything is optional so that it's easy to construct and use with server-side apply. See https://github.com/kube-rs/kube/issues/649 for some context and links. TL;DR: it makes it easier to use created structs, but it does not make it more ergonomic to deal with optionals from the apiserver. Their schema is ultimately as open as they say it is.
For namespace
, I spent some time splitting ObjectMeta
into ObjectMeta { namespace: String }
, ClusterObjectMeta { /* no namespace field */ }
and SubObjectMeta { namespace: Option<String> }
. The partial attempt is in https://github.com/Arnavion/k8s-openapi/compare/fix-134
One problem is that determining the scope of the resource happens while emitting its operations using the operation URLs, but the field needs to be known before-hand when the struct definition is emitted, or ideally even before that at spec fixup time so that the generator itself doesn't need to change the metadata type based on the scope. I'm still trying to think of a nice way to write the code that doesn't require duplicating the logic for determining the scope.
Having
.metadata.name
and.metadata.namespace
as optional causes us headaches all over our code. We either have to decideunwrap
because we know the value will be there, or implementResult
for the case of a missing value that will never be missing.In reality a Kubernetes object without a name (baring the edge case of asking Kubernetes to generate the name) is invalid; likewise a namespace scoped Kubernetes object without a namespace is invalid. As mentioned above this means unnecessary error handle, and allows the creation of resources without a
name
and/ornamespace
which appear to be perfectly valid until you attempt to submit them to Kubernetes.I propose that there should be four versions of
ObjectMeta
to cover the valid cases and simplify the 99% usage cases.For namespaced resources:
For cluster resources:
For namespaced resources which can take
generate_name
Alternatively each resource could generate its own version of
ObjectMeta
which is valid and correct for that resource.