Closed kazk closed 3 years ago
The "Files changed" tab lags both Firefox and Chromium to death, so this is going to be a great PR to review :D
I know this has been blocked on me for some time, but I've been busy and will probably be busy for another two weeks due to IRL stuff. Sorry.
Okay, sorry for the delay. I should have time more consistently now.
I'll continue the discussion from https://github.com/Arnavion/k8s-openapi/pull/104#discussion_r659357386 here so that it's easier to find.
You said:
Methods used in derived
JsonSchema
to help are private too. Maybe I'm missing something, though. I haven't manually implementedJsonSchema
for complex types. We'll need to generate code to buildSchemaObject
fromswagger20::Schema
. For each$ref
, we'll need to do<T as schemars::JsonSchema>::add_schema_as_property
, but this is not part of the public API.
But AFAICT this isn't a problem. JsonSchema::json_schema
needs to return a Schema
, specifically a Schema::Object
which contains a SchemaObject
, for which all the fields we would set are public.
And specifically, for $ref
s you'd create schemars::schema::SchemaObject { reference: Some("io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta".to_owned()), ..Default::default() }
Anyway, going through your change, I remembered why I suggested implementing schemars::JsonSchema
instead of returning a serde_json::Value
:
Each of these is an infinite recursion. Currently this is only a problem for JsonSchemaProps
(which is why there's also a circular-reference workaround for this type in another place), but in the future there might be more.
Impling schemars::JsonSchema
allows us to use gen.subschema_for
for the field types, which solves the circular reference problem.
If we want something generic that isn't tied to schemars
but nevertheless leaves refs unflattened to allow recursive types, we can have our custom traits to have an agnostic equivalent of subschema_for
:
trait SchemaGenerator {
// Unlike schemars::gen::SchemaGenerator::subschema_for, this doesn't need to return a value.
fn visit_subschema<T: Schema>(&mut self);
}
impl Schema for Pod {
fn schema(gen: &mut SchemaGenerator) -> (&'static str, serde_json::Value, BTreeMap<&'static str, fn() -> serde_json::Value>) {
let schema_name = "io.k8s.api.core.v1.Pod";
gen.visit_subschema::<crate::apimachinery::pkg::apis::meta::v1::ObjectMeta>();
let schema = serde_json! {{
"properties": {
"metadata": { "$ref": "io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta" }
},
}};
}
}
But even with this, the impl of k8s_openapi::SchemaGenerator
for schemars::gen::SchemaGenerator
has to come from k8s_openapi
itself (because of orphan rules), so k8s_openapi
will end up having a dep on schemars
anyway.
Okay, sorry for the delay. I should have time more consistently now.
No problem, thanks for reviewing.
JsonSchema::json_schema
needs to return aSchema
, specifically aSchema::Object
which contains aSchemaObject
, for which all the fields we would set are public.And specifically, for
$ref
s you'd createschemars::schema::SchemaObject { reference: Some("io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta".to_owned()), ..Default::default() }
Can that SchemaObject
resolve those $ref
s? I didn't think so, but haven't looked into it much. My comment was based on looking at the output of #[derive(JsonSchema)]
for structs with JsonSchema
impled fields ($ref
s), and finding uses of #[doc(hidden)]
methods.
Yeah, the circular reference in JSONSchemaProps*
is a problem.
Impling
schemars::JsonSchema
allows us to usegen.subschema_for
for the field types, which solves the circular reference problem.
I originally thought infinite recursion is still a problem even if we use gen.subschema_for
because the schema cannot have $ref
s. And with this restriction, calling T::schema()
is basically the same as gen.subschema_for::<T>()
. But I noticed inline_subschemas
documentation has "Some references may still be generated in schemas for recursive types.", so I guess it does prevent infinite recursion (the schema will be invalid for us, though).
For the use case of generating a schema for CRD, circular references should never be a problem because it's impossible to describe with the limited schema.
If we go with schemars
route, it'll prevent infinite recursion and produce invalid schema for us instead when that happens. If what you described works, then it might be worth the effort. We should change the feature to schemars
.
If it's not as easy to do, maybe replace self referencing fields in JSONSchemaProps*
(and anything else in the future) with {}
(any type)? Or skip them entirely? Not great, but I don't think anyone is interested in supporting them anyway because the main use case is CRD schema.
and finding uses of
#[doc(hidden)]
methods.
Right, they're #[doc(hidden)]
but they're just convenience wrappers for the most part. Essentially the flow is:
https://github.com/GREsau/schemars/blob/master/schemars_derive/src/lib.rs#L121
-> https://github.com/GREsau/schemars/blob/master/schemars_derive/src/schema_exprs.rs#L12
-> https://github.com/GREsau/schemars/blob/master/schemars_derive/src/schema_exprs.rs#L436
-> https://github.com/GREsau/schemars/blob/master/schemars/src/_private.rs#L29
-> https://github.com/GREsau/schemars/blob/master/schemars/src/_private.rs#L36
... and for our codegen we can just do the same expansion ourselves.
I originally thought infinite recursion is still a problem even if we use
gen.subschema_for
because the schema cannot have$ref
s.
The way it works with schemars
is:
impl JsonSchema for Metadata {
fn json_schema(gen) -> Schema {
// (1)
Schema::Object(SchemaObject {
object: Some(...),
})
}
}
impl JsonSchema for Pod {
fn json_schema(gen) -> Schema {
Schema::Object(SchemaObject {
object: Some(ObjectValidation {
properties: {
"metadata": gen.subschema_for("io...Metadata"), // (2)
},
}),
})
}
}
(1) is the actual schema of Metadata
, and as part of (2) gen.subschema_for()
does get this actual Schema
and persist it in its internal list of subschemas, but the Schema
it returns from that function is a synthesized one that is a Schema::Object(SchemaObject { reference: Some("io...Metadata") })
instead. So the Schema
returned by Pod::json_schema
does contain $ref
s, but that's fine, because the targets of the refs have been registered with the generator inside subschema_for
.
If we go with
schemars
route, it'll prevent infinite recursion and produce invalid schema for us instead when that happens.
Can you clarify why it would be invalid? Do you mean that you'd consider a schema with $ref
s to be invalid? Why is that?
Thanks for expanding on schemars
. I'll look into it. They seem to be considering making #[doc(hidden)]
in methods in JsonSchema
public, so that helps (https://github.com/GREsau/schemars/blob/55b860428e4526ef34886cb877af58256de6b046/schemars/src/lib.rs#L364-L374).
Do you mean that you'd consider a schema with $refs to be invalid? Why is that?
It's a valid JSON Schema/Open API, but it's invalid for our use case because $ref
is not allowed in CRD schema. https://github.com/kubernetes/kubernetes/issues/62872
We use inline_subschemas
when generating.
inline_subschemas
will still work for your use case, then. It's the SchemaGenerator
's job to handle that.
inline_subschemas
will still work for your use case, then.
No, I knew that. I meant the recursive types won't be supported (for our use case) either way. The current version has infinite recursion, and schemars
version will have $ref
.
For Pod
, the generated code should look like the following:
impl crate::schemars::JsonSchema for Pod {
fn schema_name() -> String {
"io.k8s.api.core.v1.Pod".to_owned()
}
fn json_schema(__gen: &mut crate::schemars::gen::SchemaGenerator) -> crate::schemars::schema::Schema {
crate::schemars::schema::Schema::Object(crate::schemars::schema::SchemaObject {
metadata: Some(Box::new(crate::schemars::schema::Metadata {
description: Some("Pod is a collection of containers that can run on a host. This resource is created by clients and scheduled onto hosts.".to_owned()),
..Default::default()
})),
object: Some(Box::new(crate::schemars::schema::ObjectValidation {
properties: std::array::IntoIter::new([
(
"apiVersion".to_owned(),
crate::schemars::schema::Schema::Object(crate::schemars::schema::SchemaObject {
metadata: Some(Box::new(crate::schemars::schema::Metadata {
description: Some("APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources".to_owned()),
..Default::default()
})),
instance_type: Some(std::convert::Into::into(crate::schemars::schema::InstanceType::String)),
..Default::default()
}),
),
(
"kind".to_owned(),
crate::schemars::schema::Schema::Object(crate::schemars::schema::SchemaObject {
metadata: Some(Box::new(crate::schemars::schema::Metadata {
description: Some("Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds".to_owned()),
..Default::default()
})),
instance_type: Some(std::convert::Into::into(crate::schemars::schema::InstanceType::String)),
..Default::default()
}),
),
(
"metadata".to_owned(),
__gen.subschema_for::<crate::apimachinery::pkg::apis::meta::v1::ObjectMeta>(),
),
(
"spec".to_owned(),
__gen.subschema_for::<crate::api::core::v1::PodSpec>(),
),
(
"status".to_owned(),
__gen.subschema_for::<crate::api::core::v1::PodStatus>(),
),
]).collect(),
required: std::array::IntoIter::new([
"metadata",
]).map(std::borrow::ToOwned::to_owned).collect(),
..Default::default()
})),
..Default::default()
})
}
}
gen.subschema_for::<T>()
should work, but we'll lose the overrides (description
and extensions) set in the reference object in the upstream spec. To be fair, that's the correct behavior.
When an object contains a
$ref
property, the object is considered a reference, not a schema. Therefore, any other properties you put there will not be treated as JSON Schema keywords and will be ignored by the validator.https://json-schema.org/understanding-json-schema/structuring.html#ref
This object cannot be extended with additional properties and any properties added SHALL be ignored. https://swagger.io/specification/#reference-object
I started a thing here if you want to build off of it. k8s-openapi-codegen-common/src/templates/impl_schema.rs
needs to be filled out to handle all the SchemaKind
s.
I got it mostly working locally. Output of JSONSchemaProps
with inline_subschemas
: https://gist.github.com/kazk/0de1996d49018d53557dab4b22b0c5f7 (some types are still missing)
I was able to override description for $ref
s (e.g., fixes port
field having the description of IntOrString
). PodSpec
output: https://gist.github.com/kazk/33bb9f9b22da93349372748ae84a1ede
Extensions can be added similarly, so there's no downsides now. I'll open a new PR later.
Superseded by #105.
$ref
is inlined by callingT::schema()
and appending any other values, including Kubernetes extensions.schemars
Closes #86
Output of pretty printed
Pod::schema()
: https://gist.github.com/kazk/e4a56b7ff9431416a1c29a1c602a7c8a