GREsau / schemars

Generate JSON Schema documents from Rust code
https://graham.cool/schemars/
MIT License
797 stars 223 forks source link

Flattening maps into structs is problematic with Kubernetes #133

Open jcaesar opened 2 years ago

jcaesar commented 2 years ago

Preface

My original problem was that the schema derived for the following

struct ValueMap {
    foo: String,
    #[serde(flatten)]
    bar: HashMap<String, serde_json::Value>,
}

contains both properties: {…} and additionalProperties: true, which Kubernetes disallows. At this point, I think I have found an acceptable solution for this problem. (But if there's some case where using a Visitor to rewrite additionalProperties to x-kubernetes-preserve-unknown-fields would cause problems, or if schemas with both properties and additionalProperties are generally meaningless and this deserves some broader fix, that would be nice to know.)

The actual problem

The following three structs

struct StringMap {
    foo: String,
    #[serde(flatten)]
    bar: HashMap<String, String>,
}

struct Value {
    foo: String,
    #[serde(flatten)]
    bar: serde_json::Value,
}

struct No {
    foo: String,
}

generate the exact same schema. I'm not sure about the general case, but for Kubernetes, this is problematic: If x-kubernetes-preserve-unknown-fields is not declared, objects containing other properties than the ones declared in properties will either be rejected by validation, or the non-declared properties will be deleted on forced submission. This cannot be fixed by a Visitor, since the information whether a flattened field was present is not available to the visitor. Do you have any suggestion on how this could be handled or fixed?

Example code

ahl commented 2 years ago

FWIW if you do this:

#[derive(JsonSchema)]
#[schemars(deny_unknown_fields)]
struct StringMap {
    foo: String,
    #[serde(flatten)]
    bar: std::collections::HashMap<String, String>,
}

#[derive(JsonSchema)]
#[schemars(deny_unknown_fields)]
struct Value {
    foo: String,
    #[serde(flatten)]
    bar: std::collections::HashMap<String, serde_json::Value>,
}

#[derive(JsonSchema)]
#[schemars(deny_unknown_fields)]
struct No {
    foo: String,
}

You get different schemas:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "StringMap",
  "type": "object",
  "required": [
    "foo"
  ],
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "additionalProperties": {
    "type": "string"
  }
}
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Value",
  "type": "object",
  "required": [
    "foo"
  ],
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "additionalProperties": true
}
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "No",
  "type": "object",
  "required": [
    "foo"
  ],
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "additionalProperties": false
}

Does that suffice to let your visitor identify the cases where it would add the extension?

jcaesar commented 2 years ago

Hmm. Not sure what to make of this.

I guess kube-rs would have to rewrite additionalProperties: { type: string } as well (not sure to what).

It is a bit weird that it generates additionalProperties: false for #[serde(flatten)] serde_json::Value (but imho, using serde_json::Value is a bit weird anyway. That'll fail serialization if the value is not a Map). additionalProperties: false is also forbidden in K8s CRDs, but I guess kube-rs could just delete it.

ahl commented 2 years ago

It is a bit weird that it generates additionalProperties: false for #[serde(flatten)] serde_json::Value (but imho, using serde_json::Value is a bit weird anyway. That'll fail serialization if the value is not a Map).

additionalProperties: true? Agreed that it's weird and kind of nonsensical...

additionalProperties: false is also forbidden in K8s CRDs, but I guess kube-rs could just delete it.

That's surprising:

If "additionalProperties" is absent, it may be considered present with an empty schema as a value. https://datatracker.ietf.org/doc/html/draft-wright-json-schema-validation-00#section-5.18

The empty schema is always valid so it's equivalent to having a default of true.

GREsau commented 1 month ago

In schemars 1.0.0-alpha.3, the schemas output from the structs in the original issue description are:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "StringMap",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "additionalProperties": {
    "type": "string"
  },
  "required": [
    "foo"
  ]
}
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Value",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "required": [
    "foo"
  ]
}
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "No",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "required": [
    "foo"
  ]
}

So I believe StringMap and No are correct, but Value is missing "additionalProperties": true. And that should be fixed by #315

GREsau commented 1 month ago

Also, adding deny_unknown_fields to any of the structs will now change the schema to:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "StringMap",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "additionalProperties": false,
  "required": [
    "foo"
  ]
}

This matches the serde behaviour of failing deserialization when including any properties other than foo