cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
5.14k stars 294 forks source link

encoding/openapi: additionalProperties should be false for closed struct #267

Open cueckoo opened 3 years ago

cueckoo commented 3 years ago

Originally opened by @proppy in https://github.com/cuelang/cue/issues/267

Per https://json-schema.org/understanding-json-schema/reference/object.html#properties and https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.2.md additionalProperties defaults to true.

Currently encoding/openapi doesn't set any additionalProperties for closed struct (and definitions), causing the resulting JSON schema to be more relaxed than the cue contract.

Example

Given:

DummyDef :: {
    something: string
}

encoding/openapi generates

{
  "type": "object",
  "required": [
    "something"
  ],
  "properties": {
    "something": {
      "type": "string",
      "format": "string"
    }
  }
}

which validates:

{
    "something": "foo",
    "morethings": "bar"
}

whereas cue does fails validation:

field "morethings" not allowed in closed struct:
    ./dummydef.json:3:19
cueckoo commented 3 years ago

Original reply by @mpvl in https://github.com/cuelang/cue/issues/267#issuecomment-578118220

Even though you're technically right, this is a somewhat tricky one. The correct definition depends a bit on how an API definition is used.

For instance, by definition, Protobuf are "open": if an application receives a message that has fields not defined by the current proto, the spec says to ignore these fields. However, code generated from protobuf definitions will fail if a non-existing fields are used and an application is not. What you see here is two modes of use: within a single application, one wants a single definition to be consistent. Even if the definition says to ignore unknown fields for outside messages, configuring for a known version, one still wants to catch typos. There are basically two modes of usage.

For this reason, proto to CUE conversion should keep a CUE definition closed. Received messages should be mapped to a CUE type using something akin to a type cast (unify ignoring closedness then eliminate unknown fields).

For the same reason, mapping CUE to OpenAPI may want to include a conversion from open to closed.

Anyway, running this by some of the users of this to find out their interpretation. I think it is reasonable to support this, but the question is whether this should be an option of some form.

cueckoo commented 3 years ago

Original reply by @proppy in https://github.com/cuelang/cue/issues/267#issuecomment-578155846

Thanks for the thorough explanation.

There are basically two modes of usage.

Do you think it needs to get more granular than that? i.e: when "exporting" a cue schema into something else, one could decide to "relax" or "restrain" it's openness on a per struct basic (a) ?

Anyway, running this by some of the users of this to find out their interpretation.

What would be a good forum for this?

but the question is whether this should be an option of some form

If (a) makes sense, maybe allowing users to selectively "apply" another .cue schema on top of all the definitions of another would be enough? Maybe using a field comprehension?

Few experiments below:

opening all closed structs

a: close({
  a: int
})

b: close({
  b: string
})

OpenMe:: {
  "a": a
  "b": b
}

for k,v in OpenMe {
 "\(k)": v & {...}
}

c: OpenMe.a
c: c: 1

Closing all structs

a: {
  a: int
}

b: {
  b: string
}

CloseMe:: {
  "a": a
  "b": b
}

for k,v in CloseMe {
 "\(k)": close(v)
}

c: CloseMe.a
c: c: 1

And a few follow-up questions

cueckoo commented 3 years ago

Original reply by @mpvl in https://github.com/cuelang/cue/issues/267#issuecomment-579163020

So my suspicions were correct: https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definitions/#validation

CRD's put restrictions on the use of additionalProperties.

What would be a good forum for this?

I just asked Stefan Schimanski himself. :)

The reason for this is exactly as expected: to allow for extendible APIs one should not restrict the addition of fields. Clients can still complain about unknown fields, but the OpenAPI definition (for CRDs) is supposed to indicate all allowed values and thus for any proper API, additionalProperties cannot be set to false.

As to (a), I hope we can do something better than this. These kind of approaches tend to be not scalable, from a UX perspective. I can also imagine coming up with a single interpretation of closed structs that works for both case. My current stance on closedness in CUE is that closedness only applies to a particular version of an API, but should be ignored for incoming messages.

Anyway, in conclusion, the requested change would break a common use cases of OpenAPI: CRDs in K8s. A lot of thought went into this interpretation when it comes to proper APIs.

In other words, the interpretation of closedness in OpenAPI is just different from that of CUE. OpenAPI represents closedness regarding accepted messages, whereas CUE considers it in terms of current API version. So for now, it seems better to not add additionalProperties: false. Even if these interpretation prove false, the current mapping is legal: an OpenAPI schema generated from CUE should not reject values accepted by the corresponding CUE, but may accepted values rejected by the original CUE. So it is also the more conservative stance to take.

What is you particular use case for this feature? I could imagine making this an option. On the other hand, this seems to encourage bad API design. So this should only be added if there is a concrete need for it and with the proper note of caution.

cueckoo commented 3 years ago

Original reply by @proppy in https://github.com/cuelang/cue/issues/267#issuecomment-579304181

What is you particular use case for this feature?

Was trying to use CUE to author JSON schemas for IoT devices characteristics.

So this should only be added if there is a concrete need for it and with the proper note of caution.

Maybe the right distinction to draw here is OpenAPI vs JSON schema export? While the interpretation of OpenAPI closedness is different from that of CUE, maybe the interpretation JSON schema closedness could be different from OpenAPI (and closer to CUE), since the scope of JSON schema can go beyond API payload definition?

cueckoo commented 3 years ago

Original reply by @mpvl in https://github.com/cuelang/cue/issues/267#issuecomment-579418214

Yeah, possibly. Their standard really should address these modes.

OpenAPI also allows additionalProperties: false, it is just the CRD that doesn't. I think in the case of OpenAPI, given its purpose, it highly dubious to support additionalProperties: false. But I see your point this is different for JSON Schema.

I think it should be an option, the question is what the default should be. For OpenAPI the default should be to not generate this. Ideally the default is the same for both, but arguably the default should be close for JSON Schema.

The right approach here, I think, is to port the Schema generation from the openapi package to the jsonschema package, make it proper JSON schema, and then see if we can express OpenAPI generation in terms of JSON schema generation.

Note that CRD put other restrictions on OpenAPI, including some normalizations, that are not strictly necessary in JSON Schema. I think the OpenAPI schema will execute the normalizations if possible.

cueckoo commented 3 years ago

Original reply by @khepin in https://github.com/cuelang/cue/issues/267#issuecomment-777144434

Was coming here to open that exact same issue so I'll chime in with my use case: I care about jsonschema, not about openapi, but I can currently only generate jsonschema by going through openapi.

I want to be able to share the definition of the shape of objects that can go in an elasticsearch index. There might be multiple codebases feeding into that index and we want to ensure documents don't get added that would result in creating too many automatic mappings in the index.

cueckoo commented 3 years ago

Original reply by @timbunce in https://github.com/cuelang/cue/issues/267#issuecomment-862546625

I care about jsonschema, not about openapi, but I can currently only generate jsonschema by going through openapi.

That's my use-case as well.

myitcv commented 3 years ago

Nudging @mpvl for a follow-up on this, post migration (#1078)