json-schema-org / referencing

Proposals for a possible specification encompassing the varying uses of "$ref"
MIT License

An AsyncAPI perspective #14

Open jonaslagoni opened 2 years ago

jonaslagoni commented 2 years ago

Even though background lays the foundation for an AsyncAPI perspective, for the sake of clarity I am going to bring it to light here and try to map the proposals to the requirements and whether they solve the problems we are facing.

The alternative approach we are looking at is making a superset of the JSON Reference standard and restrictively defining the missing behaviors that work for our case (something along the lines of this PR).

The referencing standard is only the first part of solving the referencing discrepancies. Tooling is the second part, and it should be able to work with any JSON format that uses some form of referencing standard. See more in this discussion: https://github.com/orgs/asyncapi/discussions/485

The main use-case for us and the reason we want to change the referencing standard is to allow a message to define its payload with something different than JSON formats, which includes but is not limited to Protobuf and XSD formats.

The requirements we are looking at are the following (these primarily come from questions raised when trying to build a referencing tool, where the specs did not provide an answer):

  1. Should always produce a valid JSON document (resolving referenced resource should always be compatible with JSON)
  2. Should define the behavior for referencing non-JSON resources in a JSON document (this can be as simple as resolving the content into a string value)
  3. Should define the behavior for referencing JSON data
  4. Should define the behavior for referencing JSON data that also have reference behavior, and how they interconnect / or not (say two standards both use $ref the new standard should define a clear separation between the two and how referencing tools should interpret it)
  5. Should define the behavior of nested schemas within the same file (so there is no difference between what the spec allows and what tooling enables)

If you have any questions/concerns feel free to raise them below! I am gonna block out some time to map the proposals to the requirements we have at some point, if you are bored, feel free to jump-start the process 😄!

jdesrosiers commented 2 years ago

Thanks for writing this up! I'd be happy to write up how the JRef proposal addresses these requirements, but I'd like to ask a couple clarifying questions first.

  1. Should always produce a valid JSON document (resolving referenced resource should always be compatible with JSON)

It isn't possible to resolve and inline recursive references into a JSON document.

For example.

{
  "foo": 42,
  "bar": { "$ref": "#" }
}

If you try to resolve and inline this schema, you'll end up with an infinitely sized document.

{
  "foo": 42,
  "bar": {
    "foo": 42,
    "bar": {
      "foo": 42,
      "bar": { ... recursing forever ... }
    }
  }
}

However, it is possible to resolve recursive references into an in-memory data structure in such a way that you can work with it as JSON compatible data without having to think about references. It would be like working with a structure like this,

const value = { foo: 42 };
value.bar = value;

console.log(value.bar.bar.bar.bar.bar.bar.foo); // Result => 42
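
A minimal sketch of how a tool might build that kind of in-memory structure from the `{ "$ref": "#" }` example above. The helper names `resolvePointer` and `linkLocalRefs` are made up for illustration and are not part of any proposal. It assumes only same-document references whose fragment is a JSON Pointer, and it does not follow chains of references or handle a reference at the document root.

```javascript
// Hypothetical helper: look up a JSON Pointer (RFC 6901) in a document.
function resolvePointer(root, pointer) {
  if (pointer === "") return root;
  return pointer
    .slice(1)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"))
    .reduce((node, token) => node[token], root);
}

// Hypothetical helper: replace each local { "$ref": "#..." } object with the
// value it points to. The document is mutated in place and nothing is copied,
// so a recursive reference simply becomes a cycle in memory.
function linkLocalRefs(root) {
  const visited = new Set();
  const isLocalRef = (node) =>
    node !== null &&
    typeof node === "object" &&
    typeof node.$ref === "string" &&
    node.$ref.startsWith("#");

  const walk = (node) => {
    if (node === null || typeof node !== "object" || visited.has(node)) return;
    visited.add(node);
    for (const key of Object.keys(node)) {
      const child = node[key];
      if (isLocalRef(child)) {
        const pointer = decodeURIComponent(child.$ref.slice(1));
        node[key] = resolvePointer(root, pointer);
      } else {
        walk(child);
      }
    }
  };

  walk(root);
  return root;
}

const linked = linkLocalRefs({ foo: 42, bar: { $ref: "#" } });
console.log(linked.bar === linked); // the reference now points back at the root
```

Because the result is cyclic, it can be traversed like ordinary data but cannot be passed to `JSON.stringify` as is.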

So, since it isn't possible to fulfill this requirement as written, is there another variation of the requirement that also satisfies whatever use case prompted this requirement?

  5. Should define the behavior of nested schemas within the same file (so there is no difference between what the spec allows and what tooling enables)

I'm not sure I understand what you mean here. I think this is saying something about embedded schemas (bundling), but I'm not sure what. Could you elaborate on what you're looking for here?

jonaslagoni commented 2 years ago

I'd be happy to write up how the JRef proposal addresses these requirements

That would be awesome 🔥

So, since it isn't possible to fulfill this requirement as written, is there another variation of the requirement that also satisfies whatever use case prompted this requirement?

That is a very good point. The requirement was mainly focused on non-JSON resources, as reference tooling has to provide some way to interact with the referenced resource. For example, if you reference a protobuf file, how can tooling interact with it? This can be decided in the spec, or it can be left to the context specification (as defined in JRI) or even to tooling. I don't think we have a hard requirement here; as long as the behavior is well-defined, that would settle it.

As you outline, there is the in-memory example, which definitely would satisfy it. It's however also possible to convert the in-memory data structure to and from JSON while still supporting circular references; it just requires some customization. We actually have this as a "feature" in our AsyncAPI parser, because we have UI cases where we need to serialize the recursive structure. But that probably deserves its own issue to discuss it and possible solutions 😄
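
To make "some customization" concrete, here is a rough sketch of one way it could work. This is not how the AsyncAPI parser does it; `stringifyWithRefs` and `parseWithRefs` are hypothetical names, and the choice to write a repeated object as `{ "$ref": "#<JSON Pointer>" }` is an assumption made for the example. It also assumes the data itself never contains an object whose only key is `$ref`.

```javascript
// Hypothetical: serialize a possibly-circular value. An object seen a second
// time is written as a marker pointing at the place it was first seen.
function stringifyWithRefs(root) {
  const seen = new Map(); // object -> JSON Pointer of its first occurrence
  const escape = (key) => String(key).replace(/~/g, "~0").replace(/\//g, "~1");

  const encode = (node, pointer) => {
    if (node === null || typeof node !== "object") return node;
    if (seen.has(node)) return { $ref: "#" + seen.get(node) };
    seen.set(node, pointer);
    if (Array.isArray(node)) {
      return node.map((item, i) => encode(item, `${pointer}/${i}`));
    }
    const out = {};
    for (const [key, child] of Object.entries(node)) {
      out[key] = encode(child, `${pointer}/${escape(key)}`);
    }
    return out;
  };

  return JSON.stringify(encode(root, ""));
}

// Hypothetical: parse the text and turn each marker back into a real link.
function parseWithRefs(text) {
  const root = JSON.parse(text);
  const lookup = (pointer) =>
    pointer === ""
      ? root
      : pointer
          .slice(1)
          .split("/")
          .map((t) => t.replace(/~1/g, "/").replace(/~0/g, "~"))
          .reduce((node, t) => node[t], root);
  const isMarker = (n) =>
    n !== null &&
    typeof n === "object" &&
    !Array.isArray(n) &&
    Object.keys(n).length === 1 &&
    typeof n.$ref === "string" &&
    n.$ref.startsWith("#");

  const walk = (node) => {
    if (node === null || typeof node !== "object") return;
    for (const key of Object.keys(node)) {
      if (isMarker(node[key])) node[key] = lookup(node[key].$ref.slice(1));
      else walk(node[key]);
    }
  };

  walk(root);
  return root;
}

const circular = { foo: 42 };
circular.bar = circular;
const text = stringifyWithRefs(circular);
const restored = parseWithRefs(text);
console.log(text, restored.bar === restored);
```

Note that this scheme also turns shared (non-circular) objects into markers, which is harmless on the way back but changes what the serialized text looks like.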

I'm not sure I understand what you mean here. I think this is saying something about embedded schemas (bundling), but I'm not sure what. Could you elaborate on what you're looking for here?

Definitely, and you are spot on, that's definitely one of the drivers.

In AsyncAPI and OpenAPI we have components, where we store reusable definitions, and JSON Schema has $defs. The standard must clearly define how references behave depending on the context specification.

I know for JSON Schema the automapping between $id and components in $defs (compound schemas) is a great hit, but it's definitely not something we see with great positivity from an AsyncAPI perspective at the moment 😄 But let's not focus on this here; this issue is just to map the specs to AsyncAPI and what each enables and restricts, and we can always bring it up later 🙂

I hope that clarifies it, otherwise let me know!

jdesrosiers commented 2 years ago

Ok. I think I understand. Let me know if I have this right.

For (1), my understanding is that tooling needs to be able to abstract references and non-JSON data so that there's a consistent and well defined way to work with the document as only JSON-compatible types and data structures. It's not necessarily about converting it to a JSON document.

For (5), my understanding is that you're talking about where references are allowed and what is allowed to be referenced. If { "$ref": "#/foo" } appears somewhere in an AsyncAPI document where a reference isn't allowed, tooling should treat it as an object (or an error) rather than as a reference.
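
As a rough illustration of that reading of (5), a tool could decide whether a reference-shaped object is a reference based on where it appears. Everything here is hypothetical: the `referenceLocations` patterns are invented for the example and are not taken from the AsyncAPI specification.

```javascript
// Hypothetical allow-list of locations (JSON Pointer patterns) where a
// reference is permitted. "*" matches exactly one path segment.
const referenceLocations = ["/channels/*", "/components/messages/*"];

function matchesPattern(pattern, pointer) {
  const patternSegments = pattern.split("/");
  const pointerSegments = pointer.split("/");
  return (
    patternSegments.length === pointerSegments.length &&
    patternSegments.every((seg, i) => seg === "*" || seg === pointerSegments[i])
  );
}

// Classify the value found at `pointer`: a real reference, an ordinary object
// that merely looks like one, or any other value.
function classify(node, pointer) {
  const looksLikeRef =
    node !== null && typeof node === "object" && typeof node.$ref === "string";
  if (!looksLikeRef) return "value";
  return referenceLocations.some((pattern) => matchesPattern(pattern, pointer))
    ? "reference"
    : "plain-object";
}

console.log(classify({ $ref: "#/foo" }, "/channels/userSignedUp"));
console.log(classify({ $ref: "#/foo" }, "/info/title"));
```

A stricter tool could report an error instead of returning "plain-object"; which of the two is right is exactly the kind of behavior the standard would need to pin down.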

jdesrosiers commented 2 years ago

Here's how I see that JRef applies to your requirements for AsyncAPI. I hope it's helpful.

  1. Should always produce a valid JSON document (resolving referenced resource should always be compatible with JSON)

In JRef, references are always transparent. Developers are always working only with the standard JSON types. When you get a value from a JRef document, the JRef tooling follows the reference automatically and gives you the referenced value as if the referenced value was always there.

Example

const doc = JRef.load({
  "foo": { "$href": "#/bar" },
  "bar": 42
});

const foo = JRef.get("/foo", doc); // => 42
const bar = JRef.get("/bar", doc); // => 42

  2. Should define the behavior for referencing non-JSON resources in a JSON document (this can be as simple as resolving the content into a string value)

JRef allows implementations to define how they want non-JSON values to be represented. Everything is based on media types. You can have XML converted into a JRef compatible document, or just return it as a string. You would write a plugin for the JRef tooling describing the behavior you expect for each media type.

Example:

const doc = JRef.load({
  "schema": { "$href": "./my-xml-schema.xml" }
});

const xmlSchema = JRef.get("/schema", doc); // => Some representation of an XML document
const value = XML.get(someXPathExpression, xmlSchema); // => The value of applying the XPath expression to the XML document
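
To give a feel for what "a plugin per media type" could mean, here is a minimal sketch of a registry keyed by media type. It is not the JRef tooling's actual API; `addMediaTypePlugin` and `parseResource` are hypothetical names, and returning XML as a raw string is just the simplest of the options mentioned above.

```javascript
// Hypothetical registry: media type -> function that turns the raw text of a
// retrieved resource into whatever representation the application wants.
const mediaTypePlugins = new Map();

function addMediaTypePlugin(mediaType, parse) {
  mediaTypePlugins.set(mediaType, parse);
}

function parseResource(mediaType, text) {
  const parse = mediaTypePlugins.get(mediaType);
  if (!parse) {
    throw new Error(`No plugin registered for media type ${mediaType}`);
  }
  return parse(text);
}

// JSON is parsed into plain data; XML is kept as a string in this sketch.
addMediaTypePlugin("application/json", (text) => JSON.parse(text));
addMediaTypePlugin("application/xml", (text) => text);

console.log(parseResource("application/json", '{"aaa": 1}'));
console.log(parseResource("application/xml", "<schema/>"));
```

The point of the design is that the referencing layer only needs to know the media type of what it fetched; everything format-specific lives in the plugin.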

  3. Should define the behavior for referencing JSON data

The JRef proposal only covers the behavior of references in JRef documents. Although JRef is syntactically compatible with JSON, JSON is considered a different media type and must be configured as a plugin, the same way you would for XML.

Example

const doc = JRef.load({
  "foo": { "$href": "./my-json-document.json" }
});

// ./my-json-document.json
// {
//    "aaa": { "$href": "#/bbb" },
//    "bbb": 42
// }

const foo = JRef.get("/foo", doc);
// Looks like a JRef reference, but it's actually just a plain JSON object with no special meaning
const aaa = foo.aaa; // => { "$href": "#/bbb" }
const bbb = foo.bbb; // => 42

  4. Should define the behavior for referencing JSON data that also have reference behavior, and how they interconnect / or not (say two standards both use $ref the new standard should define a clear separation between the two and how referencing tools should interpret it)

Every document has a media type and each media type has its own rules for referencing. If you reference a JSON Schema from a JRef document, once you follow that reference, only the rules for JSON Schema apply. Because it's document based, there are some consequences to be aware of. If your AsyncAPI document identifies as a JRef media type, you can't embed a JSON Schema in that document because it's a different media type. You can only reference a JSON Schema. That shouldn't be a big deal because it's what you already have to do for non-JSON compatible types.

One way around that is to define AsyncAPI as a separate media type that defines that certain locations are to be interpreted as JSON Schemas. Then it can be a reference or inline. But, that would require extra work to extend the standard JRef tooling.

Another way around that is to introduce some syntax for embedding external documents in a JRef document. I've explained why an embedding/bundling feature is not included in https://github.com/json-schema-org/referencing/issues/7, but I'm open to discussion if there's demand for it.

If both AsyncAPI and JSON Schema decide to adopt JRef, this problem goes away, but I don't think that's likely from the JSON Schema side.

  5. Should define the behavior of nested schemas within the same file (so there is no difference between what the spec allows and what tooling enables)

JRef doesn't restrict where a reference can be or what can be referenced. Because the tooling abstracts away references, there should be no developer burden and AsyncAPI document authors are empowered to use references however fits their needs. If AsyncAPI does want to make restrictions like this, they would need to define their own media type and would have to extend the standard tooling.

jonaslagoni commented 1 year ago

@jdesrosiers thanks for the quick feedback! Do you mind starting a new issue just for the JRef perspective, or should I create a new issue with your comment? Then we can discuss it there instead of mixing it together with JRI 🙂 Then I will start another one just for JRI.

jonaslagoni commented 1 year ago

Added the JRI perspective here where I answered the simple questions above: https://github.com/json-schema-org/referencing/issues/17

I do not go into depth here, but it seems like it clarifies the questions.

jonaslagoni commented 1 year ago

Going forward, the discussion around this subject will progress slowly for AsyncAPI, so don't expect anything like a 🚤. The proposal is not aimed at the upcoming spec 3.0 but at a feature release, likely at some point toward the end of this year - https://github.com/asyncapi/spec/issues/881

Unless someone else champions the change through earlier than I can 😄

handrews commented 1 year ago

Unless someone else champions the change through earlier than I can

Do you mean on the AsyncAPI side or also speeding things up on this side?

jonaslagoni commented 1 year ago

Do you mean on the AsyncAPI side or also speeding things up on this side?

On the AsyncAPI side we don't "need" to take any decisions on our end before the end of this year probably, that is unless someone else champions the issue through instead of me 😄