asyncapi / spec

The AsyncAPI specification allows you to create machine-readable definitions of your asynchronous APIs.
https://www.asyncapi.com
Apache License 2.0

[2.0.0 REVIEW] Clarify usage of JSON References (/Pointers) for non-JSON data structures #216

Open jstoiko opened 5 years ago

jstoiko commented 5 years ago

The introduction of schemaFormat in Message Object adds support for schema languages other than JSON Schema. The $referencing mechanism however was designed to only support JSON data structures as it explicitly mentions the JSON Reference mechanism. So this needs additional clarification.

To be clear, any schema language whose underlying format is JSON (e.g. Avro) is not of concern. Schema languages whose underlying format is YAML (e.g. RAML DataType, OpenAPI in YAML) are of lesser concern because simple wording could be added to the effect that a JSON Reference can be made to the JSON equivalent of the YAML-formatted schema.

One could always come up with some meaning for what comes after # in non-JSON-formatted schemas. However, without an explicit definition of what it means to use a JSON Reference/Pointer to refer to non-JSON data structures, things would be left up to interpretation, which may introduce divergent behavior across tools supporting the AsyncAPI spec.

Let’s take Protobuf for example. What happens when there are nested types and I want to refer to a type within a nesting? E.g.

# Foo.proto
message Foo {
  message Bar {
    string name = 1;
  }
}

How do I refer to Bar? Is it Foo.proto#/Foo/Bar or Foo.proto#/Foo.Bar? The former looks more “right” to me; however, one could make a case that Foo.Bar is how one would refer to Bar from within Protobuf and is therefore how it should be referred to in the Pointer. Since a Protobuf document is not a JSON data structure, there is nothing to contradict this argument.
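To make the ambiguity concrete, here is a minimal sketch of how a plain RFC 6901 tokenizer would read the two candidate fragments. The helper name `pointer_tokens` is hypothetical and not part of any AsyncAPI tooling; it only shows that the slash form yields two reference tokens while the dotted form yields a single token that a tool would have to interpret on its own.

```python
from urllib.parse import unquote


def pointer_tokens(fragment: str) -> list[str]:
    """Split a URI-fragment JSON Pointer (the text after '#') into reference tokens."""
    pointer = unquote(fragment)  # fragment form is percent-decoded first
    if pointer == "":
        return []  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    # Unescape "~1" before "~0", as RFC 6901 requires.
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


print(pointer_tokens("/Foo/Bar"))  # two tokens: Foo, then Bar
print(pointer_tokens("/Foo.Bar"))  # one token containing a dot
```

Nothing in JSON Pointer itself gives the dot any meaning, so whether `Foo.Bar` names a nested type is entirely up to the tool.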

It would be interesting to take Protobuf and maybe one more, say XSD, and think through all the corner cases, probably by going over the JSON Reference and JSON Pointer specs and attempting to apply each rule to both schema languages. If any of those rules are contentious or leave room for interpretation, then we should think of some wording to add to the AsyncAPI 2.0 spec. Another solution could be to have some kind of registry or mini-spec for each schema language and have the rules defined there.

IMO, a much (much!) simpler, and arguably better, solution would be to restrict the use of JSON Reference to JSON- and YAML-formatted schemas. We would say something along the lines of:

The Reference Object SHALL either be a URI or a JSON Reference. A JSON Reference is comprised of a URI followed by an optional JSON Pointer as described in this draft. A JSON Reference SHALL only be used to refer to a schema that is formatted in either JSON or YAML. In the case of a YAML-formatted Schema, the JSON Reference SHALL be applied to the JSON representation of that schema. The JSON representation SHALL be made by applying the conversion described here.
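As a rough illustration of how a tool might enforce the wording proposed above, the sketch below splits a reference into its URI and JSON Pointer parts and rejects a pointer into anything that is not JSON or YAML. Everything here is an assumption made for illustration: the function name `split_reference`, the idea of inferring the format from the file extension, and the choice to allow a bare URI to a non-JSON file. The proposed text does not specify any of these.

```python
from pathlib import PurePosixPath
from urllib.parse import urldefrag, urlparse

# Assumption: the schema format is inferred from the file extension.
JSON_COMPATIBLE = {".json", ".yaml", ".yml"}


def split_reference(ref: str) -> tuple[str, str]:
    """Return (uri, json_pointer); the pointer is '' when the reference has no fragment."""
    uri, pointer = urldefrag(ref)
    if uri == "":
        # Same-document reference: the AsyncAPI document itself is JSON or YAML.
        return uri, pointer
    suffix = PurePosixPath(urlparse(uri).path).suffix.lower()
    if pointer and suffix not in JSON_COMPATIBLE:
        raise ValueError(f"JSON Pointer not allowed into non-JSON/YAML target: {uri}")
    return uri, pointer


print(split_reference("schemas.yaml#/components/schemas/Bar"))
print(split_reference("Foo.proto"))  # a plain URI, no pointer, is accepted
```

Under this reading, `Foo.proto#/Foo/Bar` raises an error, while `Foo.proto` alone is still a valid reference to the whole file.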

To sum up:

  1. clarify how $ref can be applied to YAML data structures
  2. refer to an existing mechanism used to translate YAML to JSON
  3. clarify referencing mechanism for any non-JSON and non-YAML schema languages
fmvilas commented 5 years ago

Yes. This needs to be revised. Thanks for pointing this out, Jonathan!

fmvilas commented 5 years ago

Deleted all the comments made to add "jstoiko", "vromero", and "antoniogarrote" as contributors, to reduce noise in this thread.

fmvilas commented 5 years ago

Ok, so I've done some initial research on the subject and here's the result.


JSON Pointer examples to consider (from RFC 6901)

Given the following JSON object:

{
    "foo": ["bar", "baz"],
    "": 0,
    "a/b": 1,
    "c%d": 2,
    "e^f": 3,
    "g|h": 4,
    "i\\j": 5,
    "k\"l": 6,
    " ": 7,
    "m~n": 8
}

These are the results for each JSON Pointer:

#            // the whole document
#/foo        ["bar", "baz"]
#/foo/0      "bar"
#/           0
#/a~1b       1
#/c%25d      2
#/e%5Ef      3
#/g%7Ch      4
#/i%5Cj      5
#/k%22l      6
#/%20        7
#/m~0n       8
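The table above can be reproduced with a short resolver. This is a minimal sketch of RFC 6901 evaluation for the URI fragment form, not a full implementation (for example, it does not handle the `-` array token or reject leading zeros in array indices); the names `DOC` and `resolve` are illustrative.

```python
import json
from urllib.parse import unquote

# The example document from RFC 6901, as quoted above.
DOC = json.loads(r'''
{
    "foo": ["bar", "baz"],
    "": 0,
    "a/b": 1,
    "c%d": 2,
    "e^f": 3,
    "g|h": 4,
    "i\\j": 5,
    "k\"l": 6,
    " ": 7,
    "m~n": 8
}
''')


def resolve(doc, fragment: str):
    """Evaluate a URI-fragment JSON Pointer such as '#/foo/0' against doc."""
    raw = fragment[1:] if fragment.startswith("#") else fragment
    pointer = unquote(raw)  # percent-decode the fragment first
    if pointer == "":
        return doc  # '#' is the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = doc
    for raw_token in pointer[1:].split("/"):
        # Unescape "~1" before "~0", as RFC 6901 requires.
        token = raw_token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


for ref in ["#/foo/0", "#/", "#/a~1b", "#/c%25d", "#/%20", "#/m~0n"]:
    print(ref, "->", resolve(DOC, ref))
```

Note the two layers of escaping: percent-encoding belongs to the URI fragment, while `~0` and `~1` belong to JSON Pointer itself.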

Next, we'll apply the same JSON Pointers to a Protobuf file and an XSD file and see what the results should be.

Protobuf

When using Protobuf, the JSON Pointer MUST point to either a message or an enum.

package my.org;

enum MyEnum {
  UNKNOWN = 0;
  STARTED = 1;
  RUNNING = 2;
}

message Outer {
  message Inner {
      int32 test = 1;
  }
  MyEnum enum_field = 9;
}
Alternative 1: Using JSON Pointer syntax
#              // Error. You can't reference the whole document. Only messages and enums are supported.
#/MyEnum       // The "MyEnum" enum
#/Outer        // The "Outer" message
#/Outer/Inner  // The "Inner" message
#/Outer/0      // The "Inner" message, as it's the first in declaration order.
#/Outer/1      // Error. Fields are not supported. Only messages and enums are supported.
#/Outer/2      // Error. Out of range.
#/             // Error. Only messages and enums are supported.
#/Outer~1Inner // Equivalent of "Outer/Inner" as a single name. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/c%25d        // Equivalent of c%d. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/e%5Ef        // Equivalent of e^f. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/g%7Ch        // Equivalent of g|h. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/i%5Cj        // Equivalent of i\j. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/k%22l        // Equivalent of k"l. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/%20          // Equivalent of a white-space character. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/m~0n         // Equivalent of m~n. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
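The rules of Alternative 1 can be sketched against a hand-built model of the .proto file above. This is only an illustration under stated assumptions: the `Decl` class and the `FILE` tree are hypothetical stand-ins for what a real Protobuf parser would produce, percent-decoding and `~` unescaping are omitted, and "fail to find anything" is modeled as returning `None`, while the cases marked "Error" raise `LookupError`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Decl:
    name: str
    kind: str  # "file", "message", "enum" or "field"
    children: list["Decl"] = field(default_factory=list)


# Hand-built model of the example file, in declaration order (hypothetical).
FILE = Decl("", "file", [
    Decl("MyEnum", "enum"),
    Decl("Outer", "message", [
        Decl("Inner", "message", [Decl("test", "field")]),
        Decl("enum_field", "field"),
    ]),
])


def resolve(pointer: str) -> Optional[Decl]:
    """Resolve an already-decoded pointer such as '/Outer/Inner' or '/Outer/0'."""
    if not pointer.startswith("/"):
        raise LookupError("the whole document cannot be referenced")
    node = FILE
    for token in pointer[1:].split("/"):
        if token == "":
            raise LookupError("only messages and enums are supported")
        if token.isdigit():
            index = int(token)  # position in declaration order
            if index >= len(node.children):
                raise LookupError("out of range")
            node = node.children[index]
        else:
            match = next((c for c in node.children if c.name == token), None)
            if match is None:
                return None  # not an error, just nothing found
            node = match
        if node.kind not in ("message", "enum"):
            raise LookupError("fields are not supported")
    return node


print(resolve("/Outer/Inner").name)  # Inner
print(resolve("/Outer/0").name)      # Inner, first declaration inside Outer
print(resolve("/c%d"))               # None: no such name can exist
```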
Alternative 2: Following Protobuf syntax (not JSON Pointer-compatible as it's lacking the initial /)
#              // Error. You can't reference the whole document. Only messages and enums are supported.
#MyEnum        // The "MyEnum" enum
#Outer         // The "Outer" message
#Outer.Inner   // The "Inner" message
#Outer.0       // The "Inner" message, as it's the first in declaration order.
#Outer.1       // Error. Fields are not supported. Only messages and enums are supported.
#Outer.2       // Error. Out of range.
#Outer~1Inner  // Equivalent of "Outer/Inner" as a single name. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#Outer%2EInner // Equivalent of "Outer.Inner" as a single name. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#c%25d         // Equivalent of c%d. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#e%5Ef         // Equivalent of e^f. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#g%7Ch         // Equivalent of g|h. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#i%5Cj         // Equivalent of i\j. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#k%22l         // Equivalent of k"l. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#%20           // Equivalent of a white-space character. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#m~0n          // Equivalent of m~n. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
Alternative 3: Following Protobuf syntax and compatible with JSON Pointer
#               // Error. You can't reference the whole document. Only messages and enums are supported.
#/MyEnum        // The "MyEnum" enum
#/Outer         // The "Outer" message
#/Outer.Inner   // The "Inner" message
#/Outer.0       // The "Inner" message, as it's the first in declaration order.
#/Outer.1       // Error. Fields are not supported. Only messages and enums are supported.
#/Outer.2       // Error. Out of range.
#/              // Error. Only messages and enums are supported.
#/Outer~1Inner  // Equivalent of "Outer/Inner" as a single name. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/Outer%2EInner // Equivalent of "Outer.Inner" as a single name. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/c%25d         // Equivalent of c%d. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/e%5Ef         // Equivalent of e^f. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/g%7Ch         // Equivalent of g|h. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/i%5Cj         // Equivalent of i\j. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/k%22l         // Equivalent of k"l. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/%20           // Equivalent of a white-space character. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/m~0n          // Equivalent of m~n. Protobuf message and enum names only support A-Za-z0-9_. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
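One subtlety in Alternative 3 is the order of operations: for `#/Outer%2EInner` to mean a single name containing a dot, the split on `.` has to happen before percent-decoding. The sketch below shows that ordering; `dotted_path` is a hypothetical helper, and it assumes the fragment carries exactly one JSON Pointer reference token.

```python
from urllib.parse import unquote


def dotted_path(fragment: str) -> list[str]:
    """Turn '/Outer.Inner' (the text after '#') into Protobuf-style path segments."""
    if not fragment.startswith("/") or "/" in fragment[1:]:
        raise ValueError("expected a JSON Pointer with exactly one reference token")
    segments = fragment[1:].split(".")  # split on literal dots first
    # Only then undo percent-encoding and JSON Pointer escapes, segment by segment.
    return [unquote(s).replace("~1", "/").replace("~0", "~") for s in segments]


print(dotted_path("/Outer.Inner"))    # two segments
print(dotted_path("/Outer%2EInner"))  # one segment containing a dot
```

A caveat worth keeping in mind: `.` is an unreserved character in RFC 3986, so URI normalization may legitimately rewrite `%2E` to `.` before a tool ever sees the fragment, which makes this distinction fragile in practice.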

XSD

When using XSD, the JSON Pointer MUST point to either an element, a complexType, a simpleType or the whole document (in which case it is the same as referring to the first element).

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://tempuri.org/PurchaseOrderSchema.xsd"
           targetNamespace="http://tempuri.org/PurchaseOrderSchema.xsd"
           elementFormDefault="qualified">
 <xsd:element name="PurchaseOrder" type="tns:PurchaseOrderType"/>
 <xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
   <xsd:element name="ShipTo" type="tns:USAddress" maxOccurs="2"/>
   <xsd:element name="BillTo" type="tns:USAddress"/>
  </xsd:sequence>
  <xsd:attribute name="OrderDate" type="xsd:date"/>
 </xsd:complexType>

 <xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name"   type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city"   type="xsd:string"/>
    <xsd:element name="state"  type="xsd:string"/>
    <xsd:element name="zip"    type="tns:ZipCode"/>
    <xsd:element name="nestedComplexTypeElement">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="first" type="xsd:string"/>
          <xsd:element name="middle" type="xsd:string" minOccurs="0"/>
          <xsd:element name="last" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
 </xsd:complexType>

 <xsd:simpleType name="ZipCode">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="80000"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
#                                      // The first element. In this case, the "PurchaseOrder" element.
#/PurchaseOrder                        // The "PurchaseOrder" element.
#/PurchaseOrderType                    // The "PurchaseOrderType" complex type.
#/USAddress                            // The "USAddress" complex type.
#/USAddress/name                       // The "USAddress/name" element.
#/USAddress/0                          // Error. The sequence tag is not supported.
#/USAddress/nestedComplexTypeElement/0 // The complexType of the "USAddress/nestedComplexTypeElement" element.
#/USAddress/nestedComplexTypeElement/1 // Error. Out of range.
#/                                     // Error. The name of the reference can't be an empty string.
#/USAddress~1name                      // Equivalent of "USAddress/name" as a single name. XSD names must follow NCName definition and therefore `/` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/USAddress%25name                     // Equivalent of USAddress%name. XSD names must follow NCName definition and therefore `%` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/USAddress%5Ename                     // Equivalent of USAddress^name. XSD names must follow NCName definition and therefore `^` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/USAddress%7Cname                     // Equivalent of USAddress|name. XSD names must follow NCName definition and therefore `|` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/USAddress%5Cname                     // Equivalent of USAddress\name. XSD names must follow NCName definition and therefore `\` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/USAddress%22name                     // Equivalent of USAddress"name. XSD names must follow NCName definition and therefore `"` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/%20                                  // Equivalent of a white-space character. XSD names must follow NCName definition and therefore it's not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
#/USAddress~0name                      // Equivalent of USAddress~name. XSD names must follow NCName definition and therefore `~` is not supported. Implementations SHOULD NOT trigger an error but MUST fail to find anything.
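The name-based part of the XSD rules can be sketched with Python's standard `xml.etree.ElementTree`. This is a simplified illustration on a trimmed-down copy of the schema above, with several assumptions: the function name `resolve` is hypothetical, the pointer is assumed to be already decoded, numeric index tokens (such as `/USAddress/nestedComplexTypeElement/0`) are left out, and below the top level any descendant element with a matching name is accepted.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
NAMED = {XS + "element", XS + "complexType", XS + "simpleType"}

# Trimmed-down version of the schema above, for illustration only.
XSD = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <xsd:element name="PurchaseOrder" type="PurchaseOrderType"/>
 <xsd:complexType name="USAddress">
  <xsd:sequence>
   <xsd:element name="name" type="xsd:string"/>
  </xsd:sequence>
 </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(XSD)


def resolve(root, pointer: str):
    """Resolve an already-decoded pointer such as '/USAddress/name' by component name."""
    if pointer == "":
        return root.find(XS + "element")  # whole document: the first element
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = root
    for depth, token in enumerate(pointer[1:].split("/")):
        if token == "":
            raise ValueError("the name of the reference can't be an empty string")
        # Top level: direct children of the schema. Below: descendant elements.
        candidates = list(node) if depth == 0 else list(node.iter(XS + "element"))
        node = next(
            (c for c in candidates
             if c is not node and c.tag in NAMED and c.get("name") == token),
            None,
        )
        if node is None:
            return None  # not an error, just nothing found
    return node


print(resolve(schema, "").get("name"))                 # PurchaseOrder
print(resolve(schema, "/USAddress/name").get("type"))  # xsd:string
print(resolve(schema, "/USAddress%name"))              # None
```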
fmvilas commented 5 years ago

Added the following to the initial research above:

Alternative 2: Following Protobuf syntax (not JSON Pointer-compatible as it's lacking the initial /)
Alternative 3: Following Protobuf syntax and compatible with JSON Pointer

fmvilas commented 5 years ago

Even though it's possible to hack the JSON Reference object to make it work with non-JSON structures, I'm not happy with it because, in the end, it's a hack. And honestly, I don't like hacks, least of all in formal documents like a specification. My gut feeling tells me to stop here and support only JSON/YAML formats, but my UX background makes it hard for me to abandon the XSD and Protobuf use cases.

In any case, this requires further investigation and discussion, and I don't think it will make it into version 2.0.0 as it is. Therefore, I think @jstoiko's proposal of limiting it to JSON/YAML formats makes the most sense, at least for now. Support for XSD and Protobuf can be provided through tooling that transforms these formats into their JSON Schema equivalent.

WaleedAshraf commented 5 years ago

I agree with @fmvilas's and @jstoiko's proposal of limiting this to JSON/YAML for now, which makes sense at this early stage of the AsyncAPI definition.

Adding support (hacks) for more format types vs. better supporting the few selected ones (JSON/YAML)? I'd go with the second option here.

WaleedAshraf commented 5 years ago

But this shouldn't stop us from continuing to work on defining support for non-JSON schemas.

fmvilas commented 4 years ago

Reopening this issue so we can work on it in the future.

github-actions[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity :sleeping: It will be closed in 30 days if no further activity occurs. To unstale this issue, add a comment with detailed explanation. Thank you for your contributions :heart:

asyncapi-bot commented 3 years ago

:tada: This issue has been resolved in version 1.0.0 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

magicmatatjahu commented 3 years ago

@jstoiko @WaleedAshraf @fmvilas

If you are no longer interested in this problem, I apologize for pinging you.

The problem described here is related to my proposal, which I extended to cover references to nested non-JSON schema objects: Proposal to allow defining schema format other than default one (AsyncAPI Schema). Please see the Update section.

Any feedback is more than welcome! :)

GreenRover commented 1 year ago

I found this in spec: https://github.com/asyncapi/spec/blob/v2.1.0/spec/asyncapi.md?plain=1#L1181-L1183

This was specified a long time ago, maybe by mistake, but we should keep it in mind!

OK, this part of the spec contradicts: https://github.com/asyncapi/spec/blob/v2.1.0/spec/asyncapi.md?plain=1#L1300

Because JSON Reference doesn't allow references to external files this way.

smoya commented 9 months ago

Since the introduction of the Multiformat Schema Object, we already support protobuf for instance. In the docs, we mention the following:

Non-JSON-based schemas (e.g., Protobuf or XSD) MUST be inlined as a string.

I think this issue can be closed. If you still believe it should be kept open, please mention it and I will proceed.

GreenRover commented 9 months ago

I am not sure. The feature of using $ref (or similar) to reference *.proto files is still open and was planned as a 3.1 feature. Therefore, I'd like to keep this issue open.

smoya commented 9 months ago

@GreenRover would you mind pasting here any link related to the planned feature? Thanks!