asyncapi / spec

The AsyncAPI specification allows you to create machine-readable definitions of your asynchronous APIs.
https://www.asyncapi.com
Apache License 2.0
4.01k stars 261 forks source link

trait vs. allOf ref vs. component ref - CloudEvents #1057

Open black-snow opened 1 month ago

black-snow commented 1 month ago

I just stumbled across traits, must've missed them before. My initial thought was: 'oh nice, now I can pull out the CloudEvents JSON schema from the specs and don't need all the copypasta'.

I then found this and there is ... no trait:

components:
  schemas:
    whateverPayload:
      type: object
      allOf:
        - $ref: 'https://raw.githubusercontent.com/cloudevents/spec/v1.0.1/spec.json'
      properties:
        data:
          $ref: '#/components/schemas/somethingElse'

Now I'm a bit confused. If I get it right, it'll just merge in the whole CloudEvents spec object, including its properties (why the allOf, though? Would it conflict with the explicit properties otherwise?). Now that's what I thought traits would be useful for. If we can just do it like that, why do we have traits? I'm pretty sure I'm missing something.

Another thing in the back of my mind is schema validation. If I externalize parts of the schema, how will validation behave? Applications now need to reach out to GH and fetch the CloudEvents spec from there. Do I want this? Will it be cached? Should I rather split it into the spec I write and a "compiled" spec with everything inlined?

So, I was wondering how you do it and what traits are there for (the docs are rather brief).

github-actions[bot] commented 1 month ago

Welcome to AsyncAPI. Thanks a lot for reporting your first issue. Please check out our contributors guide and the instructions about a basic recommended setup useful for opening a pull request.
Keep in mind there are also other channels you can use to interact with AsyncAPI community. For more details check out this issue.

Lazzaretti commented 2 weeks ago

Hi @black-snow,

I created the issue https://github.com/cloudevents/spec/issues/1276 because I think it is a bit more complicated. There are multiple bindings per protocol for CloudEvents, so I think we need different traits depending on the protocol and mode (structured or binary). In structured mode, https://raw.githubusercontent.com/cloudevents/spec/v1.0.1/spec.json should probably be great.

For the fetching, caching, etc. questions: This is not part of the spec, but an implementation detail (but please correct me if I'm wrong).

derberg commented 2 weeks ago

Let me throw here a similar discussion we had in AsyncAPI Slack -> https://asyncapi.slack.com/archives/C0230UAM6R3/p1712148557132589 (you need to first register with https://asyncapi.com/slack-invite)

lemme paste it here

Lukasz Gornicki
  3 months ago
@Fran Méndez
 I went through https://github.com/asyncapi/spec/issues/108 but could not find the specific reason.
so yeah, can you look into the past with magic ball and check if you remember why traits on message.payload are not allowed. I thought that maybe because of complexity of schema, but then I was like: “what do we care, it is simple merge”, and we anyway allow it in the message.headers  so why not allow people doing it. I think it is much better than what happened with Open API 3.1 and change of definition of $ref
34 replies

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712148557132589)
How are you gonna apply traits to Protobuf that is being inlined with string or referenced? :smile:
Or just in general it open up a can of worms I would highly recommend not to open :joy:

Lukasz Gornicki
  3 months ago
well, limitations can be excluded, but why generalize? I know we have protobuf and xsd that are strings, but why is this an issue, string is a string, no matter if protobuf or not, we just replace it 1:1 with what was in the trait :shrug::skin-tone-2:

Lukasz Gornicki
  3 months ago
trait is schema agnostic, just should take whatever you have and apply on existing json, so if it is string, then just replace it with new string

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712148853916399?thread_ts=1712148557.132589&cid=C0230UAM6R3)
and for those using yaml/json it is a perfect tool, that many need, that is why OpenAPI broke their $ref definition

Lukasz Gornicki
  3 months ago
which I think was a bad idea, worst than using traits

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149191076049?thread_ts=1712148557.132589&cid=C0230UAM6R3)
traits are reusability on steroids

Jonas Lagoni
  3 months ago
I think the question is more, why should WE fix the underlying schema formats for not allowing something like traits?

Jonas Lagoni
  3 months ago
It's not really OpenAPI that broke $ref it was JSON Schema that changed that validation of $ref keywords should happen on keyword level instead of being unwrapped in the enclosing object
:+1::skin-tone-2:
1

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149238347759?thread_ts=1712148557.132589&cid=C0230UAM6R3)
Of course it broke for OpenAPI as well since they changed to using the newest version, but it came from the underlying JSON Schema standard (edited) 

Jonas Lagoni
  3 months ago
I think the question is more, why should WE fix the underlying schema formats for not allowing something like traits?
It should be each open standard that defines how those things work. Sure, we could do it for our AsyncAPI Schema Object, but it just complicates things...

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149289745289?thread_ts=1712148557.132589&cid=C0230UAM6R3)
why when talking about traits for message.payload we need to bring schema into discussion and not just treat it as normal JSON as we do with other parts of message or operation?
and in case of message.headers we don’t, we normally allow traits, which is great, and works, and there is nothing complicated about it.
trait is not “for schema” feature, so why would we even consider why should WE fix the underlying schema formats that we’re fixing anything for them

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149310790739?thread_ts=1712148557.132589&cid=C0230UAM6R3)
Are you not referring to wanting to allow this?
{payload: {properties: {test1: {type: string}}}, traits: [{payload: {properties: {test2: {type: string}}}}]}
and in case of message.headers we don’t, we normally allow traits, which is great, and works, and there is nothing complicated about it.
I actually missed that part that headers can now be multi schema format. (edited) 

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149348754499?thread_ts=1712148557.132589&cid=C0230UAM6R3)
trait is not “for schema” feature, so why would we even consider why should WE fix the underlying schema formats that we’re fixing anything for them
I really dont like it, cause you are mixing between what is AsyncAPI and "other standard" features. The fact that we already have it scares me tbh and I think its a mistake  :shrug: And we have not seen the downside yet because people have not used it enough (edited) 

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149808077099?thread_ts=1712148557.132589&cid=C0230UAM6R3)
no, no traits inside payload
we can already do
    userSignedUp:
      name: userSignedUp
      title: User signed up message
      summary: Emitted when a user signs up
      traits:
        - $ref: './myref.json'
      payload:
        $ref: '#/components/schemas/userSignedUpPayload' 
and myref.json can contain headers which is awesome, like https://github.com/asyncapi/spec/blob/master/examples/streetlights-kafka-asyncapi.yml#L185
and I’m questioning restriction that payload cannot be placed inside traits

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149858358449?thread_ts=1712148557.132589&cid=C0230UAM6R3)
Sorry yea, changed my example

Lukasz Gornicki
  3 months ago
people use it quite a lot, if we see it in example provided by IBM folks
and I see it promoted by Google and usage with CloudEvents

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149891259869?thread_ts=1712148557.132589&cid=C0230UAM6R3)
 And we have not seen the downside yet because people have not used it enough
I do not know any data that could confirm this statement (edited) 

Jonas Lagoni
  3 months ago
We dont have any :shrug:

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712149944014599?thread_ts=1712148557.132589&cid=C0230UAM6R3)
Its my speculation

Jonas Lagoni
  3 months ago
I have not seen an AsyncAPI document that uses traits + multi schema format to overwrite headers across different schema formats

Jonas Lagoni
  3 months ago
And headers are always a single-level depth object, which is why it's not budding heads there or confuses users IMO :shrug: (edited) 

Jonas Lagoni
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151088149969?thread_ts=1712148557.132589&cid=C0230UAM6R3)
What is the underlying use-case where you see traits as being useful for payload that each underlying schema format does not support?

Lukasz Gornicki
  3 months ago
one of the things is what OpenAPI does with modification of $ref and writing that summary and description can be placed next to $ref
the other is CloudEvents, where with trait, you could apply all the cloudevents standard fields with simple trait like:
traits:
        - $ref: 'https://raw.githubusercontent.com/meteatamel/asyncapi-basics/main/samples/account-service-cloudevents/cloudevents-v1.0.1-asyncapi-traits.yaml'
of course the example shows headers not payload but you get my point

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151299058709?thread_ts=1712148557.132589&cid=C0230UAM6R3)
and CloudEvents is open standard, while others are for sure there, some internal standards where some common fields are defined

Lukasz Gornicki
  3 months ago
and of course allOf from JSON Schema is not a nice alternative, 100% not, especially that it is JSON Schema only

Lukasz Gornicki
  3 months ago
current traits algorithm relies on https://datatracker.ietf.org/doc/html/rfc7386 and we would still do, with all its limitations :shrug::skin-tone-2: not thinking about schema at all (edited) 

Fran Méndez
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151457169639?thread_ts=1712148557.132589&cid=C0230UAM6R3)
IIRC it was because payload was any so it was weird to have that mechanism. There are two approaches here:
We become conservative and not allow traits there (current state of the art).
We allow it and it will only make sense when your payload is defined with JSON/YAML objects.
My initial approach was to go for 2 but I got some pushback from the Mulesoft folks and decided not to fight this battle yet. Honestly, I think #2 is better here. It gives a lot of power to some people (probably the majority?) and harms no one (or I can't find a way it'd harm).
It reminds me of the discussion in the web building community about Progressive Enhancement vs Graceful Degradation. (edited) 

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151483318269?thread_ts=1712148557.132589&cid=C0230UAM6R3)
 when your payload is defined with JSON/YAML objects
exactly, as in the end it is JSON Merge Patch that we use. We do not reinvent anything

Lukasz Gornicki
  3 months ago
and conservative approach leads nowhere as then people try to workaround with existing stuff like:
https://developers.redhat.com/articles/2021/06/02/simulating-cloudevents-asyncapi-and-microcks
https://atamel.dev/posts/2023/05-23_asyncapi_cloudevents/
for us easier as we do not take responsibility for it, and can later just say: we never recommended that
I think we should bend on those community workaround, add traits on payload are just better and easier than for example:
    userSignedUpPayload:
      type: object
      allOf:
        - $ref: 'https://raw.githubusercontent.com/cloudevents/spec/v1.0.1/spec.json'
      properties:
        data:
          $ref: '#/components/schemas/userSignedUpData'
especially that till this day we did not find a great consensus for rendering allOf in react component :smile:

Fran Méndez
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151523608379?thread_ts=1712148557.132589&cid=C0230UAM6R3)
conservative approach leads nowhere
I know what you mean but would like to point out something. Being conservative leads precisely to this, community coming up with workarounds. Or not, and community doesn't care because they don't need it anyway. In this case, it's clear it's something that the community wants and it's leading to workarounds, which anyway serves as a proof for us that it's actually needed. Back in time, I did not have that data "proof" so the conservative approach is IMHO the best because adding means a minor version but removing requires a major one.

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151545601199?thread_ts=1712148557.132589&cid=C0230UAM6R3)
yup, totally agree - still should be use case driven

Jonas Lagoni
  3 months ago
the other is CloudEvents, where with trait, you could apply all the cloudevents standard fields with simple trait like:
traits:
        - $ref: 'https://raw.githubusercontent.com/meteatamel/asyncapi-basics/main/samples/account-service-cloudevents/cloudevents-v1.0.1-asyncapi-traits.yaml'
Is that not what schemaFormat: 'application/cloudevents+json; version=0.2; charset=utf-8 is for? So standard values coming stright from cloudevents is not needed to be defined?
and of course allOf from JSON Schema is not a nice alternative, 100% not, especially that it is JSON Schema only
In the examples you linked they used traits + allOf which then confuses stuff, but to me that is what schemaFormat helps solve natively without us having to change the behaviour.

Lukasz Gornicki
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151571652589?thread_ts=1712148557.132589&cid=C0230UAM6R3)
every single schema for a message that follows CloudEvents, it must contain https://raw.githubusercontent.com/cloudevents/spec/v1.0.1/spec.json as these are standard fields that need to be “around” data field where actually data schema is defined
so schemaFormat: 'application/cloudevents+json; version=0.2; charset=utf-8 change nothing, and anyway, it is not a schema format
cloudevents is an envelope standard, so basically what additional technical fields your payload should contain when you send it - for example time stamp of event, its format and name of the filed that has it - to unify events structure across different systems

Fran Méndez
  [3 months ago](https://asyncapi.slack.com/archives/C0230UAM6R3/p1712151597129679?thread_ts=1712148557.132589&cid=C0230UAM6R3)
Also, schemaFormat: 'application/cloudevents+json; version=0.2; charset=utf-8 won't work because it defines the basics of these fields but, in many cases, you want to restrict the values of some fields to a specific set. For instance, you want the id to match the messageId property so you want to add const: 'yourId'. You can't do it if that's abstracted as a schemaFormat.

I started the conversation after reading https://atamel.dev/posts/2023/05-23_asyncapi_cloudevents/ and https://developers.redhat.com/articles/2021/06/02/simulating-cloudevents-asyncapi-and-microcks

black-snow commented 2 weeks ago

Thanks for sharing that thread. Very insightful.

As someone having that use case I absolutely agree there is a use case :D

It gets really messy once you produce or consume just half a handful of events. Right now my schema looks like

components:
  messages:
    WhatevsEvent:
      # ...
      payload:
        allOf:
          - $ref: "#/components/schemas/Envelope"          
          - $ref: "#/components/schemas/WhatevsEventEnvelopeOverrides"
          - type: object
            properties:
              data:
                $ref: "#/components/schemas/WhatevsEventPayload"
# ...
  schemas:
    Envelope:
      allOf:
        - $ref: 'https://raw.githubusercontent.com/cloudevents/spec/v1.0.2/cloudevents/formats/cloudevents.json'
        - $ref: "#/components/schemas/MyEnvelopeExtensions"

Where overrides contains constants for subject, for example. It works but it's not pretty. Perhaps there's an obvious way to make this less of a puzzle. I was hoping traits were the answer (or part of it at least).