OAI / Arazzo-Specification

The Arazzo Specification - A Tapestry for Deterministic API Workflows
https://spec.openapis.org/arazzo/latest.html
Apache License 2.0

Launch Implementer Version of the Specification #49

Closed frankkilcommins closed 5 months ago

frankkilcommins commented 1 year ago

This issue is to work towards delivery of an implementer version of the Workflows Specification that can be used by interested vendors (or community members) to build prototype tooling and provide feedback to the SIG on aspects of the specification.

How a bill becomes law?

We'll work with the OAS TDC to identify the steps needed to release and communicate on an initial implementer spec version. These discussions will also cover how the SIG Workflows group should function moving forward (perhaps following the IETF process/approach for working groups).

Planned timelines and activities

frankkilcommins commented 1 year ago

From Neal Caidin on ability to host the Workflows Specification under the OAI:

Hello all. I asked LF if our charter covered adopting a new spec, like the Workflow spec. I hope I asked the right question? Here is the answer - As long as the new specification relates to "providing technical metadata for APIs" then it would appear to fall within the scope of the OpenAPI project. Please note, however, that the license for Specifications is the Apache-2.0 license. We now have various alternatives, and if the community would like to understand what open specification licenses are available to them, please let us know.

Link to slack conversation: https://open-api.slack.com/archives/C1137F8HF/p1683808672092199

ralfhandl commented 11 months ago

Section Data Types: link to the Formats Registry, ideally per format.

darrelmiller commented 11 months ago

The info.version property is defined as the version of the "Description" which could be composed of multiple documents. In OpenAPI the info.version is defined as the version of the document. Was this difference intentional?

ralfhandl commented 11 months ago

Workflows Specification Object Example, step

  - stepId: getPetStep
    description: retrieve a pet by status from the GET pets endpoint
    operationRef: https://petstore3.swagger.io/api/v3/openapi.json#/paths/users/~findbystatus~1{status}/get
    dependsOn: loginStep
    parameters:
      - name: status
        in: query
        value: 'available'
      - name: Authorization
        in: header
        value: $steps.loginUser.outputs.sessionToken

Assuming the sessionToken output of the previous step loginUser is a "raw" token, then the Authorization header is probably of the form

Authorization: Bearer <sessionToken>

How would such a concatenation of the prefix Bearer and an output value be expressed?

I'll look for it in the remainder of the document, but it would be nice if the example already told me 😄.

darrelmiller commented 11 months ago

The literals description includes string but the table does not. It would be good to clarify whether single and/or double quotes are supported for strings. https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#literals

darrelmiller commented 11 months ago

It appears that in the OpenAPI spec we did a particularly bad job of specifying how to use $statusCode in an expression. Most of the examples in workflows treat statusCode as a numeric value during comparison. However, there is an example here https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#fixed-fields-10 that treats it as a string. This might be fine but it set off my spidey senses.

darrelmiller commented 11 months ago

https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#fixed-fields-4

The successCriteria property is a list of Criterion Objects that all must be satisfied. It might be worthwhile providing some guidance as to when to use an AND (&&) operator in a single Criterion Object vs multiple Criterion Objects. e.g. Will a list of Criterion Objects "short circuit" if one fails?

darrelmiller commented 11 months ago

Is this

- condition: $statusCode == 200 

the same as

-  context: $statusCode
   condition: 200
   type: simple

? If so, what is the value of type simple?

darrelmiller commented 11 months ago

the retryAfter field in the Failure Action Object is described as "A non-negative decimal indicating the milliseconds delay". Is this really supposed to say Decimal? i.e. someone can define fractions of a millisecond? Is there a strong reason to deviate from the HTTP standard of using seconds for retryAfter?

darrelmiller commented 11 months ago

It would be valuable to have some examples and discussion around the onSuccess property. It would be good to move the Note about what happens when there are multiple success actions, up into the onSuccess property. It is the behavior of the array of success actions, not the success action itself that is being defined by the note.

https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#fixed-fields-6

ralfhandl commented 11 months ago

Section Step Object, field operationId

If more than one (non workflowsSpec type) source description is defined within a Workflows Description, then the operationId specified MUST be prefixed with the source name to avoid ambiguity or potential clashes.

How prefixed? Just prepend without separator character? An example would be useful.

Same with field workflowId.

ralfhandl commented 11 months ago

Section Step Object, field operationRef

A complete URI Template SHOULD be used.

The examples show a JSON Pointer with a fragment containing a path through the paths object, the URI template, ending at the operation object.

I find the examples convincing and am unable to deduce them from the field description.
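For illustration, here is a minimal Python sketch of how such an operationRef fragment could be derived from a path template and an HTTP method using RFC 6901 escaping (`~` becomes `~0`, `/` becomes `~1`). The helper name `operation_ref` and the sample path are assumptions made for this sketch, not something the specification defines, and percent-encoding of fragment characters such as `{` and `}` is left out.

```python
def operation_ref(document_url: str, path: str, method: str) -> str:
    """Build an operationRef-style URI with a JSON Pointer fragment.

    Hypothetical helper: each reference token is escaped per RFC 6901
    ('~' -> '~0' first, then '/' -> '~1'). Percent-encoding of the
    fragment is deliberately omitted to keep the sketch short.
    """
    def escape(token: str) -> str:
        return token.replace("~", "~0").replace("/", "~1")

    # Pointer walks: paths object -> path template -> operation (method)
    tokens = ["paths", path, method.lower()]
    pointer = "/" + "/".join(escape(token) for token in tokens)
    return f"{document_url}#{pointer}"


ref = operation_ref(
    "https://petstore3.swagger.io/api/v3/openapi.json",
    "/pet/findByStatus",  # sample path, assumed for illustration
    "GET",
)
print(ref)
```

The order of the two replacements matters: escaping `/` before `~` would turn the freshly produced `~1` into `~01`.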

ralfhandl commented 11 months ago

Section Workflow Object, field inputs:

A JSON Schema 2020-12 object representing the input parameters used by this workflow.

Are there any cases where this JSON Schema object wouldn't say type: object? What do I have to expect/accept as a tool implementor?

How would inputs be referenced if that schema isn't of type: object? The Runtime Expressions only allow

"$inputs." name

which seems to require type: object.

frankkilcommins commented 11 months ago

Section Data Types: link to the Formats Registry, ideally per format.

Thanks @ralfhandl - I've registered #96 for this.

frankkilcommins commented 11 months ago

defined as the version of the "Description"

Thanks @darrelmiller - this is being fixed via https://github.com/OAI/sig-workflows/pull/98

frankkilcommins commented 11 months ago

Workflows Specification Object Example, step

  - stepId: getPetStep
    description: retrieve a pet by status from the GET pets endpoint
    operationRef: https://petstore3.swagger.io/api/v3/openapi.json#/paths/users/~findbystatus~1{status}/get
    dependsOn: loginStep
    parameters:
      - name: status
        in: query
        value: 'available'
      - name: Authorization
        in: header
        value: $steps.loginUser.outputs.sessionToken

Assuming the sessionToken output of the previous step loginUser is a "raw" token, then the Authorization header is probably of the form

Authorization: Bearer <sessionToken>

How would such a concatenation of the prefix Bearer and an output value be expressed?

I'll look for it in the remainder of the document, but it would be nice if the example already told me 😄.

String function representations like concatenation are not currently supported. If we deem it worthwhile, then I would propose that we handle it in the same manner as a Criterion.condition, in so far as literals can be combined with runtime expressions, and also leverage an extended set of operators. Alternatively, we'd need to introduce a limited set of functions (this is a thread that could unravel if we pull at it).

I'd tend to treat this as an enhancement for a future minor rev.

frankkilcommins commented 11 months ago

It appears that in the OpenAPI spec we did a particularly bad job of specifying how to use $statusCode in an expression. Most of the examples in workflows treat statusCode as a numeric value during comparison. However, there is an example here https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#fixed-fields-10 that treats it as a string. This might be fine but it set off my spidey senses.

I don't see it as a big issue, but perhaps we could provide a little more clarity so that implementors take appropriate parsing responsibility before evaluating such a condition.

frankkilcommins commented 11 months ago

The literals description includes string but the table does not. It would be good to clarify whether single and/or double quotes are supported for strings. https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#literals

@darrelmiller thanks - nice spot. Addressed (whoops) in #99

frankkilcommins commented 11 months ago

the retryAfter field in the Failure Action Object is described as "A non-negative decimal indicating the milliseconds delay". Is this really supposed to say Decimal? i.e. someone can define fractions of a millisecond? Is there a strong reason to deviate from the HTTP standard of using seconds for retryAfter?

Thanks @darrelmiller - Issue #100 created for this. I've no idea how I deviated here. I will keep the decimal description unless you think the Retry-After header should also be adjusted: "A delay-seconds value is a non-negative decimal integer, representing time in seconds"

frankkilcommins commented 11 months ago

https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#fixed-fields-4

The successCriteria property is a list of Criterion Objects that all must be satisfied. It might be worthwhile providing some guidance as to when to use an AND (&&) operator in a single Criterion Object vs multiple Criterion Objects. e.g. Will a list of Criterion Objects "short circuit" if one fails?

I think it's self-explanatory but I'm happy to add an example if needed. However, I'm not sure how prescriptive I want to be here for authors. The general rule of thumb is to use multiple Criterion Objects when you have assertions which cannot easily fit into a single Criterion Object (e.g. you may need to check a status code as well as apply some JSONPath expressions to check response body information).

Yes, if one criterion fails, then the step is deemed unsuccessful. That may trigger other actions depending on the presence of the onFailure fixed field.
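As a rough sketch of the "all must be satisfied" behaviour, the Python below treats each criterion as a hypothetical predicate over a response context and stops at the first failure. The names `step_succeeded`, `status_is_200` and `body_present` are assumptions for this example; the specification text quoted here does not mandate short-circuiting, and the overall result is the same whether or not an engine evaluates the remaining criteria.

```python
from typing import Callable, Iterable

# Hypothetical model: a criterion is a predicate over a response context.
Criterion = Callable[[dict], bool]


def step_succeeded(criteria: Iterable[Criterion], context: dict) -> bool:
    # all() over a generator stops at the first criterion that returns False.
    return all(criterion(context) for criterion in criteria)


evaluated = []  # records which criteria were actually evaluated


def status_is_200(ctx: dict) -> bool:
    evaluated.append("status")
    return ctx["statusCode"] == 200


def body_present(ctx: dict) -> bool:
    evaluated.append("body")
    return ctx.get("body") is not None


result = step_succeeded(
    [status_is_200, body_present],
    {"statusCode": 500, "body": None},
)
print(result, evaluated)
```

With a 500 response the first criterion fails, so the step is unsuccessful and the body check never runs.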

frankkilcommins commented 11 months ago

Is this

- condition: $statusCode == 200 

the same as

-  context: $statusCode
   condition: 200
   type: simple

? If so, what is the value of type simple?

@darrelmiller This made me smile, which is good!

Yes, they are effectively the same but somewhat unexpected usage. And the following is also the same:

- condition: $statusCode == 200 
  type: simple

'type' simple is to support those who want to be verbose in their expressiveness of the criterion. Most often, I would expect shorthand approaches to be leveraged for simple conditions.

Example:

- condition: $statusCode == 200 && $response.body != null

This would not be very clear if represented as the following, IMHO, as $statusCode is not the only context being evaluated:

-  context: $statusCode
   condition: 200 && $response.body != null
   type: simple

Open to suggestions as to how to harden/clarify.

frankkilcommins commented 11 months ago

Section Workflow Object, field inputs:

A JSON Schema 2020-12 object representing the input parameters used by this workflow.

Are there any cases where this JSON Schema object wouldn't say type: object? What do I have to expect/accept as a tool implementor?

How would inputs be referenced if that schema isn't of type: object? The Runtime Expressions only allow

"$inputs." name

which seems to require type: object.

@ralfhandl Thanks for raising. Yes, I would recommend that we restrict what we expect here. My expectation is type: object.

handrews commented 11 months ago

As @ralfhandl noted:

Section Workflow Object, field inputs:

A JSON Schema 2020-12 object representing the input parameters used by this workflow.

Is the support requirement strictly for JSON Schema 2020-12 rather than OAS 3.1's extensions to 2020-12? I do not care either way, I was just a bit surprised.

If it's not tracked elsewhere already (and my apologies if it is- GitHub search ignores the $ in $schema which makes it hard to search for in issues), you might want to add a note that while $schema is supported as a keyword, handling values other than the standard 2020-12 metaschema (or the OAS 3.1 dialect metaschema?) only MAY be supported. Or something like that.

ralfhandl commented 11 months ago

I would recommend that we restrict what we expect here. My expectation is type: object.

@frankkilcommins Maybe go one step further and make inputs a map of parameter names whose values then are JSON Schema objects.

That would also make it more symmetric with outputs.

ralfhandl commented 11 months ago

Section Parameter Object says

There are five possible locations specified by the in field:

followed by a list of six values.

If the step has a workflowId I assume that all parameters must have in: workflow. In which case it would be easier to just omit the in field and remove the value workflow from the list, bringing it back to five possible values.

ralfhandl commented 11 months ago

Section Parameter Object, fixed field target says

Can be useful for targeting specific request body part.

Seems that I MUST provide exactly one parameter that is in: body and does not have a target to have a body whose parts I can then modify with further parameters that have a target.

If target is present, MUST I then specify in: body? Or is there an unmentioned in value of bodyPart?

An example would be helpful.

ralfhandl commented 11 months ago

Sections Success Action Object and Failure Action Object, field stepId say

The referenced stepId SHOULD be within the current workflow.

If the step is in a different workflow, how do I provide values for the inputs of that other workflow?

Section Step Object, field stepId says

The id SHOULD be unique amongst all steps described in the workflow.

So I can have the same step id within one workflow, and I definitely can have the same step id within different workflows.

Which of these identically named steps is to be executed? All of them? The first? A randomly chosen one?

As an implementor I'd prefer to have a "MUST" in these three sentences to unambiguously identify the step to go to.

ralfhandl commented 11 months ago

The Table of Contents has multiple issues:

  1. "Criterion Object" is listed before "Reference Object", and the corresponding sections have the opposite order
  2. Workflows Specification - Version 1.0.0 looks odd, which may be caused by "Workflows Specification" being heading level 1 and "Version 1.0.0" being heading level 4 - where are the two intermediate levels?
  3. Same for Definitions - Workflows Description: "Definitions" is heading level 2, "Workflows Description" is heading level 5
  4. Sections "Security Considerations" and "IANA Considerations" are not listed
  5. Appendix B is not listed

ralfhandl commented 11 months ago

Section Reference Object, field $ref says

This MUST be in the form of a URI.

Which is only half of the truth: the fragment part - if present - apparently MUST be in the form of a JSON Pointer URI Fragment Identifier Representation.

Otherwise how is an implementation supposed to interpret the fragment part?

ralfhandl commented 11 months ago

Section Criterion Object says

String comparisons SHOULD be case insensitive.

Please make this a MUST for either sensitive or insensitive, don't care much which.

Different behavior across different workflow execution engines would be a nuisance.
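A small Python sketch of why the choice matters: the same comparison gives opposite results depending on which behaviour an engine picks. The function `strings_equal` and its `case_sensitive` flag are hypothetical and only model the two options discussed above.

```python
def strings_equal(left: str, right: str, case_sensitive: bool) -> bool:
    """Compare two strings under one of the two candidate rules."""
    if case_sensitive:
        return left == right
    # casefold() is Python's aggressive lowercasing for caseless matching
    return left.casefold() == right.casefold()


# Two hypothetical engines evaluating the same criterion
engine_a = strings_equal("Available", "available", case_sensitive=True)
engine_b = strings_equal("Available", "available", case_sensitive=False)
print(engine_a, engine_b)
```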

ralfhandl commented 11 months ago

Section Runtime Expressions, ABNF rules message-header-reference and message-payload-reference: are these leftovers from deleted text or "sneak previews" of things to come?

Either way I'd remove them from this specification version.

ralfhandl commented 11 months ago

Section IANA Considerations, vnd. prefix:

I see OAI in the standards category, not in the vendors category.

handrews commented 11 months ago

Here is some last-minute feedback. I apologize if I missed something obvious- my recent health problems have meant I've only been able to read-through quickly just now and not research past discussions as I normally would. If something I raise has already been debated and resolved, feel free to just say so- no need to re-explain or find links to the old discussions, if I want to find them I can dig it up myself.


The workflowsSpec field SHOULD be used by tooling to interpret the Workflows Description.

Using a SHOULD here leaves room for tools to do something arbitrarily different. Usually a SHOULD means that there are specific but relatively rare good reasons to disregard it. What are the use cases for disregarding this SHOULD? I am asking because we learned through JSON Schema that not locking down the bootstrapping process for interpreting a document leads to incredibly thorny interoperability problems.


A URL to a source description to be used by a Workflow. MUST be in the form of a URL. MAY be defined using relative references as defined by RFC3986.

Referencing WHATWG here is problematically inconsistent with OAS 3.x. If you must reference WHATWG (a decision with which I rather disagree, because that spec is insanely hard to interpret unless you're writing a parser exactly the way they want you to, although I understand perhaps I'm too late with this feedback), "URL" is not the correct term as it is defined in terms of in-memory structs. The correct term would be "URL String". I'm still doubtful that WHATWG's "URL String" and RFC 3986 are entirely compatible, and that strikes me as a problem given that RFC 3986 is used everywhere else in the OpenAPI world. And you're even mixing WHATWG's "URL" definition with RFC 3986's relative reference resolution process, which is not necessarily identical to WHATWG's relative resolution process (I haven't really checked that part, though).

Since WHATWG's URL spec is a "Living Specification", this requirement may effectively change without notice, further introducing interoperability problems. "Living specifications" are a radically different model than how OpenAPI specs work.

If the desire here is purely to ensure that the URL is resolvable in accordance with its scheme, you can just say that along with referencing RFC 3986. I strongly believe that OpenAPI as a whole, across all of its specifications, should consistently use either IETF RFCs 3986/3987/6874 or WHATWG, and not mix the two. Preferably the IETF RFCs as that is what is already established.

With most URI/URL-related libraries you're lucky if you can find one that properly conforms to any specification. Finding one that will conform to WHATWG for some things and RFC 3986 for others is almost certainly impossible. Except perhaps by coincidence.

Also, you probably want an "and" between "URL," and "MAY be defined...".


regarding stepId:

Unique string to represent the step. The id SHOULD be unique amongst all steps described in the workflow. The stepId value is case-sensitive. Tools and libraries MAY use the stepId to uniquely identify a workflow step, therefore, it is RECOMMENDED to follow common programming naming conventions. SHOULD conform to the regular expression [A-Za-z0-9_-]+.

Is there any guidance that can be given around what happens when these SHOULDs are disregarded? Is arbitrary behavior allowed (that is what the current wording implies)? Is there any expectation about raising an error on duplicates? Is the SHOULD-ness of the regular expression to allow for names correlating to programming languages with different naming requirements?

A particularly tricky case is when the description is made of multiple documents, a stepId is referenced in a non-entry document, and both that non-entry document and the entry document contain a step with that stepId. This can get even more complex if the stepId also appears in additional documents within the description.
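To make the question concrete, here is a minimal Python sketch of a check a tool might run over the steps of a single workflow: it flags stepId values that do not match the quoted regular expression and values that occur more than once. The function name `check_step_ids` and the decision to report problems rather than raise are assumptions of this sketch; the specification text does not say what a tool should do in either case.

```python
import re
from collections import Counter

# Regular expression quoted from the stepId field description
STEP_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def check_step_ids(step_ids: list[str]) -> list[str]:
    """Return human-readable problems found in one workflow's stepIds."""
    problems = []
    for step_id in step_ids:
        # fullmatch: the whole stepId must consist of allowed characters
        if not STEP_ID_PATTERN.fullmatch(step_id):
            problems.append(f"invalid stepId: {step_id!r}")
    for step_id, count in Counter(step_ids).items():
        if count > 1:
            problems.append(f"duplicate stepId: {step_id!r} ({count} occurrences)")
    return problems


print(check_step_ids(["loginStep", "getPetStep"]))
print(check_step_ids(["loginStep", "loginStep", "get pet"]))
```

The multi-document case described above would need the same check applied across every document in the description, plus a rule for which match wins, which is exactly what the current wording leaves open.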


parameters:

A list of parameters to pass to an operation or workflow as referenced by operationId, operationRef, or workflowId. If a Reference Object is provided, it MUST link to parameters defined in components/parameters.

Would that be the Components Object in the entry document or the current document?


operationRef:

  1. Why is this a URI Template? Why is it different from OAS templating? I philosophically approve of using standard URI Templates, but you're using them with a Parameter Object including style which is designed for OAS templating (the Parameter Object explicitly references OAS templating, and makes no mention of RFC 6570 at all). And it is unclear how other features of URI Templates are intended to interact with the Parameter Object. I realize this was probably discussed somewhere in this repository, but it's surprising enough that I think it needs addressing in the specification. I would not understand how to implement this as it is.

  2. Why is this a template at all? It looks like the Parameter Object is otherwise intended to be used with the target Operation's URL template. Is it serving two purposes here, and if so, why? What happens if the same parameter is present in both the operationRef template and in the Paths template for the target Operation (in both the case where the two parameter definitions are compatible, and in the case where they are not?)


"Component Object" should be "Components Object" both for consistency with OAS 3.x and for consistency with the components keyword.


Reference Object value:

A value by default SHOULD override that of the referenced component. If the referenced object-type does not have a value field, then it has no effect.

What are the circumstances under which this SHOULD could reasonably be disregarded? The current language effectively allows implementations to arbitrarily disregard value, which would lead to interoperability problems. And dilute what it means to comply with the spec.


Media types: I think I mentioned this somewhere else, but I can't find it or any reply right now, so I'll risk repeating myself, with apologies:

Didn't we decide to avoid the vnd. tree when defining media types for OAS? Why not follow what the main OAS spec is doing here? This introduces a surprising inconsistency.

handrews commented 11 months ago

@ralfhandl

Which is only half of the truth: the fragment part - if present - apparently MUST be in the form of a JSON Pointer URI Fragment Identifier Representation.

Unless the reference points to a JSON Schema in which case plain name fragments are also supported. The validity of a given fragment syntax is determined entirely by the media type of the target representation, so it shouldn't need to be further specified (although an informative reminder probably wouldn't hurt).

ralfhandl commented 11 months ago

@handrews

Unless the reference points to a JSON Schema

The Reference Object section explicitly limits reference objects to pointers into/within Workflows Description:

A simple object to allow referencing other components in the Workflows Description.

Workflows descriptions have their own specific media type, so I think the specification of Workflows Description should explicitly define syntax and interpretation of fragments for these descriptions.

handrews commented 11 months ago

@ralfhandl thanks for catching that! However, the inputs field takes a JSON Schema draft 2020-12 object inline, with no restrictions documented, which would allow for plain name fragments within a Workflows document.

And yes, the media type registration must define fragment syntax and semantics in order for them to work (technically... we know it's quite common to disregard such things, but if we're registering media types we might as well do it right).

frankkilcommins commented 10 months ago

Section Parameter Object, fixed field target says

Can be useful for targeting specific request body part.

Seems that I MUST provide exactly one parameter that is in: body and does not have a target to have a body whose parts I can then modify with further parameters that have a target.

@ralfhandl - I tend to agree with you here. We should perhaps make that explicit in the wording (or wording + examples).

If target is present, MUST I then specify in: body? Or is there an unmentioned in value of bodyPart?

An example would be helpful.

In the spirit it was written, in: body was expected in such scenarios. There is no bodyPart value. The expectation would be that you'd target specific locations should you wish.

There are some examples (which could be improved) at:

frankkilcommins commented 10 months ago

Sections Success Action Object and Failure Action Object, field stepId say

The referenced stepId SHOULD be within the current workflow.

If the step is in a different workflow, how do I provide values for the inputs of that other workflow?

Section Step Object, field stepId says

The id SHOULD be unique amongst all steps described in the workflow.

So I can have the same step id within one workflow, and I definitely can have the same step id within different workflows.

Which of these identically named steps is to be executed? All of them? The first? A randomly chosen one?

As an implementor I'd prefer to have a "MUST" in these three sentences to unambiguously identify the step to go to.

@ralfhandl #105 created to address the most pressing part of this feedback

Section Step Object, field stepId says

The id SHOULD be unique amongst all steps described in the workflow.

Workflow prefixing can be used to enable the ability to locate the desired step should a situation arise where there is a need to have a step with the same id within two described workflows. I'm OK living with this one.

frankkilcommins commented 10 months ago

The Table of Contents has multiple issues:

  1. "Criterion Object" is listed before "Reference Object", and the corresponding sections have the opposite order
  2. Workflows Specification - Version 1.0.0 looks odd, which may be caused by "Workflows Specification" being heading level 1 and "Version 1.0.0" being heading level 4 - where are the two intermediate levels?
  3. Same for Definitions - Workflows Description: "Definitions" is heading level 2, "Workflows Description" is heading level 5
  4. Sections "Security Considerations" and "IANA Considerations" are not listed
  5. Appendix B is not listed

@ralfhandl #106 created to address this feedback

frankkilcommins commented 10 months ago

Section Reference Object, field $ref says

This MUST be in the form of a URI.

Which is only half of the truth: the fragment part - if present - apparently MUST be in the form of a JSON Pointer URI Fragment Identifier Representation.

Otherwise how is an implementation supposed to interpret the fragment part?

@ralfhandl I'll look to update the media type registrations to specify JSON Pointer as the fragment identifier mechanism, and also add some additional text to the section to clarify.

#107 created for this feedback

frankkilcommins commented 10 months ago

Section Criterion Object says

String comparisons SHOULD be case insensitive.

Please make this a MUST for either sensitive or insensitive, don't care much which.

Different behavior across different workflow execution engines would be a nuisance.

@ralfhandl #108 created for this feedback

frankkilcommins commented 10 months ago

Section Runtime Expressions, ABNF rules message-header-reference and message-payload-reference: are these leftovers from deleted text or "sneak previews" of things to come?

Either way I'd remove them from this specification version.

@ralfhandl a sneak preview indeed. Created #109 to tidy up

frankkilcommins commented 10 months ago

Section IANA Considerations, vnd. prefix:

  • This prefix is for vendors, whereas standards don't require a prefix.

I see OAI in the standards category, not in the vendors category.

@darrelmiller any comment here? I was following along from the old comment on media types for OAS but I tend to agree with @ralfhandl that we should not be flagging these as vendor specific.

I'm happy to submit a change request on the registrations based on your feedback.

handrews commented 10 months ago

@frankkilcommins that comment about vnd. media types long predates the more recent OAS media type draft RFC.

frankkilcommins commented 9 months ago

It would be valuable to have some examples and discussion around the onSuccess property. It would be good to move the Note about what happens when there are multiple success actions, up into the onSuccess property. It is the behavior of the array of success actions, not the success action itself that is being defined by the note.

https://github.com/OAI/sig-workflows/blob/main/versions/1.0.0.md#fixed-fields-6

@darrelmiller #118 created to move the notes up to the Step onSuccess and onFailure fixed fields.

frankkilcommins commented 9 months ago

Section Parameter Object says

There are five possible locations specified by the in field:

followed by a list of six values.

If the step has a workflowId I assume that all parameters must have in: workflow. In which case it would be easier to just omit the in field and remove the value workflow from the list, bringing it back to five possible values.

@ralfhandl Issue #119 created to review/harden based on this feedback

frankkilcommins commented 9 months ago

Section IANA Considerations, vnd. prefix:

  • This prefix is for vendors, whereas standards don't require a prefix.

I see OAI in the standards category, not in the vendors category.

@ralfhandl just closing off this issue comment. Please see here for current status/context.

frankkilcommins commented 9 months ago

Section Step Object, field operationId

If more than one (non workflowsSpec type) source description is defined within a Workflows Description, then the operationId specified MUST be prefixed with the source name to avoid ambiguity or potential clashes.

How prefixed? Just prepend without separator character? An example would be useful.

Same with field workflowId.

@ralfhandl issue #124 created to address this feedback