OAI / OpenAPI-Specification


Support application/json-seq and similar JSON-based sequential formats #3730

Open handrews opened 2 months ago

handrews commented 2 months ago

Splitting this out of issue #1576...

...which was originally about binary streaming but later drifted into a discussion of JSON streaming. Several real-world uses for JSON streaming were cited:

The binary streaming case just required a clarification and has been addressed in PR #3729 for 3.0.4, but adding proper support for a new media type will have to go into 3.2.0.

Also discussed in https://github.com/OAI/OpenAPI-Specification/discussions/2707

Originally posted by **Skeeve** September 9, 2021

Is there a way to describe a [JSON Lines](https://jsonlines.org/) response with OpenAPI? Besides the fact that there seems to be no MIME type for it yet, I'm wondering whether it's possible to describe such a response. In theory my response could be an array of objects, but I was asked whether I could deliver it as JSON Lines instead, meaning just the objects, one per line.

Since I'm using OpenAPI to describe my API, I'm puzzled as to how to describe this response. I could simply define the response as being of type "string", but that is not very helpful for readers of my API spec.

P.S. I already asked on [Stack Overflow](https://stackoverflow.com/questions/69123769/openapi-how-to-define-json-lines-response) and was pointed here.
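
To make the question concrete, here is a minimal sketch of the two wire forms being compared: the same list of objects serialized once as a JSON array and once as JSON Lines. The sample `records` and the helper names are assumptions made up for illustration; nothing here is taken from the poster's API.

```python
import json

# Hypothetical sample records; any list of JSON-serializable objects works.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]


def to_json_array(items):
    """Serialize the items as a single JSON array document."""
    return json.dumps(items)


def to_json_lines(items):
    """Serialize the items as JSON Lines: one compact JSON value per line."""
    # json.dumps without `indent` never emits a raw newline, so each value
    # stays on exactly one line.
    return "".join(json.dumps(item) + "\n" for item in items)


def from_json_lines(text):
    """Parse JSON Lines back into a list, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


if __name__ == "__main__":
    print(to_json_array(records))
    print(to_json_lines(records), end="")
```

Both forms carry the same list of objects; only the framing differs, which is why describing the response as a plain "string" loses information that an array-of-objects description would keep.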

Further thoughts

There are now several formats for JSON streaming

It should be clear from the lists of implementations, requests from several different folks in both issues and discussions, and the existence of significant derivative specs (GeoJSON is widely used within the geospatial data space) that this is a real use case with many applications.

AFAICT, the distinctions among the three formats are irrelevant to modeling their contents: they only involve the choice of delimiter, whether blank lines/sequences are allowed (they are skipped during parsing regardless of the format), and the details of error handling. JSON Lines and NDJSON might actually merge.
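
A rough sketch of that point, assuming the formats in question are `application/json-seq` (where RFC 7464 prefixes each text with the ASCII record separator, 0x1E) and the newline-delimited JSON Lines/NDJSON family. The function name `parse_json_sequence` is hypothetical, and the per-format error handling (for example, detecting truncated texts in json-seq) is deliberately left out:

```python
import json

# ASCII record separator that prefixes each text in application/json-seq.
RS = "\x1e"


def parse_json_sequence(text, delimiter):
    """Split on the delimiter, skip empty chunks, and parse the rest as JSON."""
    items = []
    for chunk in text.split(delimiter):
        chunk = chunk.strip()
        if chunk:  # blank lines / empty sequences are skipped
            items.append(json.loads(chunk))
    return items


# The same two objects framed in two different ways.
json_lines_body = '{"n": 1}\n\n{"n": 2}\n'
json_seq_body = RS + '{"n": 1}\n' + RS + '{"n": 2}\n'

if __name__ == "__main__":
    print(parse_json_sequence(json_lines_body, "\n"))
    print(parse_json_sequence(json_seq_body, RS))
```

Once the framing is stripped away, both bodies yield the same list of values, so the delimiter is the only parameter this sketch needs to vary.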

From a data modeling perspective, I think we could support these in OAS 3.2 by noting that:

Tools that directly integrate and use JSON Schema implementations would need to handle the translation, but that's the main tooling impact. We could write the requirements around any sequential JSON format rather than tying it to any specific media type, since there seem to be multiple more-or-less equivalent approaches (I'm probably missing some).
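
As one possible reading of "handle the translation" (an assumption on my part, not proposed spec text), a tool could either buffer the sequence into a list and validate it as a single array, or validate each item as it arrives. The sketch below assumes a newline-delimited body; `require_id` and `require_all_ids` are hypothetical stand-ins for calls into a real JSON Schema implementation.

```python
import json
from typing import Callable, Iterable, Iterator, List


def iter_items(lines: Iterable[str]) -> Iterator[object]:
    """Lazily parse one JSON value per non-blank line."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def validate_buffered(
    lines: Iterable[str], validate_array: Callable[[List[object]], None]
) -> List[object]:
    """Translate the whole sequence into a list and validate it as one array."""
    items = list(iter_items(lines))
    validate_array(items)
    return items


def validate_streaming(
    lines: Iterable[str], validate_item: Callable[[object], None]
) -> Iterator[object]:
    """Validate each item as it arrives, without buffering the sequence."""
    for item in iter_items(lines):
        validate_item(item)
        yield item


# Hypothetical stand-ins for a real JSON Schema validator.
def require_id(item: object) -> None:
    if not isinstance(item, dict) or "id" not in item:
        raise ValueError(f"item lacks 'id': {item!r}")


def require_all_ids(items: List[object]) -> None:
    for item in items:
        require_id(item)


if __name__ == "__main__":
    body = ['{"id": 1}\n', "\n", '{"id": 2}\n']
    print(validate_buffered(body, require_all_ids))
    print(list(validate_streaming(body, require_id)))
```

The buffered variant is simpler to wire into an existing array validator, while the streaming variant avoids holding an unbounded response in memory.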

NickG-NZ commented 1 month ago

Cohere (LLM provider) is another example of an API streaming JSON objects:

https://docs.cohere.com/docs/streaming