OAI / OpenAPI-Specification

The OpenAPI Specification Repository
https://openapis.org
Apache License 2.0

Formats for arrays #3572

Open HenryGessau opened 9 months ago

HenryGessau commented 9 months ago

Across the various use cases for arrays, order is significant in some situations (for example, when the array functions as a vector, tuple, or series), while in others the array functions as a set and order is not significant. The three formats proposed here are intended to cover all variants of array use cases.

Proposed formats

The base type for these formats is array.

set

A collection of unique, unordered items. The position of an item in the array is not significant.

Order need not be preserved in serialization, deserialization, compression, and transmission.

This format implies uniqueItems: true and is incompatible with uniqueItems: false.

multiset

A collection of unordered items allowing for duplicates. The position of an item in the array is not significant.

Order need not be preserved in serialization, deserialization, compression, and transmission.

This format implies uniqueItems: false and is incompatible with uniqueItems: true.

sequence

An ordered collection of items. The order is defined by each item's index in the array, and not by comparing the items' values.

Order must be preserved in serialization, deserialization, compression, and transmission.

This format is compatible with any value for uniqueItems.

Alternative names for this format were considered, but were deemed less suitable:

Notes

These formats do not alter the definition of item uniqueness, which comes from the JSON Schema definition of instance equality. Note that uniqueItems does not consider semantic equivalence. For example, date-time equivalency is not considered here:

ExampleTimes:
  type: array
  uniqueItems: true
  items:
    type: string
    format: date-time
  example:
    - '1996-12-19T16:39:57-08:00'
    - '1996-12-20T00:39:57Z'
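
As a rough illustration of the point above, the following Python sketch compares the two example values first as plain strings (which is what uniqueItems sees) and then as parsed instants. The helper name `parse_rfc3339` is hypothetical and only handles the simple cases shown here.

```python
from datetime import datetime

a = "1996-12-19T16:39:57-08:00"
b = "1996-12-20T00:39:57Z"

# uniqueItems compares the JSON strings themselves, so these two differ.
strings_equal = a == b


def parse_rfc3339(value: str) -> datetime:
    # Hypothetical helper: swap a trailing "Z" for "+00:00" so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Parsed as timezone-aware instants, the two values are the same moment.
instants_equal = parse_rfc3339(a) == parse_rfc3339(b)
```

Here `strings_equal` is `False` while `instants_equal` is `True`, which is why the array above satisfies uniqueItems: true even though both entries denote the same time.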

The set and multiset formats are incompatible with conflicting uniqueItems values, as described above. Existing formats have similar incompatibilities. For example, string-based formats like date and email are incompatible with conflicting pattern values.
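
A minimal sketch of how a linter might detect such a conflict, assuming the schema is available as a plain dictionary. The names `IMPLIED_UNIQUE_ITEMS` and `unique_items_conflict` are made up for illustration and do not come from any existing tool.

```python
from typing import Optional

# Hypothetical table: the uniqueItems value each proposed format implies.
# None means the format places no constraint on uniqueItems.
IMPLIED_UNIQUE_ITEMS = {"set": True, "multiset": False, "sequence": None}


def unique_items_conflict(schema: dict) -> Optional[str]:
    """Return a message if format and uniqueItems conflict, else None."""
    fmt = schema.get("format")
    implied = IMPLIED_UNIQUE_ITEMS.get(fmt)
    declared = schema.get("uniqueItems")
    # No conflict when the format implies nothing, when uniqueItems is
    # omitted, or when the declared value agrees with the implied one.
    if implied is None or declared is None or declared == implied:
        return None
    return f"format {fmt!r} is incompatible with uniqueItems: {str(declared).lower()}"
```

For example, `unique_items_conflict({"type": "array", "format": "set", "uniqueItems": False})` returns a message, while the same schema with `"format": "sequence"` returns `None`.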

The multiset format is included for completeness. The data type it represents does not appear to be commonly used.

The sequence and set formats represent data types that are very common in real-world API use. For example, the Terraform types include:

  • list (or tuple): a sequence of values, like ["us-west-1a", "us-west-1c"]. Identify elements in a list with consecutive whole numbers, starting with zero.
  • set: a collection of unique values that do not have any secondary identifiers or ordering.

Having the proposed formats enables Terraform providers to correctly make use of the above Terraform types.
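
To make that concrete, here is a hedged sketch of how a code generator for a Terraform provider could choose a collection type from an array schema. The function name is hypothetical, and treating multiset and unannotated arrays as list is an assumption of this sketch, not something the proposal specifies.

```python
def terraform_collection_type(schema: dict) -> str:
    """Pick a Terraform collection type name for an OpenAPI array schema."""
    if schema.get("type") != "array":
        raise ValueError("expected an array schema")
    if schema.get("format") == "set":
        return "set"
    # Assumption: sequence, multiset, and arrays without a format all map
    # to list, since a list keeps both order and duplicates.
    return "list"
```

Without the format, a generator has to guess between list and set from uniqueItems alone, which says nothing about whether order matters.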

Examples

set

Album:
  properties:
    name:
      type: string
    genres:
      type: array
      format: set
      items:
        type: string
      example: [jazz, rock]  # equivalent to [rock, jazz]

multiset

Survey:
  properties:
    question:
      type: string
      example: How often do you exercise?
    collected_responses:
      type: array
      format: multiset
      items:
        type: string
      example:
      - "Daily"
      - "Once or twice a week"
      - "Daily"
      - "Every month"
      example2:  # equivalent to the above example
      - "Daily"
      - "Daily"
      - "Every month"
      - "Once or twice a week"

sequence

NetworkConfiguration:
  properties:
    dns_servers:
      type: array
      uniqueItems: true
      format: sequence  # Order is significant. The first server will be queried first.
                        # The next server will be used only when the first server fails.
      items:
        type: string
        format: ipv4
      example:
      - 192.168.0.3
      - 192.168.0.2
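
The equivalences noted in the examples above could be checked along these lines. This is only a sketch: `arrays_equivalent` is a hypothetical helper, and it uses canonical JSON text as a stand-in for JSON Schema instance equality.

```python
import json
from collections import Counter


def _key(item) -> str:
    # Canonical JSON text serves as a hashable stand-in for instance equality.
    # Simplification: 1 and 1.0 get different keys here, although JSON Schema
    # treats them as equal.
    return json.dumps(item, sort_keys=True)


def arrays_equivalent(a: list, b: list, fmt: str) -> bool:
    """Compare two array instances under one of the proposed formats."""
    if fmt == "sequence":
        # Order is significant: compare position by position.
        return [_key(x) for x in a] == [_key(x) for x in b]
    if fmt == "multiset":
        # Order is ignored, but the number of occurrences matters.
        return Counter(map(_key, a)) == Counter(map(_key, b))
    if fmt == "set":
        # Order is ignored; valid instances contain no duplicates.
        return set(map(_key, a)) == set(map(_key, b))
    raise ValueError(f"unknown array format: {fmt!r}")
```

With this helper, `[jazz, rock]` and `[rock, jazz]` are equivalent as a set, the two survey response lists are equivalent as a multiset, and swapping the two DNS servers yields a different sequence.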

lornajane commented 8 months ago

Is the expectation that by adding these to the OpenAPI specification, all tools must implement all of them? That's quite a big impact, and the benefits beyond what we already have in arrays aren't clear to me (yet).

I'd be interested to see examples of tools that already support this type of thing and how that's working out before we try to make it official.

meem commented 8 months ago

@lornajane Per https://spec.openapis.org/registry/format/, tools are not required to implement formats in the registry.

nfroidure commented 8 months ago

This would be a nice way to solve that long-standing issue: https://github.com/OAI/OpenAPI-Specification/issues/883

🤞

hkosova commented 6 months ago

@HenryGessau I think your proposed sequence format is already covered by the prefixItems keyword available in OpenAPI 3.1.

Example from here:

# 2-element tuple; the 1st element is a number, the 2nd element is a string

type: array
prefixItems:

  # The 1st item
  - type: integer
    description: Description of the 1st item

  # The 2nd item
  - type: string
    description: Description of the 2nd item

  # Define the 3rd etc. items if needed
  # ...

# The total number of items in this tuple - in case it needs to be limited
minItems: 2
maxItems: 2
items: false   # can be omitted if `maxItems` is specified

LasneF commented 2 weeks ago

🤔 Isn't this more of a requirement for JSON Schema than for OAS?

handrews commented 2 weeks ago

@LasneF it has previously been on the TDC agenda to determine whether we should close JSON Schema-related requests (I am in favor of doing so), but we have never managed to make an actual decision on it.

It is actually part of our issue tracking the need to document issue closure criteria: