Modelling `wait_for_completion`

swallez commented 2 years ago

The problem

We have 16 endpoints that have a wait_for_completion parameter that instructs the ES cluster to run potentially lengthy operations in a blocking fashion or to run them asynchronously as a background task.

The response structure of these endpoints varies heavily depending on the value of this parameter. The default value for this parameter depends on the endpoint (most often true, but not always).

When wait_for_completion=true, the request blocks until the processing is finished, and the "normal" response body is then returned. When it's false, the request processing is started in the background on the ES cluster and some information about the background process is returned. The background response can vary, e.g. {"acknowledged": true} (snapshot-restore) or a task id { "task": "oTUltX4"} for ml-delete-job

The current state of the API specification is not optimal when it comes to supporting this feature, as having a single response structure requires to merge both normal and background properties in a single type, and consequently make them all optional.

Furthermore, as the ES documentation is generally sparse about the background response format, it is often forgotten in the spec and the focus is put on normal response properties, including making them required, which can cause issues (see https://github.com/elastic/elasticsearch-java/issues/201).

Semantics of `wait_for_completion`

From a semantic perspective, endpoints with a wait_for_completion are endpoints that are "background-capable". We can consider them as the combination of a foreground endpoint that returns the normal (foreground) response, and a background endpoint that starts a task in the background and returns information about the background process.

Those two endpoints have the exact same request path, parameters and body, with the exception of wait_for_completion, and only differ by their response types.

Some client generators may decide to have a single endpoint whose response is the a union of both foreground and background response types, while some others may want to distinguish the two endpoints to avoid exposing a union whose active variant depends on the value set for wait_for_completion (that parameter would then be removed from the request structure and set internally by the endpoint functions).

For example in the case of the Java client, that second form would be used and the snapshot-restore endpoint would result in two functions:

restore(RestoreRequest): RestoreResponse (foreground, internally sets wait_for_completion=true)
restoreAsync(RestoreRequest): AcknowledgedResponse (background, internally sets wait_for_completion=false)

In TypeScript however, the first form with a single function would be used:

restore(RestoreRequest): RestoreResponse | AcknowledgedResponse

Proposal: the `Backgroundable` type

To correctly capture the two response structures and allow client generators to be able to generate two related endpoints, we suggest to introduce the Backgroundable type for the response body of background-capable endpoints:

type Backgroundable<Foreground, Background> = Foreground | Background

For example, the response of snapshot-restore is currently:

export class Response {
  body: { snapshot: SnapshotRestore }
}

It would become:

export class Response {
  body: Backgroundable<RestoreResponse, AckowledgedResponse>
}

export type RestoreResponse {
  snapshot: SnapshotRestore
}

wait_for_completion in a request and Backgroundable in a response should go hand in hand: one cannot be present without the other in an endpoint definition, and this can be enforced by model validation.

Of course naming is hard and better naming than Backgroundable is welcome 😉

swallez commented 2 years ago

To provide a good user experience in strongly typed languages and avoid the union when neither wait_for_completion nor wait_for_completion_timeout are set, we can consider a Backgroundable endpoint as two distinct endpoints:

an always blocking endpoint, when the wait_for_... parameter is set to true (it will be hard-coded and not be exposed in the generated code). Its result will be the Foreground part.
an async (or potentially async) endpoint when the wait_for_... parameter is present:
- for wait_for_completion, the parameter will be hard-coded to false and the result will be the Background part
- for wait_for_completion_timeout, the parameter will be required and the result will be the union type.

This approach allows users who use the blocking endpoint to not bother with the union (we can assume it to be the majority of the cases) and will require to handle the union only for wait_for_completion_timeout endpoints (6 occurrences).

lmenezes commented 4 months ago

@swallez Hey there 👋🏽

is fixing this still on the radar?

steffenvan commented 2 months ago

@swallez Hey there 👋🏽 is fixing this still on the radar?

+1

swallez commented 2 months ago

Yes it is, but it has a number of ramifications if we need to split that endpoint as some downstream systems (e.g. docs) will need a single endpoint definition while others will need the split (e.g. client code generators, but only for certain languages)

elastic / elasticsearch-specification