The inference API doesn't expose some of the query parameters that the ML trained models APIs provide for managing asynchronous tasks. It would be helpful for users if the inference APIs returned a task ID and/or accepted wait_for_completion=true|false.
Here's an example API:
PUT /_inference/sparse_embedding/my-elser-model?wait_for_completion=false
Without this asynchronous functionality, the requests tend to time out.
We'll need to create a parent task that is associated with the download task and the deployment task.
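A minimal sketch of how the proposed flow could look, following the task-management pattern used elsewhere in Elasticsearch. The response shape and the task ID value here are assumptions for illustration, not an existing contract:

```
PUT /_inference/sparse_embedding/my-elser-model?wait_for_completion=false
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

# Hypothetical response: the request returns immediately with the ID of the
# parent task that wraps the model download and deployment child tasks
{
  "task": "oTUltX4IQMOUUVeiohTt8A:12345"
}

# The caller could then poll (or block on) the tasks API until the parent
# task, and therefore both child tasks, completes
GET /_tasks/oTUltX4IQMOUUVeiohTt8A:12345?wait_for_completion=true
```

Returning a task ID rather than holding the connection open lets clients choose between blocking and polling, which is the same trade-off the trained models APIs already expose.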