The inference API doesn't expose some of the query parameters that the ML trained models APIs provide for managing asynchronous tasks. It would be helpful for users if the inference APIs returned a task ID and/or accepted wait_for_completion=true|false.
Here's an example API:
PUT /_inference/sparse_embedding/my-elser-model?wait_for_completion=false
Without this asynchronous functionality, the requests tend to time out.
We'll need to create a parent task that is associated with the download task and the deployment task.
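A minimal sketch of how the proposed flow could look, following the task-management pattern used elsewhere in Elasticsearch. The response shape and the task ID value here are assumptions for illustration, not an existing contract:

```
PUT /_inference/sparse_embedding/my-elser-model?wait_for_completion=false
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

# Hypothetical response: the request returns immediately with the ID of the
# parent task that wraps the model download and deployment child tasks
{
  "task": "oTUltX4IQMOUUVeiohTt8A:12345"
}

# The caller could then poll (or block on) the tasks API until the parent
# task, and therefore both child tasks, completes
GET /_tasks/oTUltX4IQMOUUVeiohTt8A:12345?wait_for_completion=true
```

Returning a task ID rather than holding the connection open lets clients choose between blocking and polling, which is the same trade-off the trained models APIs already expose.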