opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0

[FEATURE] Long polling support for async queries #2776

Open Swiddis opened 4 months ago

Swiddis commented 4 months ago

Is your feature request related to a problem? When submitting asynchronous queries, the current endpoints (submitting a query and getting the query status) return immediately, even when results are not yet available. This means the best available strategy for getting query results is to poll on some interval (possibly with exponential back-off), which adds polling latency and can generate many unnecessary requests.
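To make the trade-off concrete, here is a minimal sketch of the interval-polling-with-back-off pattern the issue describes. The `check_status` callable and the parameter values are hypothetical, not part of the plugin's API:

```python
import time

def poll_with_backoff(check_status, base_delay=0.5, max_delay=8.0, timeout=60.0):
    """Poll check_status() until it returns a non-None result,
    doubling the wait between attempts up to max_delay."""
    deadline = time.monotonic() + timeout
    delay = base_delay
    while time.monotonic() < deadline:
        result = check_status()
        if result is not None:
            return result
        # sleep for the back-off delay, but never past the overall deadline
        time.sleep(min(delay, max(0.0, deadline - time.monotonic())))
        delay = min(delay * 2, max_delay)
    raise TimeoutError("query did not complete in time")
```

Note the built-in latency: even when the result becomes available immediately after a check, the client waits out the current back-off delay before seeing it.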

What solution would you like? One alternative to regular polling is long polling: the client opens a long-lived connection, and the server holds the request open, responding only when results are ready or a server-side timeout elapses. When the timeout is reached, the client immediately reopens the connection, without needing a set interval. If the connection is otherwise terminated early (e.g. by I/O issues), the client can likewise reconnect. This has a few advantages: results arrive as soon as they are ready rather than at the next poll tick, and the server handles far fewer requests than with interval-based polling.
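The handshake above can be sketched as follows. This is an illustrative model, not the plugin's implementation; the handler, the `fetch` callable, and the response shape are all hypothetical:

```python
import threading

def long_poll_handler(result_ready, get_result, timeout=30.0):
    """Server side: hold the request open until the query result is ready
    or the server-side timeout elapses, then return a 'still running'
    response so the client can reconnect immediately."""
    if result_ready.wait(timeout):  # blocks up to `timeout` seconds
        return {"status": "SUCCESS", "result": get_result()}
    return {"status": "RUNNING"}

def long_poll_client(fetch):
    """Client side: re-issue the request as soon as the previous one
    returns RUNNING; no sleep interval is needed."""
    while True:
        response = fetch()  # blocks up to the server's timeout
        if response["status"] == "SUCCESS":
            return response["result"]
```

The key difference from interval polling is that the waiting happens on the server, so a result produced mid-wait is delivered at once instead of at the next poll tick.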

What alternatives have you considered? Another option for getting data from the async API quickly is Server-Sent Events. This involves a single persistent connection over which the server pushes updates to the client (similar to WebSocket, but unidirectional). This arguably would be a better approach, but I think its comparative implementation complexity wouldn't be worth the small improvement: clients are already using polling, and modifying the existing polling endpoint to have different response timings would be a lot less work than writing a new endpoint with a new protocol.
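For context on why SSE means "a new endpoint with a new protocol": SSE responses are a `text/event-stream` body made of line-oriented messages, which the server would have to serialize and the client parse. A minimal (illustrative, not plugin) serializer:

```python
def format_sse(data, event=None):
    """Serialize one Server-Sent Events message: an optional 'event:'
    line, one 'data:' line per line of payload, and a terminating
    blank line, per the text/event-stream wire format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    for chunk in data.splitlines() or [""]:
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"
```

By contrast, long polling reuses the existing request/response endpoint unchanged; only the response timing differs.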

WebSockets are also an option, but they generally pay off at higher messaging frequencies. One could imagine a WebSocket-based query session that sends and receives multiple queries concurrently, but that is likely out of scope here.

Do you have any additional context? There are multiple implementations of async polling across the front-end, all with different polling intervals and edge-case behaviors. We're currently working on centralizing this logic. One problem we found with forwarding the async queries was that the frontend would need to poll our endpoint while our server polls the async query backend, leading to doubled polling latency. Long polling is a solution on the table to reduce the latency on our end, and it seems generally applicable on the upstream API as well.
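A rough back-of-envelope for the "doubled polling latency" claim: if query completion times are uniform relative to the poll schedule, each polling hop adds on average half its interval of latency, and two stacked hops add the sum. The interval values below are hypothetical, purely for illustration:

```python
# Expected added latency per interval-polling hop is ~interval / 2,
# assuming completion times land uniformly within the poll interval.
frontend_interval = 2.0  # seconds, hypothetical
backend_interval = 2.0   # seconds, hypothetical

expected_added = frontend_interval / 2 + backend_interval / 2
# With long polling on the server-to-backend hop, the second term
# shrinks toward zero, leaving only the frontend's share.
```

So stacking two 2-second pollers adds about 2 seconds on average, versus about 1 second if the inner hop long-polls.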

dblock commented 3 months ago

[Catch All Triage - 1, 2]