elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.34k stars 24.87k forks

[ML] Add retries on chunking queue full exceptions #117264

Open dan-rubinstein opened 1 day ago

dan-rubinstein commented 1 day ago

Description

When the input for a chunked inference request breaks into many chunks, the chunks can exceed the queue size limit on the ML node (set by the user, or 1000 by default). We previously implemented a fix that batches the chunks to avoid hitting this limit. That fix waits for each batch of chunks to complete before sending the next one, essentially adding a queuing mechanism on top of the existing queue. Long term, we'd like to replace this with a retry strategy on our calls to the queue that backs off when the queue size limit is hit and tries to push again after some period of time.
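
For illustration only, here is a rough sketch of what such a retry-with-backoff strategy could look like. It is written in Python against the standard library `queue.Queue` rather than the actual Elasticsearch code, and every name in it (`put_with_backoff`, `max_attempts`, `base_delay`, `max_delay`) is hypothetical. The attempt count and delays are placeholder values, not proposed settings.

```python
import queue
import time


def put_with_backoff(q, item, max_attempts=5, base_delay=0.05,
                     max_delay=1.0, sleep=time.sleep):
    """Try to enqueue `item`, backing off while the queue is full.

    Hypothetical helper: returns the number of attempts it took, or
    re-raises queue.Full once `max_attempts` is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            # Non-blocking push; raises queue.Full at the size limit.
            q.put_nowait(item)
            return attempt
        except queue.Full:
            if attempt == max_attempts:
                # Give up and surface the original error to the caller.
                raise
            # Wait, then double the delay up to the cap (exponential backoff).
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

With something like this, each chunk would be pushed independently and only the pushes that hit the limit would wait, instead of every batch waiting on the previous one. The `sleep` parameter is injectable only so the behaviour can be exercised without real delays. A real implementation would likely also want jitter and an overall timeout.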

elasticsearchmachine commented 19 hours ago

Pinging @elastic/ml-core (Team:ML)