Feature Request: Adjustable response timeout per GenAI connector

cp-elastic commented 11 months ago

Describe the feature: The default GenAI response timeout appears to be around 60 seconds. This timeout should be adjustable per connector to account for varying models and responsiveness of the API. Ideally this would be an additional field to set during the connector configuration workflow in Kibana.

Describe a specific use case for the feature: In this specific case, I am self-hosting a large language model with an OpenAI-compatible API for development purposes on a bare-metal server with 24 CPU cores and 96 GB RAM. The model typically sends a response within 2 minutes, which is well beyond the default timeout of roughly 60 seconds. As adoption of LLMs and the capabilities of the AI Assistant expand, this will help organizations with privacy concerns that host their own LLMs on commodity hardware.
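
To illustrate the request, here is a minimal sketch of what a per-connector timeout could look like on the calling side. It is not Kibana's implementation: the `ConnectorConfig` shape, the `timeoutMs` field, and the helper names are hypothetical, and the 60-second default only mirrors the value observed above.

```typescript
// Hypothetical connector shape. `timeoutMs` is the proposed field,
// not an existing Kibana connector setting.
interface ConnectorConfig {
  name: string;
  apiUrl: string;
  timeoutMs?: number;
}

// Assumed default, based on the roughly 60 seconds observed in this issue.
const DEFAULT_TIMEOUT_MS = 60000;

// Use the connector's own timeout when it is a positive finite number,
// otherwise fall back to the default.
function resolveTimeoutMs(config: ConnectorConfig): number {
  const t = config.timeoutMs;
  return typeof t === "number" && Number.isFinite(t) && t > 0
    ? t
    : DEFAULT_TIMEOUT_MS;
}

// Send a JSON request and abort it if no response headers arrive
// within the connector's timeout. Reading the body is not covered
// by the timer in this sketch.
async function postWithTimeout(
  config: ConnectorConfig,
  body: unknown
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), resolveTimeoutMs(config));
  try {
    return await fetch(config.apiUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
  } finally {
    clearTimeout(timer);
  }
}
```

With this shape, a connector for a slow self-hosted model could set `timeoutMs: 180000` while hosted connectors keep the default.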

elasticmachine commented 11 months ago

Pinging @elastic/response-ops (Team:ResponseOps)

cp-elastic commented 9 months ago

I've tested the new changes in Kibana 8.11.0, and I'm running into fewer timeout errors now when using my local LLM. I'd still like to see this be configurable, if possible, so that users can adjust timeouts per OpenAI connector to account for different models and LLM settings.

securix-rog commented 4 months ago

I would really appreciate this feature for testing different local models and optimizing their performance!

Danouchka commented 1 month ago

+1

pathoge commented 3 weeks ago

+1

cnasikas commented 3 days ago

cc @dgieselaar @stephmilovic @mikecote

dgieselaar commented 3 days ago

@cnasikas I will defer to others; timeouts are not really an issue for the Observability AI Assistant because we use streaming.

elastic / kibana

Feature Request: Adjustable response timeout per GenAI connector #166561