InfoSecInnovations / concierge

Repo for Concierge AI dev work
Apache License 2.0

Slow response to first prompt or after a while of inactivity #17

Open sebovzeoueb opened 8 months ago

sebovzeoueb commented 8 months ago

I'm posting this here because I'm not sure if other people are experiencing this.

On first launch, or after a period of inactivity (maybe around 10 minutes), document retrieval in the prompter is as fast as usual, but formulating the chat response takes up to 10 minutes (so it's the Ollama part that is slow, not the Milvus part). Once it's "warmed up", subsequent responses are very fast.
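For anyone who wants to check this on their own machine, here is a rough sketch that calls Ollama directly, bypassing Concierge, and reports how much of the response time was spent loading the model. It assumes Ollama is listening on its default local address and relies on the `load_duration` and `total_duration` fields (in nanoseconds) that Ollama returns from `/api/generate` when streaming is off. The helper names `summarize_timings` and `timed_generate` are made up for illustration and are not part of Concierge.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def summarize_timings(response: dict) -> dict:
    """Convert Ollama's nanosecond duration fields to seconds."""
    ns_per_s = 1e9
    load = response.get("load_duration", 0) / ns_per_s
    total = response.get("total_duration", 0) / ns_per_s
    # "other_s" is everything that is not model loading
    # (prompt evaluation plus token generation).
    return {"load_s": load, "total_s": total, "other_s": total - load}


def timed_generate(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming generate request and return its timing summary."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return summarize_timings(json.loads(reply.read()))
```

Calling `timed_generate("<your model>", "Hello")` once after the idle period and then again straight away should show whether `load_s` accounts for most of the first call. If it does, the delay is probably the model being loaded back into memory rather than the generation itself, but that is only a guess until someone measures it.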

Please react with 👍 if you're experiencing this issue and 👎 if response times are fine for you. I want to gauge if this is an issue purely on my end or if it's common.