I'm posting this here because I'm not sure if other people are experiencing this.
On first launch, or after a period of inactivity (roughly 10 minutes), document retrieval in the prompter is as fast as usual, but generating the chat response takes up to 10 minutes. So the slow part is Ollama, not Milvus. Once it's "warmed up", subsequent responses are very fast.
Please react with 👍 if you're experiencing this issue and 👎 if response times are fine for you. I want to gauge whether this is purely an issue on my end or whether it's common.