Closed — SharanyaSarkar closed this issue 9 months ago
The code that you've shared is taking a long time (more than 3 minutes) to retrieve the results. How can I optimize the response time?
Most of the response time comes from running the LLM locally with Ollama. To make it fast, you need to run Ollama on a machine with a GPU or on an Apple Silicon (M1) processor.
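To confirm that the LLM call is the bottleneck, you can time a single generation against Ollama's local HTTP API. Here is a minimal sketch, assuming Ollama is running on its default endpoint and that a model such as `llama2` has been pulled (adjust the model name and prompt to your setup):

```python
# Minimal sketch: time one Ollama generation to see where the latency comes from.
# Assumes Ollama is serving on its default local endpoint (http://localhost:11434)
# and that the model named below has already been pulled.
import time
import requests

payload = {
    "model": "llama2",                  # placeholder: use whichever model you pulled
    "prompt": "Say hello in one sentence.",
    "stream": False,                    # return the full response as a single JSON object
}

start = time.time()
resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=600)
resp.raise_for_status()
elapsed = time.time() - start

print(f"LLM call took {elapsed:.1f}s")
print(resp.json().get("response", ""))
```

If this single call already takes minutes, the time is being spent in CPU inference, and running Ollama on a GPU machine or Apple Silicon (or switching to a smaller/quantized model) is what will cut the latency. You can also run `ollama ps` to check whether the loaded model is actually using the GPU.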