Closed khlee369 closed 2 weeks ago
@khlee369 Thanks for the bug report. We will investigate why this isn't working.
It seems the issue is related to the default setting of `DEFAULT_REQUEST_TIMEOUT = 30.0` in `llama_index.llms.ollama`. When loading a model with ollama and performing inference, if the process takes longer than 30 seconds, the awaited coroutine is cancelled and the request is terminated immediately.
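The failure mode can be illustrated without ollama: a coroutine awaited under a fixed time budget is cancelled as soon as that budget is exceeded, which surfaces as a `ReadTimeout` on the HTTP client side. A minimal sketch using `asyncio` (the durations and names here are illustrative, not the library's actual internals):

```python
import asyncio

async def fake_inference(duration: float) -> str:
    # Stands in for a slow ollama model-load + generation call.
    await asyncio.sleep(duration)
    return "response"

async def call_with_timeout(duration: float, timeout: float) -> str:
    # Mimics a request timeout: cancel the call once it exceeds the budget.
    try:
        return await asyncio.wait_for(fake_inference(duration), timeout=timeout)
    except asyncio.TimeoutError:
        return "ReadTimeout"

# A call that outlasts the timeout budget is terminated...
print(asyncio.run(call_with_timeout(duration=0.2, timeout=0.05)))  # ReadTimeout
# ...while a larger budget lets the same call complete.
print(asyncio.run(call_with_timeout(duration=0.2, timeout=1.0)))   # response
```

This is why raising `request_timeout` well above the model's load-plus-inference time makes the error disappear.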
To work around this, you can either modify `DEFAULT_REQUEST_TIMEOUT = 30.0` in `llama_index.llms.ollama.base` (e.g., change it to 3000.0), or set `request_timeout` in the config.yaml file under the generator node:
```yaml
modules:
  - module_type: llama_index_llm
    llm: ollama
    model: qwen2:72b
    temperature: 0.7
    request_timeout: 3000
    batch: 1
```
Closing this issue since this is not a code problem. You have to set `request_timeout` to a higher value.
I want to use a specific `model:tag` provided by ollama (Ref). However, if I use an ollama-specific tag as shown below, I get an error.
error message

```
[08/22/24 12:35:30] ERROR [__init__.py:73] >> Unexpected exception __init__.py:73
╭───────────────────────────── Traceback (most recent call last) ....
ReadTimeout
The above exception was the direct cause of the following exception:
╭───────────────────────────── Traceback (most recent call last) ──────────────────────────────╮
│ /root/workspace/corporate-llm/autorag/main.py:27 in
```

AutoRAG works when I use an untagged model.