Timeout error after approximately 10 seconds when running on macOS

m358807551 commented 4 months ago

Description

I am encountering a timeout error when running the following code on macOS. The error occurs approximately 10 seconds after the request is made. I would like to know if there is a way to configure or extend the timeout setting to handle longer response times.

Code to Reproduce

import traceback
import time
from ollama import chat

t1 = time.time()
try:
    # chat = Client(timeout=60).chat
    response = chat(model='llama3', messages=[
      {
        'role': 'user',
        'content': 'why the color of sky is blue?',
      },
    ])
    print(response['message']['content'])
except Exception as e:
    print("error", e)
    traceback.print_exc()

t2 = time.time()
print("cost ", t2-t1)

Error Message

error 
cost  10.008436918258667
Traceback (most recent call last):
  File "/var/folders/lz/9_t5yfyj6cg79jmhn26qrlcr0000gn/T/ipykernel_22684/1004213980.py", line 10, in <module>
    response = chat(model='llama3', messages=[
  File "/Users/myp/miniconda3/envs/py38/lib/python3.8/site-packages/ollama/_client.py", line 180, in chat
    return self._request_stream(
  File "/Users/myp/miniconda3/envs/py38/lib/python3.8/site-packages/ollama/_client.py", line 98, in _request_stream
    return self._stream(*args, **kwargs) if stream else self._request(*args, **kwargs).json()
  File "/Users/myp/miniconda3/envs/py38/lib/python3.8/site-packages/ollama/_client.py", line 74, in _request
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError
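The blank after `error` in the output above suggests that `str(e)` was empty, so `print("error", e)` hid the exception type. A minimal sketch of a helper that keeps the report informative in that case; `describe_exception` is a hypothetical name, and the `status_code` attribute is looked up defensively rather than assumed to exist:

```python
def describe_exception(exc):
    """Summarise an exception so the output is useful even when str(exc) is empty."""
    summary = type(exc).__name__
    message = str(exc)
    if message:
        summary += f': {message}'
    # Assumption: some client errors carry a status_code attribute; skip it if absent.
    status = getattr(exc, 'status_code', None)
    if status is not None:
        summary += f' (status_code={status})'
    return summary
```

In the reproduction script, `print("error", describe_exception(e))` would replace `print("error", e)`.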

ArnoldIOI commented 4 months ago

The only way to configure the timeout seems to be passing the timeout param when constructing the client. A workaround is to initialise a new Client with a timeout param and use that client to call the APIs.

Code to Reproduce

import traceback
import time
# from ollama import chat
from ollama import Client

t1 = time.time()
try:
    chat = Client(timeout=1).chat # new client with timeout 1s
    response = chat(model='gemma2:2b', messages=[
      {
        'role': 'user',
        'content': 'why the color of sky is blue?',
      },
    ])
    print(response['message']['content'])
except Exception as e:
    print("error", e)
    traceback.print_exc()

t2 = time.time()
print("cost ", t2-t1)

Output

httpx.ReadTimeout: timed out
cost  1.0301549434661865

ArnoldIOI commented 4 months ago

There’s an example in the README for Custom Client.

ollama / ollama-python
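For the original question (allowing longer response times), the same workaround can be applied with a larger value. This is a sketch only: `timed_call` and `ask` are hypothetical helper names, the 60-second value is arbitrary, and `Client(timeout=...)` is taken from the workaround earlier in this thread rather than verified against every version of the library.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, error, elapsed_seconds) without raising."""
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        return result, None, time.perf_counter() - start
    except Exception as exc:
        return None, exc, time.perf_counter() - start


def ask(prompt, model='llama3', timeout=60):
    """Send one chat message through a dedicated client with its own timeout."""
    # Imported here so timed_call stays usable without the ollama package installed.
    from ollama import Client

    client = Client(timeout=timeout)  # timeout in seconds, as in the workaround
    response = client.chat(model=model, messages=[
        {'role': 'user', 'content': prompt},
    ])
    return response['message']['content']
```

Calling `answer, error, elapsed = timed_call(ask, 'Why is the sky blue?')` then gives the reply or the exception, plus the elapsed time, so it is easy to see whether a failure coincides with the configured timeout.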