zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks
https://privategpt.dev
Apache License 2.0

[BUG] TypeError: missing a required argument: 'messages' #2088

Open h14turbo opened 2 months ago

h14turbo commented 2 months ago

Description

When running the Docker instance of PrivateGPT with Ollama, I get an error saying: TypeError: missing a required argument: 'messages'

"Search" mode works, but any mode with the LLM called produces this error. I am using the normal gradio UI§§§ The full traceback is as follows:

private-gpt-ollama-1 | 18:00:31.961 [INFO ] uvicorn.access - 172.18.0.1:62074 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:00:31.980 [INFO ] uvicorn.access - 172.18.0.1:55394 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:00:31.982 [INFO ] uvicorn.access - 172.18.0.1:55394 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | Traceback (most recent call last):
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/queueing.py", line 536, in process_events
private-gpt-ollama-1 |     response = await route_utils.call_process_api(
private-gpt-ollama-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/route_utils.py", line 276, in call_process_api
private-gpt-ollama-1 |     output = await app.get_blocks().process_api(
private-gpt-ollama-1 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1923, in process_api
private-gpt-ollama-1 |     result = await self.call_function(
private-gpt-ollama-1 |              ^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1520, in call_function
private-gpt-ollama-1 |     prediction = await utils.async_iteration(iterator)
private-gpt-ollama-1 |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 663, in async_iteration
private-gpt-ollama-1 |     return await iterator.__anext__()
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 768, in asyncgen_wrapper
private-gpt-ollama-1 |     response = await iterator.__anext__()
private-gpt-ollama-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/chat_interface.py", line 652, in _stream_fn
private-gpt-ollama-1 |     first_response = await async_iteration(generator)
private-gpt-ollama-1 |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 663, in async_iteration
private-gpt-ollama-1 |     return await iterator.__anext__()
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 656, in __anext__
private-gpt-ollama-1 |     return await anyio.to_thread.run_sync(
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
private-gpt-ollama-1 |     return await get_async_backend().run_sync_in_worker_thread(
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
private-gpt-ollama-1 |     return await future
private-gpt-ollama-1 |            ^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
private-gpt-ollama-1 |     result = context.run(func, *args)
private-gpt-ollama-1 |              ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 639, in run_sync_iterator_async
private-gpt-ollama-1 |     return next(iterator)
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/private_gpt/ui/ui.py", line 185, in _chat
private-gpt-ollama-1 |     query_stream = self._chat_service.stream_chat(
private-gpt-ollama-1 |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/private_gpt/server/chat/chat_service.py", line 168, in stream_chat
private-gpt-ollama-1 |     streaming_response = chat_engine.stream_chat(
private-gpt-ollama-1 |                          ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 230, in wrapper
private-gpt-ollama-1 |     result = func(*args, **kwargs)
private-gpt-ollama-1 |              ^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/callbacks/utils.py", line 41, in wrapper
private-gpt-ollama-1 |     return func(self, *args, **kwargs)
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/chat_engine/context.py", line 210, in stream_chat
private-gpt-ollama-1 |     chat_stream=self._llm.stream_chat(all_messages),
private-gpt-ollama-1 |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/private_gpt/components/llm/llm_component.py", line 183, in wrapper
private-gpt-ollama-1 |     return func(*args, **kwargs)
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 221, in wrapper
private-gpt-ollama-1 |     bound_args = inspect.signature(func).bind(*args, **kwargs)
private-gpt-ollama-1 |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/usr/local/lib/python3.11/inspect.py", line 3212, in bind
private-gpt-ollama-1 |     return self._bind(args, kwargs)
private-gpt-ollama-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 |   File "/usr/local/lib/python3.11/inspect.py", line 3127, in _bind
private-gpt-ollama-1 |     raise TypeError(msg) from None
private-gpt-ollama-1 | TypeError: missing a required argument: 'messages'
private-gpt-ollama-1 | 18:01:15.212 [INFO ] uvicorn.access - 172.18.0.1:58670 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:54.097 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:54.099 [INFO ] uvicorn.access - 172.18.0.1:61960 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:54.122 [INFO ] private_gpt.ui.ui - Setting system prompt to:
private-gpt-ollama-1 | 18:03:55.953 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.971 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.972 [INFO ] uvicorn.access - 172.18.0.1:59612 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.987 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.989 [INFO ] uvicorn.access - 172.18.0.1:61960 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:56.960 [INFO ] uvicorn.access - 172.18.0.1:59612 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:30.668 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:30.670 [INFO ] uvicorn.access - 172.18.0.1:59372 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:30.702 [INFO ] private_gpt.ui.ui - Setting system prompt to: You are an AI engine
private-gpt-ollama-1 |
private-gpt-ollama-1 | 18:08:32.171 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.188 [INFO ] uvicorn.access - 172.18.0.1:59382 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.189 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.204 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.207 [INFO ] uvicorn.access - 172.18.0.1:59372 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | Traceback (most recent call last):
[identical traceback to the one above, ending with:]
private-gpt-ollama-1 | TypeError: missing a required argument: 'messages'

Steps to Reproduce

I built the PrivateGPT package and ran it in Docker, with the model changed to llama3.1:70b. Ollama runs outside of Docker on port 11434.
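For reference, that change corresponds to something like this in the ollama section of the settings file (a sketch; the api_base value is an assumption for reaching an Ollama instance running outside Docker):

ollama:
  llm_model: llama3.1:70b
  api_base: http://host.docker.internal:11434  # Ollama running on the host, outside Docker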

Expected Behavior

Generate a response

Actual Behavior

The request fails with TypeError: missing a required argument: 'messages'.

Environment

Ubuntu 20.04, RTX A6000 ADA

Additional Information

No response

Version

No response

jaluma commented 1 month ago

Can you try using the API instead of the UI? That will tell us whether it's an Ollama connectivity problem or a UI problem.

h14turbo commented 1 month ago

I am getting an identical error via API calls

jaluma commented 1 month ago

I have been trying to boot PGPT in ollama-api mode, with Ollama running on bare metal (macOS). I booted everything using:

$ docker compose --profile=ollama-api up --build

After that, I ran a test with curl/Postman with context enabled, and I didn't manage to reproduce it:

curl --location 'http://localhost:8001/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is deep learning?"
    }
  ],
  "stream": false,
  "max_tokens": 20,
  "use_context": true,
  "context_filter": {
    "docs_ids": ["test"]
  }
}'

Can you give me more details about the problem? Can you try with the ollama profile (instead of ollama-api)?

h14turbo commented 1 month ago

Okay, so I think I know the rough area that is causing the problem: I am using NGINX as a load balancer between two Ollama instances.

The main parts of the config are:

upstream ollama_load_balancer {
    server host.docker.internal:11436;
    server host.docker.internal:11435;
}

location /ollama {
    rewrite ^/ollama(/?)(.*)$ /$2 break;
    proxy_pass http://ollama_load_balancer;
}

I am not sure why this causes the connection to fail, since ingestion works using the same Ollama instances...

jaluma commented 1 month ago

Hmm, something is wrong with SSE then. Maybe your proxy has SSE events disabled, buffering enabled, or some other related config. I was browsing and found this post; can you check it, @h14turbo? https://stackoverflow.com/questions/13672743/eventsource-server-sent-events-through-nginx
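If it is buffering, the fix suggested in that post is to let streamed (SSE/chunked) responses pass through the proxy unbuffered. A minimal sketch against the location block posted above (untested; adapt to your setup):

location /ollama {
    rewrite ^/ollama(/?)(.*)$ /$2 break;
    proxy_pass http://ollama_load_balancer;

    # Let streamed responses through without buffering
    proxy_http_version 1.1;
    proxy_set_header Connection '';
    proxy_buffering off;
    proxy_cache off;
}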

mlbrnm commented 1 week ago

Did you by any chance have ollama: keep_alive: modified from the default? I was experiencing the same issue (although not on Docker), and it seems to be the keep_alive value causing it. I spent about four hours troubleshooting it, then on a whim switched keep_alive back to 5m and it worked.

h14turbo commented 1 week ago

@mlbrnm I forgot to post a solution, but that is exactly what I had done! Changing back to 5m solved the problem for me too! I had it set to -1m, which seems to break everything.

mlbrnm commented 1 week ago

@h14turbo Thanks for confirming. My installation is modified so I often can't be certain if it's something I caused or an actual bug. I'll post a new issue with all the info I have, I guess.

h14turbo commented 1 week ago

@mlbrnm This was my problem too. I had made so many modifications that it was hard to know what finally broke it; changing keep_alive back was almost my last hope.

mlbrnm commented 1 week ago
ollama:
  llm_model: llama3
  embedding_model: mxbai-embed-large
  api_base: http://localhost:11434
  embedding_api_base: http://localhost:11434  # change if your embedding model runs on another ollama
  keep_alive: 5m

When the keep_alive value is modified (for example, to 120m), I get the following error in the web UI when I try to use RAG mode. Search mode still works.

08:45:47.680 [INFO]    httpx - HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\blocks.py", line 1532, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 671, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 776, in asyncgen_wrapper
    response = await iterator.__anext__()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\chat_interface.py", line 653, in _stream_fn
    first_response = await async_iteration(generator)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 671, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 664, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 2405, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 914, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 647, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "D:\pgpt\private-gpt\private_gpt\ui\ui.py", line 190, in _chat
    query_stream = self._chat_service.stream_chat(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\pgpt\private-gpt\private_gpt\server\chat\chat_service.py", line 175, in stream_chat
    streaming_response = chat_engine.stream_chat(
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\callbacks\utils.py", line 41, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\chat_engine\context.py", line 247, in stream_chat
    response = synthesizer.synthesize(message, nodes)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\base.py", line 241, in synthesize
    response_str = self.get_response(
                   ^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\compact_and_refine.py", line 43, in get_response
    return super().get_response(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 179, in get_response
    response = self._give_response_single(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 254, in _give_response_single
    response = self._llm.stream(
               ^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\llms\llm.py", line 622, in stream
    chat_response = self.stream_chat(messages)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\pgpt\private-gpt\private_gpt\components\llm\llm_component.py", line 185, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 251, in wrapper
    bound_args = inspect.signature(func).bind(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\.pyenv\pyenv-win\versions\3.11.0b4\Lib\inspect.py", line 3204, in bind
    return self._bind(args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<myaccount>\.pyenv\pyenv-win\versions\3.11.0b4\Lib\inspect.py", line 3119, in _bind
    raise TypeError(msg) from None
TypeError: missing a required argument: 'messages'

@jaluma I tried to figure out what was wrong (mostly looking in llm_component.py) but couldn't find a solution. This is what I believe to be the relevant section.

if (
    ollama_settings.keep_alive
    != ollama_settings.model_fields["keep_alive"].default
):
    # Modify Ollama methods to use the "keep_alive" field.
    def add_keep_alive(func: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            kwargs["keep_alive"] = ollama_settings.keep_alive
            return func(*args, **kwargs)

        return wrapper

    Ollama.chat = add_keep_alive(Ollama.chat)  # type: ignore
    Ollama.stream_chat = add_keep_alive(Ollama.stream_chat)  # type: ignore
    Ollama.complete = add_keep_alive(Ollama.complete)  # type: ignore
    Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)  # type: ignore
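From the traceback, the failure happens in dispatcher.py's wrapper, at bound_args = inspect.signature(func).bind(*args, **kwargs). A bare wrapper like add_keep_alive's replaces the visible signature of the patched method with (*args, **kwargs), so signature-based re-binding can go wrong. Below is a minimal, self-contained sketch of that effect, with functools.wraps as one possible mitigation (my assumption, not a verified fix for this issue; FakeOllama is a hypothetical stand-in):

import functools
import inspect

class FakeOllama:
    # Hypothetical stand-in for llama_index's Ollama class.
    def stream_chat(self, messages, **kwargs):
        return f"{len(messages)} messages, keep_alive={kwargs.get('keep_alive')}"

def add_keep_alive_naive(func):
    # No functools.wraps: the patched attribute's visible signature
    # becomes (*args, **kwargs) and the original one is lost.
    def wrapper(*args, **kwargs):
        kwargs["keep_alive"] = "5m"
        return func(*args, **kwargs)
    return wrapper

def add_keep_alive_wrapped(func):
    # functools.wraps copies metadata and sets __wrapped__, which
    # inspect.signature() follows by default, so introspection still
    # sees the original (self, messages, **kwargs).
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        kwargs["keep_alive"] = "5m"
        return func(*args, **kwargs)
    return wrapper

print(inspect.signature(add_keep_alive_naive(FakeOllama.stream_chat)))    # (*args, **kwargs)
print(inspect.signature(add_keep_alive_wrapped(FakeOllama.stream_chat)))  # (self, messages, **kwargs)
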
jaluma commented 1 week ago

@mlbrnm Can you open a PR with that fix to prevent it? I'd rather remove keep_alive than have problems with the channel, the connection, or anything else.