h14turbo opened this issue 2 months ago
Can you try using the API instead of the UI? This will help verify whether it's an Ollama connectivity problem or a UI problem.
I am getting an identical error via API calls
I have been trying to boot PGPT in ollama-api mode, with Ollama running on metal (macOS). I've booted everything using:
$ docker compose --profile=ollama-api up --build
After that, I ran a test with cURL/Postman with context enabled, and I wasn't able to reproduce it.
curl --location 'http://localhost:8001/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is deep learning?"
    }
  ],
  "stream": false,
  "max_tokens": 20,
  "use_context": true,
  "context_filter": {
    "docs_ids": ["test"]
  }
}'
Can you give me more details about the problem? Can you try with ollama?
Okay, so I think I know the rough area that is causing the problem: I am using NGINX as a load balancer between two Ollama instances.
The main parts of the config are:
upstream ollama_load_balancer {
    server host.docker.internal:11436;
    server host.docker.internal:11435;
}

location /ollama {
    rewrite ^/ollama(/?)(.*)$ /$2 break;
    proxy_pass http://ollama_load_balancer;
}
I am not sure why this would cause the connection to fail, as ingestion works using the same Ollama instances...
Hmm, something is wrong with SSE then. Maybe your proxy has SSE disabled, buffering enabled, or some other related setting. I was browsing and found this post; can you check it, @h14turbo? https://stackoverflow.com/questions/13672743/eventsource-server-sent-events-through-nginx
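(For anyone wanting to rule out proxy buffering quickly: the sketch below streams a chat completion from Ollama through the proxy and prints chunks as they arrive. It is not from this thread; the proxy URL, port, and model name are assumptions and need to be adjusted to the NGINX setup shown above. If buffering is the problem, the chunks arrive in one burst at the end instead of trickling in.)

# Rough streaming check through the proxy (assumed URL and model; adjust to your setup).
import json
import requests

url = "http://localhost/ollama/api/chat"  # the /ollama prefix is rewritten away by the proxy
payload = {
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Say hi"}],
    "stream": True,
}

with requests.post(url, json=payload, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    # Ollama streams newline-delimited JSON objects; a buffering proxy holds them back.
    for line in resp.iter_lines(decode_unicode=True):
        if line:
            chunk = json.loads(line)
            print(chunk.get("message", {}).get("content", ""), end="", flush=True)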
Did you by any chance modify ollama: keep_alive: from the default? I was experiencing the same issue (although not on Docker), and it seems to be the keep_alive value causing it. I spent about 4 hours troubleshooting it and then, on a whim, switched keep_alive back to 5m and it worked.
@mlbrnm I forgot to post a solution, but that is exactly what I had done! Changing back to 5m solved the problem for me too! I had it set to -1m, which seems to break everything.
@h14turbo Thanks for confirming. My installation is modified so I often can't be certain if it's something I caused or an actual bug. I'll post a new issue with all the info I have, I guess.
@mlbrnm This was my problem too. I had made so many modifications that it was hard to know what finally broke it; changing keep_alive back was almost my last resort.
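(Side note: Ollama itself accepts keep_alive either as a duration string such as "5m" or "120m" or as a number of seconds, and a negative value keeps the model loaded indefinitely. If you want to confirm that a given value is acceptable to Ollama itself, and that the breakage is on the privateGPT side, you can pass it directly in an API call; the host, port, and model below are assumptions.)

# Hypothetical sanity check: send keep_alive straight to Ollama, bypassing privateGPT.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Say hi",
        "stream": False,
        "keep_alive": "120m",  # compare behaviour with "5m" or -1
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])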
ollama:
llm_model: llama3
embedding_model: mxbai-embed-large
api_base: http://localhost:11434
embedding_api_base: http://localhost:11434 # change if your embedding model runs on another ollama
keep_alive: 5m
When the keep_alive value is modified (for example, to 120m), I get the following error in the web UI when I try to use RAG mode. Search mode still works.
08:45:47.680 [INFO] httpx - HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\blocks.py", line 1935, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\blocks.py", line 1532, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 671, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 776, in asyncgen_wrapper
response = await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\chat_interface.py", line 653, in _stream_fn
first_response = await async_iteration(generator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 671, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 664, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 914, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\gradio\utils.py", line 647, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "D:\pgpt\private-gpt\private_gpt\ui\ui.py", line 190, in _chat
query_stream = self._chat_service.stream_chat(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\pgpt\private-gpt\private_gpt\server\chat\chat_service.py", line 175, in stream_chat
streaming_response = chat_engine.stream_chat(
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\callbacks\utils.py", line 41, in wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\chat_engine\context.py", line 247, in stream_chat
response = synthesizer.synthesize(message, nodes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\base.py", line 241, in synthesize
response_str = self.get_response(
^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\compact_and_refine.py", line 43, in get_response
return super().get_response(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 179, in get_response
response = self._give_response_single(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 254, in _give_response_single
response = self._llm.stream(
^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 265, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\llms\llm.py", line 622, in stream
chat_response = self.stream_chat(messages)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\pgpt\private-gpt\private_gpt\components\llm\llm_component.py", line 185, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-y2Cvz5QG-py3.11\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 251, in wrapper
bound_args = inspect.signature(func).bind(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\.pyenv\pyenv-win\versions\3.11.0b4\Lib\inspect.py", line 3204, in bind
return self._bind(args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\<myaccount>\.pyenv\pyenv-win\versions\3.11.0b4\Lib\inspect.py", line 3119, in _bind
raise TypeError(msg) from None
TypeError: missing a required argument: 'messages'
@jaluma I tried to figure out what was wrong (mostly looking in llm_component.py) but couldn't find a solution. This is what I believe to be the relevant section.
if (
    ollama_settings.keep_alive
    != ollama_settings.model_fields["keep_alive"].default
):
    # Modify Ollama methods to use the "keep_alive" field.
    def add_keep_alive(func: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            kwargs["keep_alive"] = ollama_settings.keep_alive
            return func(*args, **kwargs)

        return wrapper

    Ollama.chat = add_keep_alive(Ollama.chat)  # type: ignore
    Ollama.stream_chat = add_keep_alive(Ollama.stream_chat)  # type: ignore
    Ollama.complete = add_keep_alive(Ollama.complete)  # type: ignore
    Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)  # type: ignore
@mlbrnm Can you open a PR with that fix to prevent it? I would rather remove keep_alive than have problems with the channel, the connection, or anything else.
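(For reference, one possible shape of that fix, assuming the installed llama-index-llms-ollama version exposes a keep_alive field on the Ollama constructor, which newer releases do: pass the setting in at construction time and drop the method monkeypatching shown above. The constructor arguments below are illustrative, not the exact privateGPT code.)

# Sketch only: wire keep_alive through the constructor instead of patching methods.
from llama_index.llms.ollama import Ollama

llm = Ollama(
    model=ollama_settings.llm_model,        # ollama_settings as in llm_component.py above
    base_url=ollama_settings.api_base,
    keep_alive=ollama_settings.keep_alive,  # requires a llama-index-llms-ollama version with this field
)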
Pre-check
Description
When running the Docker instance of privateGPT with Ollama, I get an error saying: TypeError: missing a required argument: 'messages'
"Search" mode works, but any mode that calls the LLM produces this error. I am using the normal Gradio UI. The full traceback is as follows:
private-gpt-ollama-1 | 18:00:31.961 [INFO ] uvicorn.access - 172.18.0.1:62074 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:00:31.980 [INFO ] uvicorn.access - 172.18.0.1:55394 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:00:31.982 [INFO ] uvicorn.access - 172.18.0.1:55394 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | Traceback (most recent call last):
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/queueing.py", line 536, in process_events
private-gpt-ollama-1 | response = await route_utils.call_process_api(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/route_utils.py", line 276, in call_process_api
private-gpt-ollama-1 | output = await app.get_blocks().process_api(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1923, in process_api
private-gpt-ollama-1 | result = await self.call_function(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1520, in call_function
private-gpt-ollama-1 | prediction = await utils.async_iteration(iterator)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 663, in async_iteration
private-gpt-ollama-1 | return await iterator.__anext__()
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 768, in asyncgen_wrapper
private-gpt-ollama-1 | response = await iterator.__anext__()
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/chat_interface.py", line 652, in _stream_fn
private-gpt-ollama-1 | first_response = await async_iteration(generator)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 663, in async_iteration
private-gpt-ollama-1 | return await iterator.__anext__()
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 656, in __anext__
private-gpt-ollama-1 | return await anyio.to_thread.run_sync(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
private-gpt-ollama-1 | return await get_async_backend().run_sync_in_worker_thread(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
private-gpt-ollama-1 | return await future
private-gpt-ollama-1 | ^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
private-gpt-ollama-1 | result = context.run(func, *args)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 639, in run_sync_iterator_async
private-gpt-ollama-1 | return next(iterator)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/private_gpt/ui/ui.py", line 185, in _chat
private-gpt-ollama-1 | query_stream = self._chat_service.stream_chat(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/private_gpt/server/chat/chat_service.py", line 168, in stream_chat
private-gpt-ollama-1 | streaming_response = chat_engine.stream_chat(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 230, in wrapper
private-gpt-ollama-1 | result = func(*args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/callbacks/utils.py", line 41, in wrapper
private-gpt-ollama-1 | return func(self, *args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/chat_engine/context.py", line 210, in stream_chat
private-gpt-ollama-1 | chat_stream=self._llm.stream_chat(all_messages),
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/private_gpt/components/llm/llm_component.py", line 183, in wrapper
private-gpt-ollama-1 | return func(*args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 221, in wrapper
private-gpt-ollama-1 | bound_args = inspect.signature(func).bind(*args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/usr/local/lib/python3.11/inspect.py", line 3212, in bind
private-gpt-ollama-1 | return self._bind(args, kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/usr/local/lib/python3.11/inspect.py", line 3127, in _bind
private-gpt-ollama-1 | raise TypeError(msg) from None
private-gpt-ollama-1 | TypeError: missing a required argument: 'messages'
private-gpt-ollama-1 | 18:01:15.212 [INFO ] uvicorn.access - 172.18.0.1:58670 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:54.097 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:54.099 [INFO ] uvicorn.access - 172.18.0.1:61960 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:54.122 [INFO ] private_gpt.ui.ui - Setting system prompt to:
private-gpt-ollama-1 | 18:03:55.953 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.971 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.972 [INFO ] uvicorn.access - 172.18.0.1:59612 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.987 [INFO ] uvicorn.access - 172.18.0.1:61960 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:55.989 [INFO ] uvicorn.access - 172.18.0.1:61960 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | 18:03:56.960 [INFO ] uvicorn.access - 172.18.0.1:59612 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:30.668 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:30.670 [INFO ] uvicorn.access - 172.18.0.1:59372 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:30.702 [INFO ] private_gpt.ui.ui - Setting system prompt to: You are an AI engine
private-gpt-ollama-1 |
private-gpt-ollama-1 | 18:08:32.171 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.188 [INFO ] uvicorn.access - 172.18.0.1:59382 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.189 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /run/predict HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.204 [INFO ] uvicorn.access - 172.18.0.1:59372 - "POST /queue/join HTTP/1.1" 200
private-gpt-ollama-1 | 18:08:32.207 [INFO ] uvicorn.access - 172.18.0.1:59372 - "GET /queue/data?session_hash=gjx9zkk6hbu HTTP/1.1" 200
private-gpt-ollama-1 | Traceback (most recent call last):
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/queueing.py", line 536, in process_events
private-gpt-ollama-1 | response = await route_utils.call_process_api(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/route_utils.py", line 276, in call_process_api
private-gpt-ollama-1 | output = await app.get_blocks().process_api(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1923, in process_api
private-gpt-ollama-1 | result = await self.call_function(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1520, in call_function
private-gpt-ollama-1 | prediction = await utils.async_iteration(iterator)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 663, in async_iteration
private-gpt-ollama-1 | return await iterator.__anext__()
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 768, in asyncgen_wrapper
private-gpt-ollama-1 | response = await iterator.__anext__()
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/chat_interface.py", line 652, in _stream_fn
private-gpt-ollama-1 | first_response = await async_iteration(generator)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 663, in async_iteration
private-gpt-ollama-1 | return await iterator.__anext__()
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 656, in __anext__
private-gpt-ollama-1 | return await anyio.to_thread.run_sync(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
private-gpt-ollama-1 | return await get_async_backend().run_sync_in_worker_thread(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
private-gpt-ollama-1 | return await future
private-gpt-ollama-1 | ^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
private-gpt-ollama-1 | result = context.run(func, *args)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/gradio/utils.py", line 639, in run_sync_iterator_async
private-gpt-ollama-1 | return next(iterator)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/private_gpt/ui/ui.py", line 185, in _chat
private-gpt-ollama-1 | query_stream = self._chat_service.stream_chat(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/private_gpt/server/chat/chat_service.py", line 168, in stream_chat
private-gpt-ollama-1 | streaming_response = chat_engine.stream_chat(
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 230, in wrapper
private-gpt-ollama-1 | result = func(*args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/callbacks/utils.py", line 41, in wrapper
private-gpt-ollama-1 | return func(self, *args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/chat_engine/context.py", line 210, in stream_chat
private-gpt-ollama-1 | chat_stream=self._llm.stream_chat(all_messages),
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/private_gpt/components/llm/llm_component.py", line 183, in wrapper
private-gpt-ollama-1 | return func(*args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 221, in wrapper
private-gpt-ollama-1 | bound_args = inspect.signature(func).bind(*args, **kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/usr/local/lib/python3.11/inspect.py", line 3212, in bind
private-gpt-ollama-1 | return self._bind(args, kwargs)
private-gpt-ollama-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
private-gpt-ollama-1 | File "/usr/local/lib/python3.11/inspect.py", line 3127, in _bind
private-gpt-ollama-1 | raise TypeError(msg) from None
private-gpt-ollama-1 | TypeError: missing a required argument: 'messages'
Steps to Reproduce
I built the privateGPT package and ran it in Docker. I modified the model to llama3.1:70b. Ollama runs outside of Docker on port 11434.
Expected Behavior
Generate a response
Actual Behavior
The request fails with TypeError: missing a required argument: 'messages'.
Environment
Ubuntu 20.04, RTX A6000 ADA
Additional Information
No response
Version
No response
Setup Checklist
NVIDIA GPU Setup Checklist
- NVIDIA drivers installed (run nvidia-smi to verify).
- Docker GPU access verified (sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi).