weaviate / Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
BSD 3-Clause "New" or "Revised" License

ollama; WebSocket timeout not configurable and after a timeout subsequent requests fail #201

Open MaartenSmeets opened 1 month ago

MaartenSmeets commented 1 month ago

Description

During inference, while receiving data from ollama, I got a WebSocket timeout exception and the connection was closed before the model/ollama had finished responding (it stopped mid-sentence). After the connection was closed, I could not ask further questions, and all API requests returned errors. This appears to be a bug.

Since I am running a model locally using ollama, it can be slow. It would be useful to have a configurable timeout so you can increase it when needed. This is a feature request. The error message I received is in 'Additional context'. I'm running Verba from main, commit 8835192 (24-05-2024; it reports v1.0.2). I've used mixtral:8x22b (i7-13700K, 64 GB RAM, NVIDIA RTX 4080 with 16 GB VRAM). N.B. I manually applied https://github.com/weaviate/Verba/pull/178 to use a different model (nomic-embed-text) for creating embeddings.
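The timeout fires in aiohttp's stream reader between chunks, so a fix would presumably pass a configurable `aiohttp.ClientTimeout` (e.g. `sock_read=...`) into the request in `OllamaGenerator.generate_stream`. As a minimal stdlib sketch of the idea — `OLLAMA_TIMEOUT`, `stream_with_timeout`, and the 300 s default are all assumptions for illustration, not Verba's actual API:

```python
import asyncio
import os

# Hypothetical knob: per-chunk read timeout in seconds, read from the
# environment so it can be raised for slow local models.
CHUNK_TIMEOUT = float(os.environ.get("OLLAMA_TIMEOUT", "300"))

async def stream_with_timeout(chunks, timeout=CHUNK_TIMEOUT):
    """Re-yield an async iterator of response chunks, failing only if a
    single chunk takes longer than `timeout` seconds to arrive. A
    total-duration timeout would be wrong here, because a long but
    healthy generation should be allowed to keep streaming."""
    it = chunks.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(it.__anext__(), timeout)
        except StopAsyncIteration:
            return
        yield chunk
```

A per-chunk bound matches the failure mode described above: the model was still producing output, just slowly, when the fixed timeout cut the stream mid-sentence.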

Is this a bug or a feature?

Steps to Reproduce

1. Create embeddings from documents.
2. On the Chat tab, type a question in the box labeled 'Ask Verba Anything'.
3. Wait until the timeout occurs.
4. After the connection is closed, the API can no longer be accessed. When a new question is asked in the "Ask Verba Anything" box, a red error message appears below the box: Failed to fetch from API: TypeError: Failed to fetch

Additional context

✘ WebSocket Error: ERROR: Exception in ASGI application

```
Traceback (most recent call last):
  File "Verba/goldenverba/server/api.py", line 179, in websocket_generate_stream
    async for chunk in manager.generate_stream_answer(
  File "Verba/goldenverba/verba_manager.py", line 662, in generate_stream_answer
    async for result in self.generator_manager.generators[
  File "Verba/goldenverba/components/generation/OllamaGenerator.py", line 49, in generate_stream
    async for line in response.content:
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/aiohttp/streams.py", line 50, in __anext__
    rv = await self.read_func()
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/aiohttp/streams.py", line 317, in readline
    return await self.readuntil()
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/aiohttp/streams.py", line 351, in readuntil
    await self._wait("readuntil")
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/aiohttp/streams.py", line 311, in _wait
    with self._timer:
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/aiohttp/helpers.py", line 735, in __exit__
    raise asyncio.TimeoutError from None
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 240, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/middleware/errors.py", line 149, in __call__
    await self.app(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/middleware/cors.py", line 75, in __call__
    await self.app(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/routing.py", line 341, in handle
    await self.app(scope, receive, send)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/routing.py", line 82, in app
    await func(session)
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/fastapi/routing.py", line 324, in app
    await dependant.call(values)
  File "Verba/goldenverba/server/api.py", line 193, in websocket_generate_stream
    await websocket.send_json(
  File ".pyenv/versions/3.11.7/envs/verba/lib/python3.11/site-packages/starlette/websockets.py", line 171, in send_json
    text = json.dumps(data, separators=(",", ":"))
  File ".pyenv/versions/3.11.7/lib/python3.11/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File ".pyenv/versions/3.11.7/lib/python3.11/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File ".pyenv/versions/3.11.7/lib/python3.11/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
  File ".pyenv/versions/3.11.7/lib/python3.11/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type TimeoutError is not JSON serializable
INFO: connection closed
```
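The second traceback points at a separate, smaller bug: `api.py` line 193 appears to pass the caught exception object into `websocket.send_json`, and `json.dumps` cannot encode exception instances, which turns the timeout into a `TypeError`. A hedged sketch of a guard for this — `safe_websocket_payload` is a hypothetical helper for illustration, not Verba code:

```python
import json

def safe_websocket_payload(data):
    """Return `data` unchanged if it is JSON-serializable; otherwise wrap
    it as an error string. Exception objects (like the TimeoutError in the
    traceback above) are not serializable by json.dumps."""
    try:
        json.dumps(data)
        return data
    except TypeError:
        return {"error": str(data)}
```

With something like this in place, the client would receive a readable error message instead of the stream dying with an unrelated serialization failure (though it would not, by itself, fix the subsequent "Failed to fetch" state).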

thomashacker commented 1 month ago

Thanks a lot for the information! We'll have a closer look.

LucBruz commented 1 month ago

I am experiencing the same issue as described.

Context

Server Setup Information

.env file:

```
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=phi3
OLLAMA_EMBED_MODEL=nomic-embed-text
```
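If a configurable timeout were exposed as requested above, it could be supplied the same way; `OLLAMA_TIMEOUT` is a hypothetical variable name sketched here, not one Verba currently reads:

```
# Hypothetical addition: read timeout in seconds for slow local models
OLLAMA_TIMEOUT=600
```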