apocas / restai

RESTai is an open-source AIaaS (AI as a Service) platform built on top of LlamaIndex and Langchain. It supports any public LLM supported by LlamaIndex and any local LLM supported by Ollama/vLLM/etc. Precise embeddings usage and tuning. Built-in image generation (Dall-E, SD, Flux) and dynamic loading generators.
https://apocas.github.io/restai/
Apache License 2.0

Unicode problem with document #97

Open fenio opened 1 month ago

fenio commented 1 month ago

Hey, I've started experimenting with RESTai and picked a random PDF from my city officials' documents. It uploaded fine, but when I try to view it I get this in the web UI: [screenshot]

And the following info in logs:

restai-1 | UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 19: invalid start byte
restai-1 | INFO: 172.27.219.64:44730 - "GET /projects/testing/embeddings/source/d25pb3NlayBvIGRvYnJvd29sbrEgemFtaWFuZSBtaWVzemthbmlhLnBkZg%3D%3D HTTP/1.1" 500 Internal Server Error
restai-1 | ERROR: Exception in ASGI application
restai-1 | Traceback (most recent call last):
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
restai-1 |     result = await app( # type: ignore[func-returns-value]
restai-1 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
restai-1 |     return await self.app(scope, receive, send)
restai-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
restai-1 |     await super().__call__(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
restai-1 |     await self.middleware_stack(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
restai-1 |     raise exc
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
restai-1 |     await self.app(scope, receive, _send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/cors.py", line 93, in __call__
restai-1 |     await self.simple_response(scope, receive, send, request_headers=headers)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/cors.py", line 148, in simple_response
restai-1 |     await self.app(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
restai-1 |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
restai-1 |     raise exc
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
restai-1 |     await app(scope, receive, sender)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
restai-1 |     await self.middleware_stack(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
restai-1 |     await route.handle(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
restai-1 |     await self.app(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
restai-1 |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
restai-1 |     raise exc
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
restai-1 |     await app(scope, receive, sender)
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
restai-1 |     response = await func(request)
restai-1 |                ^^^^^^^^^^^^^^^^^^^
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
restai-1 |     raw_response = await run_endpoint_function(
restai-1 |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
restai-1 |   File "/app/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
restai-1 |     return await dependant.call(**values)
restai-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
restai-1 |   File "/app/app/routers/projects.py", line 388, in get_embedding
restai-1 |     docs = project.vector.find_source(base64.b64decode(source).decode('utf-8'))

I was testing it locally, but the same thing happens on the public demo instance: http://ai.ince.pt/admin/projects/my_project

So the file is already there, but I can also provide it here if needed.

noverd commented 1 month ago

Can you post your PDF here?

fenio commented 1 month ago

wniosek o dobrowoln± zamiane mieszkania.pdf
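
For anyone hitting the same error: the base64 segment in the failing URL decodes to this filename, and the byte at position 19 is 0xb1, which is not a valid UTF-8 start byte. In ISO-8859-2 (Latin-2) that byte is `ą`, so the name was probably stored in that encoding; this is an inference from the bytes, not something the thread confirms. Below is a minimal sketch of a more tolerant decode. `decode_source` and its `fallbacks` parameter are hypothetical names for illustration, not part of RESTai, and a lookup with the re-decoded name may still miss if the vector store holds the name in some other form.

```python
import base64
from urllib.parse import unquote


def decode_source(encoded: str, fallbacks=("iso-8859-2",)) -> str:
    """Decode a base64 source name, tolerating non-UTF-8 bytes.

    Hypothetical helper: tries UTF-8 first, then each fallback encoding,
    and finally UTF-8 with replacement characters so it never raises
    UnicodeDecodeError.
    """
    # Undo URL percent-encoding (e.g. %3D -> =); harmless if already decoded.
    raw = base64.b64decode(unquote(encoded))
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the readable part, mark the bad bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # The source segment copied from the failing request in the log above.
    segment = "d25pb3NlayBvIGRvYnJvd29sbrEgemFtaWFuZSBtaWVzemthbmlhLnBkZg%3D%3D"
    print(decode_source(segment))  # wniosek o dobrowolną zamiane mieszkania.pdf
```

Guessing a legacy encoding is only a heuristic; normalizing filenames to UTF-8 at upload time would likely be the more robust fix, but that is a design choice for the maintainers.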