Cinnamon / kotaemon

An open-source RAG-based tool for chatting with your documents.
https://cinnamon.github.io/kotaemon/
Apache License 2.0

[BUG] - <title> error when trying to upload a text or pdf file #293

Open agonzlop opened 5 days ago

agonzlop commented 5 days ago

Description

An error occurs when trying to upload a text or PDF file.

Screenshot_1

    Using reader <kotaemon.loaders.pdf_loader.PDFThumbnailReader object at 0x000001B9BD6CF580>
    Page numbers: 72
    Got 72 page thumbnails
    Adding documents to doc store
    Getting embeddings for 200 nodes
    RetryError[<Future at 0x1b9bfb639d0 state=finished raised AuthenticationError>]
    Traceback (most recent call last):
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 724, in stream
        file_id, docs = yield from pipeline.stream(
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 588, in stream
        yield from self.handle_docs(docs, file_id, file_path.name)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 400, in handle_docs
        yield from insert_chunks_to_vectorstore()
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 385, in insert_chunks_to_vectorstore
        self.handle_chunks_vectorstore(chunks, file_id)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 427, in handle_chunks_vectorstore
        self.vector_indexing.add_to_vectorstore(chunks)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\kotaemon\indices\vectorindex.py", line 92, in add_to_vectorstore
        embeddings = self.embedding(docs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\base.py", line 1097, in __call__
        raise e from None
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\base.py", line 1088, in __call__
        output = self.fl.exec(func, args, kwargs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\backends\base.py", line 151, in exec
        return run(*args, **kwargs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\middleware.py", line 144, in __call__
        raise e from None
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\middleware.py", line 141, in __call__
        _output = self.next_call(*args, **kwargs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\middleware.py", line 117, in __call__
        return self.next_call(*args, **kwargs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\theflow\base.py", line 1017, in _runx
        return self.run(*args, **kwargs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\kotaemon\embeddings\base.py", line 10, in run
        return self.invoke(text, *args, **kwargs)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\kotaemon\embeddings\openai.py", line 104, in invoke
        resp = self.openairesponse(client, input=input, **kwargs).dict()
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\tenacity\__init__.py", line 289, in wrapped_f
        return self(f, *args, **kw)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\tenacity\__init__.py", line 379, in __call__
        do = self.iter(retry_state=retry_state)
      File "D:\ia\koeman\kotaemon-app\install_dir\env\lib\site-packages\tenacity\__init__.py", line 326, in iter
        raise retry_exc from fut.exception()
    tenacity.RetryError: RetryError[<Future at 0x1b9bfb639d0 state=finished raised AuthenticationError>]
    len(results)=3, len(file_list)=3

Screenshot_2
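The last frame of the traceback raises `retry_exc from fut.exception()`, which suggests the `RetryError` is only a wrapper and the `AuthenticationError` is attached as its cause. A minimal sketch of how such a root cause can be surfaced, assuming stand-in exception classes and a hypothetical `call_with_retries` helper (none of these are the real tenacity, OpenAI, or kotaemon objects):

```python
class AuthenticationError(Exception):
    """Stand-in for the provider's authentication failure (hypothetical)."""


class RetryError(Exception):
    """Stand-in for a retry wrapper's 'gave up' error (hypothetical)."""


def call_with_retries(func, attempts=3):
    """Call func up to `attempts` times, then raise RetryError from the last failure."""
    last_exc = None
    for _ in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_exc = exc
    raise RetryError(f"gave up after {attempts} attempts") from last_exc


def root_cause(exc):
    """Follow the __cause__ chain set by `raise ... from ...` to the innermost exception."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def embed():
    # Simulates an embedding call that is rejected because of a bad API key.
    raise AuthenticationError("invalid API key")


try:
    call_with_retries(embed)
except RetryError as err:
    cause = root_cause(err)
    print(type(cause).__name__, "-", cause)
```

Read this way, the retries are a symptom: the thing to fix is the credential used for the embedding call, not the upload itself.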

Reproduction steps

I try to upload files, but I always get an error, whether the PDF files are in English or Spanish.

Screenshots

![DESCRIPTION](https://prnt.sc/KcUt1cfNJH8q)

Logs

No response

Browsers

Chrome, Other

OS

Windows

Additional information

No response

phv2312 commented 5 days ago

Can you provide more context? Are you using Ollama (local version) or an API from Azure, etc.? Have you set it up correctly? For more information, you can refer to our guideline here. Hope it helps.

Lee-Ju-Yeong commented 3 days ago

I have the same issue.

❌ | README.md: RetryError[<Future at 0x7ff838254c10 state=finished raised AuthenticationError>]

- Setup: I am running the application inside Docker and accessing it via http://host.docker.internal/.
- Vendor: I'm using OpenAI as the vendor, and I have not customized any settings – all values are set to the default.
- Authentication: The API key is set correctly in the environment variables, but I’m still encountering the RetryError related to authentication.
- Version: I’m not using Ollama (local version) or Azure API, just OpenAI’s default setup.
- Chat functionality: The chat functionality works without any issues, but the error occurs when trying to generate embeddings via the GPT API.

I've already checked the documentation, but I’m unsure if there are additional setup steps when running inside Docker.

Any help on this would be appreciated. I’ll review the guidelines you linked as well. Thank you!
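One way to double-check that the key is actually visible inside the container is a small script run with the container's Python. This is only a sketch: the variable name `OPENAI_API_KEY` and the `<...>` placeholder pattern are assumptions, and `describe_api_key` is a hypothetical helper, not part of kotaemon.

```python
import os


def describe_api_key(env=None, name="OPENAI_API_KEY"):
    """Return a short, non-secret description of the API key found in `env`."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value is None:
        return f"{name} is not set in this process"
    stripped = value.strip().strip('"').strip("'")
    if not stripped:
        return f"{name} is set but empty"
    if stripped != value:
        return f"{name} has surrounding whitespace or quotes"
    if stripped.startswith("<") and stripped.endswith(">"):
        return f"{name} still looks like a template placeholder"
    # Show only the last four characters so the key is never printed in full.
    return f"{name} is set ({len(value)} chars, ends with ...{value[-4:]})"


print(describe_api_key())
```

It only inspects the process environment, so it cannot tell whether the application is reading its credentials from somewhere else.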

phv2312 commented 3 days ago

Hi @Lee-Ju-Yeong. In the Resources > Embedding tab, there is a function for testing the connection. Can you try it and see what the output is? Also, it's recommended to set up all credentials via the UI (the .env is used only the first time; after that, credentials are stored in the DB). Hope it helps.
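Independently of the UI test, the same key can be tried against the embeddings endpoint directly, which helps separate a rejected key from an application-side configuration problem. A rough sketch using only the standard library, assuming the public OpenAI endpoint and the `text-embedding-3-small` model (the model configured in the app may differ); `build_embedding_request` and `check_embedding_key` are hypothetical helper names:

```python
import json
import urllib.error
import urllib.request

EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"


def build_embedding_request(api_key, text, model="text-embedding-3-small"):
    """Build (but do not send) a POST request to the embeddings endpoint."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        EMBEDDINGS_URL, data=body, headers=headers, method="POST"
    )


def check_embedding_key(api_key, text="ping"):
    """Send one embedding request and return (ok, detail)."""
    request = build_embedding_request(api_key, text)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # A 401 here means the key itself was rejected by the API.
        return False, f"HTTP {err.code}"
    except urllib.error.URLError as err:
        # Network-level failure (DNS, proxy, no route), not an authentication problem.
        return False, f"network error: {err.reason}"
```

Calling `check_embedding_key(...)` with the key from the environment, from inside the container, sends one small billable request. If it succeeds there but the upload still fails, the application is probably using a different stored credential than the one in the environment.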