Cinnamon / kotaemon

An open-source RAG-based tool for chatting with your documents.
https://cinnamon.github.io/kotaemon/
Apache License 2.0

[BUG] - Gets stuck in "Thinking" upon asking questions about the document uploaded #327

Open ravaidly opened 5 days ago

ravaidly commented 5 days ago

Description

Windows Conda Environment.

Local LLM: mistral; embeddings: nomic

(screenshot attached)

```
(kotaemon) PS C:\Users\NIVENKA\Desktop\koteamon\kotaemon> python .\app.py
C:\Users\NIVENKA\AppData\Local\miniconda3\envs\kotaemon\lib\site-packages\langchain_core\_api\deprecation.py:119: LangChainDeprecationWarning:

The class CohereEmbeddings was deprecated in LangChain 0.0.30 and will be removed in 0.3.0. An updated version of the class exists in the langchain-cohere package and should be used instead. To use it run pip install -U langchain-cohere and import as from langchain_cohere import CohereEmbeddings.

User "admin" already exists
Setting up quick upload event
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
User-id: None, can see public conversations: False
User-id: 1, can see public conversations: True
len(results)=1, len(file_list)=1
len(results)=0, len(file_list)=1
User-id: 1, can see public conversations: True
Session reasoning type None
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x000001FB22122C50>, FSPath=WindowsPath('C:/Users/NIVENKA/Desktop/koteamon/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x000001FB22123070>, get_extra_table=False, llm_scorer=None, mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=20, userid=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset object at 0x000001FB50F0D6C0>, FSPath=<theflow.base.unset object at 0x000001FB50F0D6C0>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset object at 0x000001FB50F0D6C0>, VS=<theflow.base.unset object at 0x000001FB50F0D6C0>, file_ids=[], userid=<theflow.base.unset object at 0x000001FB50F0D6C0>)]
searching in doc_ids []
Got 0 retrieved documents
len (original) 0
Got 0 images
Trying LLM streaming
Got 0 cited docs
User-id: 1, can see public conversations: True
Overriding with default loaders
use_quick_index_mode False
reader_mode default
Using reader <kotaemon.loaders.pdf_loader.PDFThumbnailReader object at 0x000001FB2476A4A0>
use_quick_index_mode True
reader_mode default
Using reader <kotaemon.loaders.pdf_loader.PDFThumbnailReader object at 0x000001FB24768A00>
C:\Users\NIVENKA\AppData\Local\miniconda3\envs\kotaemon\lib\site-packages\pypdf\_crypt_providers\_cryptography.py:32: CryptographyDeprecationWarning:

ARC4 has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.ARC4 and will be removed from this module in 48.0.0.

Page numbers: 51
Got 51 page thumbnails
Adding documents to doc store
Running embedding in thread
Getting embeddings for 104 nodes
indexing step took 1.7613699436187744
len(results)=2, len(file_list)=2
Session reasoning type None
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x000001FB22122C50>, FSPath=WindowsPath('C:/Users/NIVENKA/Desktop/koteamon/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x000001FB22123070>, get_extra_table=False, llm_scorer=None, mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=20, userid=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset object at 0x000001FB50F0D6C0>, FSPath=<theflow.base.unset object at 0x000001FB50F0D6C0>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset object at 0x000001FB50F0D6C0>, VS=<theflow.base.unset object at 0x000001FB50F0D6C0>, file_ids=[], userid=<theflow.base.unset object at 0x000001FB50F0D6C0>)]
searching in doc_ids ['5a3c8739-f6b9-41a9-9a38-bf0369029809', 'da2c7bcb-f471-40d3-ba20-e77da66a02e5']
retrieval_kwargs: dict_keys(['do_extend', 'scope', 'filters'])
```

Reproduction steps

1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

Screenshots

![DESCRIPTION](LINK.png)

Logs

No response

Browsers

No response

OS

No response

Additional information

No response

taprosoft commented 4 days ago

Hi, please check out the installation from the latest release, or alternatively try the online installation method here: https://cinnamon.github.io/kotaemon/online_install/

taprosoft commented 4 days ago

Also, follow this guide to set up local models from Ollama correctly: https://github.com/Cinnamon/kotaemon/blob/main/docs/local_model.md#use-local-models-for-rag

ravaidly commented 4 days ago

Hi @taprosoft, I believe the installation is correct. I am able to connect to the local Ollama model and get a response from it. It gets stuck only when I upload a document and try to have a conversation. Here are my settings:

(screenshots of settings attached)

Please let me know if any other settings need to be changed, or if something is wrongly configured.

taprosoft commented 4 days ago

The configuration looks correct. This issue can happen if your Ollama API is serving a large model: since the QA prompt has a fairly long context, it can cause an out-of-memory error or a crash :D Can you check whether you can chat normally with Ollama without any uploaded files?
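To isolate whether the hang lives in Ollama or in kotaemon, one option is to bypass kotaemon entirely and send a QA-sized prompt straight to Ollama's `/api/generate` endpoint. A minimal sketch (the `mistral` model name comes from this report; the endpoint URL and payload fields are Ollama's documented defaults):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(model: str, prompt: str) -> bytes:
    """Build a non-streaming request body for Ollama's /api/generate."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")

def ask_ollama(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send one prompt directly to the local Ollama server and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running Ollama server):
#   ask_ollama("mistral", "Summarize this: " + "lorem ipsum " * 2000)
```

If a direct call with a long prompt also hangs or errors, the problem is on the Ollama side rather than in kotaemon's pipeline.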

ravaidly commented 3 days ago

Hi @taprosoft, I can chat normally with Ollama through the kotaemon server before uploading files. Are there any log files or API calls I should trace to see whether the call reaches Ollama, or whether it is an issue with the embeddings? Kindly suggest the next steps for debugging.

taprosoft commented 2 days ago

Hi @ravaidly, basically you can try sending a really long message :D If it is due to a memory error, any long message will trigger the issue. You can use a page like https://platform.openai.com/tokenizer to count your tokens; a token count of around 8k is a good test. Also, please double-check that in Retrieval Settings you set Use LLM relevant score to Off (this reduces the load on the Ollama server). Hope this helps.
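As a rough local alternative to the tokenizer page, a character-count heuristic (~4 characters per token for English text; an approximation, not mistral's actual tokenizer) can generate a test message of about 8k tokens:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def make_test_message(target_tokens: int = 8000) -> str:
    """Build filler text of roughly target_tokens tokens to stress-test the server."""
    chunk = "The quick brown fox jumps over the lazy dog. "
    reps = target_tokens // estimate_tokens(chunk) + 1
    return chunk * reps

message = make_test_message(8000)
print(f"~{estimate_tokens(message)} tokens, {len(message)} characters")
```

Pasting a message of this size into the chat, with no files attached, should reproduce the hang if the cause is a context-length or memory limit.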