Cinnamon / kotaemon

An open-source RAG-based tool for chatting with your documents.
https://cinnamon.github.io/kotaemon/
Apache License 2.0

[BUG] - FileNotFoundError: [WinError 3] #158

Open yaozeyang90 opened 2 months ago

yaozeyang90 commented 2 months ago

Description

When I use GraphRAG, I encounter the following error: FileNotFoundError: [WinError 3] The system cannot find the path specified.: 'C:\Users\zeyan\kotaemon\ktem_app_data\user_data\files\graphrag\23e6f795-bfa6-4a4c-b3db-370a14f0df72\output'. Plain RAG works for me, but GraphRAG does not, even though GraphRAG is able to upload files and vectorize them.
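For reference, the underlying failure is easy to reproduce in isolation: calling `Path.iterdir()` on a directory that does not exist raises exactly this `FileNotFoundError`. A minimal sketch (the path below is illustrative, not the real one):

```python
from pathlib import Path

# Illustrative path standing in for the missing GraphRAG output directory
missing = Path("does_not_exist/output")

try:
    # list() forces the generator, which triggers os.listdir() internally
    list(missing.iterdir())
except FileNotFoundError:
    print("FileNotFoundError raised, as in the traceback below")
```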

Reproduction steps

1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

Screenshots

No response

Logs

Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x000001EC9A48FD10>, FSPath=WindowsPath('C:/Users/zeyan/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x000001EC9A48F8D0>, get_extra_table=False, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x000001EC9E16EE10>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x000001EC9E16D3D0>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x000001EC9E16D650>), mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x000001ECD966A7D0>, FSPath=<theflow.base.unset_ object at 0x000001ECD966A7D0>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x000001ECD966A7D0>, VS=<theflow.base.unset_ object at 0x000001ECD966A7D0>, file_ids=['f98a7724-d70c-4417-b539-bd850e74338f'], user_id=<theflow.base.unset_ object at 0x000001ECD966A7D0>)]
searching in doc_ids []
Traceback (most recent call last):
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\queueing.py", line 575, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\blocks.py", line 1923, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\utils.py", line 663, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\utils.py", line 656, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_backends\_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_backends\_asyncio.py", line 859, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\utils.py", line 639, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\utils.py", line 801, in gen_wrapper
    response = next(iterator)
               ^^^^^^^^^^^^^^
  File "C:\Users\zeyan\kotaemon\libs\ktem\ktem\pages\chat\__init__.py", line 804, in chat_fn
    for response in pipeline.stream(chat_input, conversation_id, chat_history):
  File "C:\Users\zeyan\kotaemon\libs\ktem\ktem\reasoning\simple.py", line 660, in stream
    docs, infos = self.retrieve(message, history)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\kotaemon\libs\ktem\ktem\reasoning\simple.py", line 488, in retrieve
    retriever_docs = retriever_node(text=query)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\base.py", line 1097, in __call__
    raise e from None
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\base.py", line 1088, in __call__
    output = self.fl.exec(func, args, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\backends\base.py", line 151, in exec
    return run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\middleware.py", line 144, in __call__
    raise e from None
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\middleware.py", line 141, in __call__
    _output = self.next_call(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\middleware.py", line 117, in __call__
    return self.next_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\site-packages\theflow\base.py", line 1017, in _runx
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\kotaemon\libs\ktem\ktem\index\file\graph\pipelines.py", line 321, in run
    context_builder = self._build_graph_search()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\kotaemon\libs\ktem\ktem\index\file\graph\pipelines.py", line 181, in _build_graph_search
    list(output_path.iterdir()), key=lambda x: x.stem, reverse=True
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zeyan\AppData\Local\Programs\Python\Python311\Lib\pathlib.py", line 931, in iterdir
    for name in os.listdir(self):
                ^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 3] The system cannot find the path specified.: 'C:\\Users\\zeyan\\kotaemon\\ktem_app_data\\user_data\\files\\graphrag\\23e6f795-bfa6-4a4c-b3db-370a14f0df72\\output'
</FileNotFoundError: original message was in Chinese ("系统找不到指定的路径。"), translated above>
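The traceback shows `_build_graph_search` calling `output_path.iterdir()` without first checking that the GraphRAG output directory exists. A defensive guard could look like the sketch below; this is not the project's actual fix, and the helper name and return convention are assumptions based on the traceback:

```python
from pathlib import Path

def latest_graphrag_output(output_path: Path):
    """Return GraphRAG output entries sorted newest-first by stem,
    or None when indexing never produced an output directory.
    (Hypothetical helper mirroring the failing call in pipelines.py.)"""
    if not output_path.is_dir():
        # GraphRAG indexing likely failed or never ran for this file,
        # so there is nothing to iterate over
        return None
    return sorted(output_path.iterdir(), key=lambda x: x.stem, reverse=True)
```

With such a guard, the retriever could surface a clear "GraphRAG index not built" message instead of crashing mid-chat.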

Browsers

No response

OS

Windows

Additional information

No response

taprosoft commented 2 months ago

Please check this possibly duplicated issue: https://github.com/Cinnamon/kotaemon/issues/142#issuecomment-2315798693

Pedestrian-yu commented 1 month ago

This is not a duplicated issue. When we use GraphRAG to chat with an LLM provided by OpenAI, the software runs into trouble. I find that the DocumentRetrievalPipeline might point to the wrong folder, FSPath=PosixPath('/Users/sience/Desktop/kotaemon/ktem_app_data/user_data/files/index_1'), so it can't get the data. (I have successfully completed the GraphRAG collection.)
