TheAiSingularity / graphrag-local-ollama

Local model support for Microsoft's graphrag using ollama (llama3, mistral, gemma2, phi3) - LLM & embedding extraction
MIT License

Generating with num_threads on: sys:1: RuntimeWarning: coroutine 'to_thread' was never awaited / RuntimeWarning: Enable tracemalloc to get the object allocation traceback #66

Closed · worstkid92 closed 2 months ago

worstkid92 commented 2 months ago

Traceback:

```
├── create_base_text_units
└── create_base_extracted_entities
    └── Verb entity_extract ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1% 0:00:05 0:00:37
All tasks cancelled. Exiting...
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/__main__.py", line 76, in <module>
    index_cli(
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/cli.py", line 161, in index_cli
    _run_workflow_async()
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/cli.py", line 154, in _run_workflow_async
    runner.run(execute())
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/cli.py", line 123, in execute
    async for output in run_pipeline_with_config(
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/run.py", line 154, in run_pipeline_with_config
    async for table in run_pipeline(
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/run.py", line 323, in run_pipeline
    result = await workflow.run(context, callbacks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/workflow/workflow.py", line 369, in run
    timing = await self._execute_verb(node, context, callbacks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb
    result = await result
             ^^^^^^^^^^^^
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/verbs/entities/extraction/entity_extract.py", line 161, in entity_extract
    results = await derive_from_rows(
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows.py", line 33, in derive_from_rows
    return await derive_from_rows_asyncio_threads(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_asyncio_threads.py", line 40, in derive_from_rows_asyncio_threads
    return await derive_from_rows_base(input, transform, callbacks, gather)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_base.py", line 49, in derive_from_rows_base
    result = await gather(execute)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_asyncio_threads.py", line 38, in gather
    return await asyncio.gather(*[execute_task(task) for task in tasks])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_asyncio_threads.py", line 33, in execute_task
    async with semaphore:
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/asyncio/locks.py", line 15, in __aenter__
    await self.acquire()
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/asyncio/locks.py", line 387, in acquire
    await fut
asyncio.exceptions.CancelledError

sys:1: RuntimeWarning: coroutine 'to_thread' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
```
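For context on the warning at the end: `asyncio.to_thread(...)` returns a coroutine object immediately, and judging by the datashaper frames above, each row's coroutine waits behind a semaphore before being awaited. When the run is cancelled mid-flight, the coroutines still queued on the semaphore are destroyed without ever being awaited, which is exactly what Python then reports. A minimal standalone sketch of that pattern (an illustration only, not the actual datashaper code):

```python
import asyncio
import time

async def run_with_limit(semaphore: asyncio.Semaphore, coro):
    # Rough analogue of datashaper's execute_task (line 33 in the
    # traceback): each row's coroutine waits on the semaphore first.
    async with semaphore:
        return await coro

async def main():
    semaphore = asyncio.Semaphore(2)      # cf. parallelization.num_threads
    # asyncio.to_thread(...) only *creates* a coroutine object here;
    # nothing runs until it is awaited inside run_with_limit().
    coros = [asyncio.to_thread(time.sleep, 1) for _ in range(10)]
    tasks = [asyncio.create_task(run_with_limit(semaphore, c)) for c in coros]
    await asyncio.sleep(0.1)              # let the first two start working
    for t in tasks:
        t.cancel()                        # "All tasks cancelled. Exiting..."
    await asyncio.gather(*tasks, return_exceptions=True)
    # Tasks still parked on the semaphore are destroyed without ever
    # awaiting their to_thread coroutine -> the RuntimeWarning above.

asyncio.run(main())
```

Running this prints the same `RuntimeWarning: coroutine 'to_thread' was never awaited` at shutdown, so the warning looks like a symptom of the cancellation rather than its cause.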

Config:

```yaml
encoding_model: cl100k_base
skip_workflows: []

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3.1
  model_supports_json: true # recommended if this is available for your model.
  api_base: http://localhost:11434/v1

parallelization:
  stagger: 0.3
  num_threads: 10 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/api

chunks:
  size: 1000
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents
```
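Since the run dies about 1% into `entity_extract`, one cheap thing to rule out is whether the two `api_base` endpoints answer at all. A hypothetical smoke test, not part of graphrag (model names and URLs copied from the settings above; assumes `pip install openai requests` and a running `ollama serve`):

```python
# Smoke test: verify both api_base values from settings.yaml respond
# before re-running the index.
import requests
from openai import OpenAI

# Chat endpoint (llm.api_base): Ollama's OpenAI-compatible API.
# Ollama ignores the API key, so any placeholder string works.
chat = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
resp = chat.chat.completions.create(
    model="llama3.1",
    messages=[{"role": "user", "content": "ping"}],
)
print("chat ok:", resp.choices[0].message.content[:40])

# Embeddings endpoint (embeddings.llm.api_base): Ollama's native API.
emb = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "ping"},
    timeout=30,
)
emb.raise_for_status()
print("embeddings ok, dim:", len(emb.json()["embedding"]))
```

If either call fails here, the cancellations above are likely downstream of that rather than of the threading config.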