Local models support for Microsoft's graphrag using ollama (llama3, mistral, gemma2, phi3) - LLM & embedding extraction
Generating with num_threads on: sys:1: RuntimeWarning: coroutine 'to_thread' was never awaited #66
Traceback:

```
├── create_base_text_units
└── create_base_extracted_entities
    └── Verb entity_extract ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1% 0:00:05 0:00:37

All tasks cancelled. Exiting...
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/__main__.py", line 76, in <module>
    index_cli(
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/cli.py", line 161, in index_cli
    _run_workflow_async()
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/cli.py", line 154, in _run_workflow_async
    runner.run(execute())
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/cli.py", line 123, in execute
    async for output in run_pipeline_with_config(
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/run.py", line 154, in run_pipeline_with_config
    async for table in run_pipeline(
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/run.py", line 323, in run_pipeline
    result = await workflow.run(context, callbacks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/workflow/workflow.py", line 369, in run
    timing = await self._execute_verb(node, context, callbacks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb
    result = await result
             ^^^^^^^^^^^^
  File "/mnt/codes/graphrag/graphrag-local-ollama/graphrag/index/verbs/entities/extraction/entity_extract.py", line 161, in entity_extract
    results = await derive_from_rows(
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows.py", line 33, in derive_from_rows
    return await derive_from_rows_asyncio_threads(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_asyncio_threads.py", line 40, in derive_from_rows_asyncio_threads
    return await derive_from_rows_base(input, transform, callbacks, gather)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_base.py", line 49, in derive_from_rows_base
    result = await gather(execute)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_asyncio_threads.py", line 38, in gather
    return await asyncio.gather(*[execute_task(task) for task in tasks])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/site-packages/datashaper/execution/derive_from_rows_asyncio_threads.py", line 33, in execute_task
    async with semaphore:
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/asyncio/locks.py", line 15, in __aenter__
    await self.acquire()
  File "/mnt/anaconda3_install/envs/graphrag-ollama-local/lib/python3.11/asyncio/locks.py", line 387, in acquire
    await fut
asyncio.exceptions.CancelledError
sys:1: RuntimeWarning: coroutine 'to_thread' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
```
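From the traceback, the RuntimeWarning looks like fallout from the cancellation rather than its cause: the threaded gather creates one `asyncio.to_thread(...)` coroutine per row up front, and each task waits on a semaphore (sized by `num_threads`) before awaiting its coroutine. When one worker fails and everything is cancelled, the tasks still parked in `locks.py:387` never reach their coroutines, so those `to_thread` objects are garbage-collected un-awaited. A minimal standalone sketch of that pattern (illustrative names, paraphrased from the traceback rather than datashaper's actual code):

```python
import asyncio
import time

def work(i: int) -> int:
    time.sleep(0.2)             # slow blocking call, e.g. an LLM request
    return i * 2

async def main() -> None:
    sem = asyncio.Semaphore(1)  # stand-in for the num_threads semaphore
    # Coroutine objects are created up front, before any is awaited.
    coros = [asyncio.to_thread(work, i) for i in range(10)]

    async def execute_task(coro):
        async with sem:         # most tasks park here (locks.py:387 in the traceback)
            return await coro

    tasks = [asyncio.create_task(execute_task(c)) for c in coros]
    await asyncio.sleep(0.05)   # let the first worker start
    for t in tasks:
        t.cancel()              # mimic "All tasks cancelled. Exiting..."
    await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(main())
# On exit, Python reports once per orphaned coroutine:
#   sys:1: RuntimeWarning: coroutine 'to_thread' was never awaited
```

So the thing to chase is whichever exception triggered the cancellation in the first place; it usually lands in the run's reports/indexing-engine.log rather than on the console.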
Config:

```yaml
encoding_model: cl100k_base
skip_workflows: []

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3.1
  model_supports_json: true # recommended if this is available for your model.
  api_base: http://localhost:11434/v1

parallelization:
  stagger: 0.3
  num_threads: 10 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/api

chunks:
  size: 1000
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents
```
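The config matches this repo's sample settings (OpenAI-compatible /v1 base for chat, Ollama's native /api base for embeddings), so one thing worth ruling out is the LLM calls themselves failing under concurrency and tearing the run down. A quick sanity check of both endpoints outside graphrag (a sketch assuming the default localhost setup and the model names from the config above):

```python
import requests

# Chat completion via Ollama's OpenAI-compatible endpoint (llm.api_base + /chat/completions)
r = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={"model": "llama3.1",
          "messages": [{"role": "user", "content": "ping"}]},
    timeout=120,
)
print(r.status_code, r.json()["choices"][0]["message"]["content"][:60])

# Embedding via Ollama's native API (embeddings.llm.api_base + /embeddings)
r = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "ping"},
    timeout=120,
)
print(r.status_code, len(r.json()["embedding"]))
```

If both respond, this may be a load issue: unless OLLAMA_NUM_PARALLEL is raised, Ollama queues concurrent requests, so with num_threads: 10 some calls can sit long enough to hit graphrag's request timeout and fail. Dropping num_threads to 1 (or switching async_mode to asyncio) is a cheap way to test that.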