microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
12.89k stars 1.08k forks

[Ollama][Other] GraphRAG OSS LLM community support #339

Closed: samalanubhab closed this issue 2 days ago

samalanubhab commented 3 weeks ago

What I tried: I ran this on my local GPU and tried replacing the api_base in the settings.yaml file to point at a model served on ollama:

model: llama3:latest
api_base: http://localhost:11434/v1 #https://.openai.azure.com

Error: graphrag.index.reporting.file_workflow_callbacks INFO Error Invoking LLM details={'input': '\n-Goal-\nGiven a text document that is pot....}

Commands:

initialize

python -m graphrag.index --init --root .

index

python -m graphrag.index --root .

query

python -m graphrag.query --root . --method global "query"

query

python -m graphrag.query --root . --method local "query"

Does graphrag support other LLM-hosting server frameworks?

aaronrsiphone commented 3 weeks ago

Calm down, no need to yell.

Looking at the logs, it looks like the port is being stripped from the api_base:

settings.yaml -> api_base: "http://127.0.0.1:5000/v1"
errors in logs: http://127.0.0.1/v1/chat/completions
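
A quick way to confirm whether the port is being honored is to hit Ollama's OpenAI-compatible endpoint directly. A minimal sketch, assuming the openai Python package (v1+) and a local Ollama serving llama3; the api_key value is arbitrary since Ollama ignores it:

from openai import OpenAI

# If the port is honored, this returns a completion; if it is stripped,
# you get a connection error against http://127.0.0.1/v1 instead.
client = OpenAI(base_url="http://127.0.0.1:11434/v1", api_key="ollama")
resp = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)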

dx111ge commented 3 weeks ago

Calm down, no need to yell.

Looking at the logs, it looks like the port is being stripped from the api_base:

settings.yaml -> api_base: "http://127.0.0.1:5000/v1"
errors in logs: http://127.0.0.1/v1/chat/completions

Sorry to question this, but I have the same error and I'm not sure what you are trying to say. At least I did try api_base: http://127.0.0.1:5000/v1 and api_base: http://localhost:11434/v1, but I get the same error in both cases (btw: LLM -> llama3, embedding -> nomic-embed-text).
Thanks for the clarification.

bmaltais commented 3 weeks ago

This is annoying... I just tried switching to ollama because my 1st attempt at running the solution against chat-gpt cost me $45 and did not work in the end... so I don't want to waste money testing things like that. I would rather take it slow and steady locally until I get the hang of it, and switch to a paid model if needed...

How can we force the port to stay? I installed using pip install graphrag... I wish I knew which file to hack to keep the port intact.

dx111ge commented 3 weeks ago

This is annoying... I just tried switching to ollama because my 1st attempt at running the solution against chat-gpt cost me $45 and did not work in the end... so I don't want to waste money testing things like that. I would rather take it slow and steady locally until I get the hang of it, and switch to a paid model if needed...

How can we force the port to stay? I installed using pip install graphrag... I wish I knew which file to hack to keep the port intact.

OLLAMA_HOST=127.0.0.1:11435 ollama serve ... now we just need to know which port graphrag is looking for

bmaltais commented 3 weeks ago

Good news. I got it started. The key was using the right config to set concurrent requests to 1:

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 2000
  # request_timeout: 180.0
  api_base: http://localhost:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 1 # the number of parallel inflight requests that may be made

SeppeVanSteenbergen commented 3 weeks ago

I also managed to get the entity extraction working with Ollama. However, the embeddings are trickier because Ollama has no OpenAI-compatible embeddings API. Has anyone found a workaround for this yet?
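
For reference, Ollama does expose a native (non-OpenAI-compatible) embeddings route. A minimal sketch of calling it directly, assuming the requests package and a pulled nomic-embed-text model:

import requests

# Ollama's native endpoint takes a single prompt and returns
# {"embedding": [...]} rather than the OpenAI {"data": [...]} shape.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello world"},
)
resp.raise_for_status()
print(len(resp.json()["embedding"]))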

bmaltais commented 3 weeks ago

I also managed to get the entity extraction working with Ollama. However, the embeddings are trickier because Ollama has no OpenAI-compatible embeddings API. Has anyone found a workaround for this yet?

Is that the cause of this error after processing the entities?

🚀 Reading settings from settings.yaml
H:\llm_stuff\graphrag\venv\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be    
removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
🚀 create_base_text_units
                                  id                                              chunk  ...                        document_ids n_tokens   
0   c6b76a5684badf7d2437c09ab8b5b099  DE TAI L E D DE S I G N SP E CI F I CAT I O N ...  ...        300
1   86c6c7ef7455630118790c9325ccac7d   Unclassified Status: DRFT Subject: SSC NAMING...  ...        300
2   55644895acb440fe7ce68b445aca9340  c – first draft SSC Cloud R&D\nv0.6 2019-09-16...  ...        300
3   437b3490966fcf3d1d8f0f89e3b0209a  .2 2019-12-14 TBS Feedback Updated TBS Governa...  ...        300
4   a60f3dd9757d19a311161ec3ff5d5cd3  12-29 SSC Cloud teams\nReplace field value tab...  ...        300
..                               ...                                                ...  ...                                 ...      ...   
86  11feb9a0b5f521cf93e7a9a06925e3ad   dependencies for maintenance (i.e. windows, p...  ...        300
87  f73a3685cdb77d0345371e90ace81ef3   uses Enterprise Control Desk (ECD) resolver g...  ...        300
88  850737c0488f6fdad4780cb0f4e7e98e  s Canadian Nuclear Safety Commission 12 CSA Sa...  ...        300
89  f1f210b06012245d9a9ec4d6672f1536  : DRFT Subject: SSC NAMING AND TAGGING STANDAR...  ...        212
90  4c9136411292f396d8545750d87c4ed2   Health Canada (Department of)\nTable 15: Depa...  ...         12

[91 rows x 5 columns]
🚀 create_base_extracted_entities
                                        entity_graph
0  <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_summarized_entities
                                        entity_graph
0  <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_base_entity_graph
   level                                    clustered_graph
0      0  <graphml xmlns="http://graphml.graphdrawing.or...
1      1  <graphml xmlns="http://graphml.graphdrawing.or...
H:\llm_stuff\graphrag\venv\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be    
removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
โŒ create_final_entities
None
⠦ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
└── create_final_entities
❌ Errors occurred during the pipeline run, see logs for more details.

bmaltais commented 3 weeks ago

I configured mine as:

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/v1
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    concurrent_requests: 1 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional

bmaltais commented 3 weeks ago

The crash log states:

08:57:11,537 datashaper.workflow.workflow ERROR Error executing verb "text_embed" in create_final_entities: 404 page not found
Traceback (most recent call last):
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\datashaper\workflow\workflow.py", line 415, in _execute_verb
    result = await result
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\index\verbs\text\embed\text_embed.py", line 105, in text_embed
    return await _text_embed_in_memory(
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\index\verbs\text\embed\text_embed.py", line 130, in _text_embed_in_memory
    result = await strategy_exec(texts, callbacks, cache, strategy_args)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\index\verbs\text\embed\strategies\openai.py", line 61, in run
    embeddings = await _execute(llm, text_batches, ticker, semaphore)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\index\verbs\text\embed\strategies\openai.py", line 105, in _execute
    results = await asyncio.gather(*futures)
  File "C:\Users\berna\AppData\Local\Programs\Python\Python310\lib\asyncio\tasks.py", line 304, in __wakeup
    future.result()
  File "C:\Users\berna\AppData\Local\Programs\Python\Python310\lib\asyncio\tasks.py", line 232, in __step
    result = coro.send(None)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\index\verbs\text\embed\strategies\openai.py", line 99, in embed
    chunk_embeddings = await llm(chunk)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\caching_llm.py", line 104, in __call__
    result = await self._delegate(input, **kwargs)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\rate_limiting_llm.py", line 177, in __call__
    result, start = await execute_with_retry()
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\rate_limiting_llm.py", line 159, in execute_with_retry
    async for attempt in retryer:
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\tenacity\asyncio\__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self._retry_state)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\tenacity\asyncio\__init__.py", line 153, in iter
    result = await action(retry_state)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\tenacity\_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\tenacity\__init__.py", line 398, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "C:\Users\berna\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 451, in result
    return self.__get_result()
  File "C:\Users\berna\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\rate_limiting_llm.py", line 165, in execute_with_retry
    return await do_attempt(), start
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\rate_limiting_llm.py", line 147, in do_attempt
    return await self._delegate(input, **kwargs)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\base_llm.py", line 49, in __call__
    return await self._invoke(input, **kwargs)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\base\base_llm.py", line 53, in _invoke
    output = await self._execute_llm(input, **kwargs)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\graphrag\llm\openai\openai_embeddings_llm.py", line 36, in _execute_llm
    embedding = await self.client.embeddings.create(
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\openai\resources\embeddings.py", line 215, in create
    return await self._post(
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\openai\_base_client.py", line 1816, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\openai\_base_client.py", line 1514, in request
    return await self._request(
  File "H:\llm_stuff\graphrag\venv\lib\site-packages\openai\_base_client.py", line 1610, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: 404 page not found

and ollama log show:

[GIN] 2024/07/03 - 08:57:11 | 404 | 0s | 127.0.0.1 | POST "/v1/embeddings"

dx111ge commented 3 weeks ago

(quoting bmaltais's embeddings config above)

Use api instead of v1 👍

14:55:29,949 graphrag.index.verbs.text.embed.strategies.openai INFO embedding 9 inputs via 9 snippets using 1 batches. max_batch_size=16, max_tokens=8191
14:55:31,373 httpx INFO HTTP Request: POST http://127.0.0.1:11434/api/embeddings "HTTP/1.1 200 OK"
14:55:31,375 graphrag.index.reporting.file_workflow_callbacks INFO Error Invoking LLM details={'input': ['"THE TEAM":"The team is portrayed as a group of individuals who have transitioned from passive observers to active participants in a mission, showing a dynamic change in their role."', '"WASHINGTON":', '"OPERATION: DULCE":', '"ALEX":"Alex is the leader of a team attempting first contact with an unknown intelligence, acknowledging the significance of their task."', '"CONTROL":"Control refers to the ability to manage or govern, which is challenged by an intelligence that writes its own rules."', '"INTELLIGENCE":"Intelligence here refers to an unknown entity capable of writing its own rules and learning to communicate."', '"FIRST CONTACT":"First Contact is the potential initial communication between humanity and an unknown intelligence."', '"SAM RIVERA":', '"HUMANITY\' RESPONSE":']}
14:55:31,375 datashaper.workflow.workflow ERROR Error executing verb "text_embed" in create_final_entities: 'NoneType' object is not iterable

At least I get a 200 OK for the embedding now, but the response format seems wrong: Ollama's native /api/embeddings returns {"embedding": [...]} rather than the OpenAI-style {"data": [{"embedding": ...}]}, so the client has nothing to iterate.

SeppeVanSteenbergen commented 3 weeks ago

Yes, there is no embeddings endpoint under the OpenAI-compatible v1 server within Ollama. They are actively working on this: https://github.com/ollama/ollama/pull/5285
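
Once that PR lands, the stock OpenAI-style call graphrag makes should work against Ollama unchanged. A sketch of that call, assuming openai>=1.x and nomic-embed-text pulled locally:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is ignored by Ollama
emb = client.embeddings.create(model="nomic-embed-text", input=["hello world"])
print(len(emb.data[0].embedding))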

SeppeVanSteenbergen commented 3 weeks ago

Indeed, so I also tried the plain api endpoint like @dx111ge and am having the same problem with the embedding output.

bmaltais commented 3 weeks ago

I also figured out the v1 <-> api swap and am now stuck with the same final error...

dx111ge commented 3 weeks ago

I tried all 3 ollama embedding models (mxbai-embed-large, nomic-embed-text and all-minilm); same error.

bmaltais commented 3 weeks ago

The weird thing... I reverted the embeddings to openai... but it tries to connect to ollama instead... like it is picking up the api_base from the llm config used for the entities... I wonder what the right api_base is for openai embeddings... maybe we need to set it explicitly if we use a custom one for the llm?

bmaltais commented 3 weeks ago

OK, I have been able to specify the openai embeddings API (https://api.openai.com/v1) and moved past that point... but now it is failing at

└── create_final_community_reports
    └── Verb create_community_reports ━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━  55% 0:00:29 0:00:46

Running out of memory on my 3090... I tried reducing the max_input_length to no avail:

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 4000

dx111ge commented 3 weeks ago

OK, I have been able to specify the openai embeddings API (https://api.openai.com/v1) and moved past that point... but now it is failing at

└── create_final_community_reports
    └── Verb create_community_reports ━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━  55% 0:00:29 0:00:46

Running out of memory on my 3090... I tried reducing the max_input_length to no avail:

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 4000

Can you please explain what you did to fix the embeddings issue?

bmaltais commented 3 weeks ago

Here is my final config. Somehow, after VSCode crashed and I restarted it, the summary reports started working.

Here is the full config that works so far:


encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 2000
  # request_timeout: 180.0
  api_base: http://localhost:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  max_retries: 1
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 1 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-3-small
    api_base: https://api.openai.com/v1
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    max_retries: 1
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    concurrent_requests: 1 # the number of parallel inflight requests that may be made
    batch_size: 1 # the number of documents to send in a single request
    batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 7000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32

Essentially, I use llama3 locally via ollama for the entities and openai embeddings (much cheaper) until we have a solution to use ollama.

bmaltais commented 3 weeks ago

I am sure the config could be optimised... but this is working at the moment... now I need to test the query part ;-)

bmaltais commented 3 weeks ago

Well... looks like I can't query the results. I keep getting VRAM errors on my 3090... so all this, only to not be able to query ;-(

antiblahblah commented 3 weeks ago

Essentially, I use llama3 locally via ollama for the entities and openai embeddings (much cheaper) until we have a solution to use ollama.

OpenAI's embeddings are quite expensive too...

bmaltais commented 3 weeks ago

I figured out the issue with the query... somehow the youtube video I was following was using the "wrong" syntax?

Did not work: python -m graphrag.query --root . --method global "What are the highlights of the naming convention"

Worked: python -m graphrag.query --data .\output\20240703-084750\artifacts\ --method global "What are the highlights of the naming convention"
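
If I understand the CLI correctly, --data points the query engine directly at a specific artifacts folder, while --root expects to discover the settings and latest output under a project directory, so passing --data explicitly sidesteps any mismatch in the output layout.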

KarimJedda commented 3 weeks ago

@bmaltais thanks a lot. That works.

I think vllm has embeddings now, I will try that tonight for a fully local setup 👍

bmaltais commented 3 weeks ago

Quick update... Some of the issues I was having were related to the fact that my 1st attempt at running graphrag was leveraging chatgpt-4o. It ended up creating a lot of files in the cache folder that then got mixed with llama3-generated files. Overall this caused significant issues.

After deleting the cache folder and re-indexing everything I was able to properly query the graph with:

python -m graphrag.query --method local --root . "What does Identity Lifecycle concist of?"

I still have not found an easy solution to generating embeddings locally.

XinyuShe commented 2 weeks ago

(quoting bmaltais's final config and comment above)

I use your settings and the default text, and did not change anything else, but I still get:

โŒ create_final_entities
None
โ น GraphRAG Indexer 
โ”œโ”€โ”€ Loading Input (InputFileType.text) - 1 files loaded (0 filtered) โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 100% 0:00:00 0:00:00
โ”œโ”€โ”€ create_base_text_units
โ”œโ”€โ”€ create_base_extracted_entities
โ”œโ”€โ”€ create_summarized_entities
โ”œโ”€โ”€ create_base_entity_graph
โ””โ”€โ”€ create_final_entities
โŒ Errors occurred during the pipeline run, see logs for more details.
AntoninLeroy commented 2 weeks ago

Any success using a vllm inference endpoint for local LLMs?

vamshi-rvk commented 2 weeks ago

(quoting XinyuShe's comment above)

You will need to serve ollama first:

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3
ollama serve

adieyal commented 2 weeks ago

(quoting XinyuShe's comment and vamshi-rvk's reply above)

I'm encountering the same error - my ollama is set up and was used in the initial steps (nvidia-smi showed gpu usage). This error pops up around 20 minutes in.

EDIT: I forgot to edit the api_base under the embeddings section of settings.yaml - it was trying to access embeddings at the ollama endpoint.

bmaltais commented 2 weeks ago

You can’t currently use ollama to do the embeddings… this is why it fails.

adieyal commented 2 weeks ago

You can’t currently use ollama to do the embeddings… this is why it fails.

thanks - just noticed this - working now

miludedeng commented 2 weeks ago

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/api/ # run nomic via ollama

This is my config, but I got:

ERROR Error executing verb "text_embed" in create_final_entities: 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb
    result = await result
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/index/verbs/text/embed/text_embed.py", line 105, in text_embed
    return await _text_embed_in_memory(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/index/verbs/text/embed/text_embed.py", line 130, in _text_embed_in_memory
    result = await strategy_exec(texts, callbacks, cache, strategy_args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 61, in run
    embeddings = await _execute(llm, text_batches, ticker, semaphore)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 105, in _execute
    results = await asyncio.gather(*futures)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 99, in embed
    chunk_embeddings = await llm(chunk)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/caching_llm.py", line 104, in __call__
    result = await self._delegate(input, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 177, in __call__
    result, start = await execute_with_retry()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 159, in execute_with_retry
    async for attempt in retryer:
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self._retry_state)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/tenacity/__init__.py", line 398, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 165, in execute_with_retry
    return await do_attempt(), start
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 147, in do_attempt
    return await self._delegate(input, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/base_llm.py", line 49, in __call__
    return await self._invoke(input, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/base/base_llm.py", line 53, in _invoke
    output = await self._execute_llm(input, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/openai/openai_embeddings_llm.py", line 36, in _execute_llm
    embedding = await self.client.embeddings.create(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/resources/embeddings.py", line 215, in create
    return await self._post(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/_base_client.py", line 1816, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/_base_client.py", line 1514, in request
    return await self._request(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/_base_client.py", line 1612, in _request
    return await self._process_response(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/_base_client.py", line 1704, in _process_response
    return await api_response.parse()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/_response.py", line 419, in parse
    parsed = self._options.post_parser(parsed)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/graphrag/lib/python3.11/site-packages/openai/resources/embeddings.py", line 203, in parser
    for embedding in obj.data:
TypeError: 'NoneType' object is not iterable

hemangjoshi37a commented 2 weeks ago

Has anyone got it working with DeepSeek Coder V2? If yes, please let me know.

Ravikumaryadav22 commented 2 weeks ago

Screenshot 2024-07-08 194340

Ravikumaryadav22 commented 2 weeks ago

Screenshot 2024-07-08 194340

Please, can anyone help me resolve this error?

vamshi-rvk commented 2 weeks ago

this worked for me https://github.com/TheAiSingularity/graphrag-local-ollama

AlonsoGuevara commented 2 weeks ago

I'm making this thread as our official discussion place for Ollama and other OSS models set up. A bit redundant with #345 but both have great quality discussions!

Thanks for all the support and enthusiasm!

shreyn07 commented 2 weeks ago

โŒ create_final_community_reports None โ ™ GraphRAG Indexer โ”œโ”€โ”€ Loading Input (InputFileType.text) - 1 files loaded (0 filtered) 100% โ”œโ”€โ”€ create_base_text_units โ”œโ”€โ”€ create_base_extracted_entities โ”œโ”€โ”€ create_summarized_entities โ”œโ”€โ”€ create_base_entity_graph โ”œโ”€โ”€ create_final_entities โ”œโ”€โ”€ create_final_nodes โ”œโ”€โ”€ create_final_communities โ”œโ”€โ”€ join_text_units_to_entity_ids โ”œโ”€โ”€ create_final_relationships โ”œโ”€โ”€ join_text_units_to_relationship_ids โ””โ”€โ”€ create_final_community_reports โŒ Errors occurred during the pipeline run, see logs for more details.

C:\Users\shrnema\Downloads\graphrag-main\GRAPHRAG>

File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\range.py", line 417, in get_loc raise KeyError(key) KeyError: 'community' 11:17:11,578 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None

Please help me to fix this

Ayan-sh03 commented 2 weeks ago

You could use this server: https://github.com/michaelfeil/infinity. It seems to work perfectly.

Lucidology commented 2 weeks ago

this worked for me https://github.com/TheAiSingularity/graphrag-local-ollama

This is the solution!!!

Csaba8472 commented 2 weeks ago

For some reason, local models on Windows only work with Python 3.10, at least for me. 3.12 gave the error above.

shreyn07 commented 2 weeks ago

After changing Python, does it work for you?

itchenfei commented 2 weeks ago

After changing Python, does it work for you?

After switching to Python version 3.10.6, I encountered the same problem.

94ysc commented 1 week ago

from fastapi import FastAPI
from llama_index.embeddings.ollama import OllamaEmbedding
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    encoding_format: str
    input: list
    model: str

# Expose an OpenAI-compatible /v1/embeddings endpoint backed by Ollama
@app.post("/v1/embeddings")
async def root(item: Item):
    ollama_embedding = OllamaEmbedding(
        model_name=item.model,
        base_url="http://localhost:11434",
        ollama_additional_kwargs={"mirostat": 0},
    )

    pass_embedding = ollama_embedding.get_text_embedding_batch(
        [str(x) for x in item.input], show_progress=True
    )
    # Wrap each embedding in the OpenAI response schema
    data = []
    for index, embedding in enumerate(pass_embedding):
        data.append(
            {
                "object": "embedding",
                "index": index,
                "embedding": embedding,
            }
        )
    return {
        "object": "list",
        "data": data,
        "model": "text-embedding-3-small",
        "usage": {
            "prompt_tokens": 5,
            "total_tokens": 5
        }
    }

My approach: use FastAPI to add an adapter service in front of ollama.
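
To try it (assuming the adapter above is saved as app.py and uvicorn is installed): run uvicorn app:app --port 8000, then point the embeddings api_base in settings.yaml at http://localhost:8000/v1.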

s106916 commented 1 week ago

This is a temporary, hacked-together solution for ollama: https://github.com/s106916/graphrag

MarkJGx commented 1 week ago

I have a development version of Ollama v1/embeddings OpenAI-like API working. I am currently inferencing mxbai-embed-large through Ollama on macOS (M1 Max, 32GB).


It's https://github.com/ollama/ollama/pull/5285 rebased on https://github.com/ollama/ollama/pull/5285. I followed the development compilation instructions outlined here: https://github.com/ollama/ollama/blob/main/docs/development.md

My rebased branch is available here: https://github.com/MarkJGx/ollama/

dengkeshun commented 1 week ago

Has anyone seen that global query works but local query doesn't?

Ravikumaryadav22 commented 1 week ago

(quoting XinyuShe's comment and vamshi-rvk's reply above)

I followed your steps and am still getting the same error; it's showing "Error Invoking LLM".

yurochang commented 6 days ago

"Error Invoking LLM"-- fixed by using LM studio in embedding part.

successfully build the graph, BUT can global search , CAN NOT local search:

ZeroDivisionError: Weights sum to zero, can't be normalized
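
That ZeroDivisionError comes straight out of numpy's weighted average. A minimal reproduction (a sketch, not graphrag code; the likely trigger is that every retrieved candidate's similarity weight ends up zero, e.g. because the query embeddings don't match the ones used at index time):

import numpy as np

# np.average normalizes by the sum of the weights, so an all-zero
# weight vector raises the exact error local search reports.
np.average([1.0, 2.0], weights=[0.0, 0.0])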

VamshikrishnaAluwala commented 5 days ago

[4 rows x 10 columns]
🚀 join_text_units_to_relationship_ids
                                 id                                   relationship_ids
0  63575d4c37be57321538f1938c2fece6  [dde131ab575d44dbb55289a6972be18f, de9e343f2e3...
❌ create_final_community_reports
None
⠠ GraphRAG Indexer
├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
├── create_final_entities
├── create_final_nodes
├── create_final_communities
├── join_text_units_to_entity_ids
├── create_final_relationships
├── join_text_units_to_relationship_ids
└── create_final_community_reports
❌ Errors occurred during the pipeline run, see logs for more details.
Error during GraphRAG setup: Command 'python -m graphrag.index --root ./ragtest' returned non-zero exit status 1.

xxll88 commented 3 days ago

โŒ create_final_community_reportsๅˆ›ๅปบๆœ€็ปˆ็คพๅŒบๆŠฅๅ‘Š Noneย  ๆฒกๆœ‰ไธ€ โ ™ GraphRAG Indexerย  GraphRAG็ดขๅผ•ๅ™จ โ”œโ”€โ”€ Loading Input (InputFileType.text) - 1 files loaded (0 filtered) 100% โ”€โ”€ๆญฃๅœจๅŠ ่ฝฝ่พ“ๅ…ฅ๏ผˆInputFileType.text๏ผ‰-ๅทฒๅŠ ่ฝฝ1ไธชๆ–‡ไปถ๏ผˆ0ไธชๅทฒ่ฟ‡ๆปค๏ผ‰100% โ”œโ”€โ”€ create_base_text_units ๅˆ›ๅปบๅŸบๆœฌๆ–‡ๆœฌๅ•ไฝ โ”œโ”€โ”€ create_base_extracted_entities ๅˆ›ๅปบๅŸบๆœฌๆๅ–ๅฎžไฝ“ โ”œโ”€โ”€ create_summarized_entities ๅˆ›ๅปบๆฑ‡ๆ€ปๅฎžไฝ“ โ”œโ”€โ”€ create_base_entity_graph ๅˆ›ๅปบๅŸบๆœฌๅฎžไฝ“ๅ›พ โ”œโ”€โ”€ create_final_entities ๅˆ›ๅปบๆœ€็ปˆๅฎžไฝ“ โ”œโ”€โ”€ create_final_nodesย  ๅˆ›ๅปบๆœ€็ปˆ่Š‚็‚น โ”œโ”€โ”€ create_final_communities ๅˆ›ๅปบๆœ€็ปˆ็คพๅŒบ โ”œโ”€โ”€ join_text_units_to_entity_ids ่ฟžๆŽฅๆ–‡ๆœฌๅ•ๅ…ƒๅˆฐๅฎžไฝ“id โ”œโ”€โ”€ create_final_relationships ๅˆ›ๅปบๆœ€็ปˆๅ…ณ็ณป โ”œโ”€โ”€ join_text_units_to_relationship_ids ่ฟžๆŽฅๆ–‡ๆœฌๅ•ๅ…ƒๅˆฐๅ…ณ็ณปID โ””โ”€โ”€ create_final_community_reports ๅˆ›ๅปบๆœ€็ปˆ็คพๅŒบๆŠฅๅ‘Š โŒ Errors occurred during the pipeline run, see logs for more details. ๅœจ็ฎก้“่ฟ่กŒๆœŸ้—ดๅ‘็”Ÿ้”™่ฏฏ๏ผŒ่ฏทๅ‚้˜…ๆ—ฅๅฟ—ไปฅไบ†่งฃๆ›ดๅคš่ฏฆ็ป†ไฟกๆฏใ€‚

C:\Users\shrnema\Downloads\graphrag-main\GRAPHRAG>C๏ผš\Users\shrnema\Downloads\graphrag-main\GRAPHRAG>

File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\range.py", line 417, in get_locๆ–‡ไปถโ€œC๏ผš\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\range.pyโ€๏ผŒ็ฌฌ417่กŒ๏ผŒๅœจget_locไธญ raise KeyError(key)ย  raise KeyError๏ผˆkey๏ผ‰ KeyError: 'community'ย  KeyError๏ผš'community' 11:17:11,578 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None 11๏ผš17๏ผš11๏ผŒ578 graphrag.index.reporting.file_workflow_callbacks INFO่ฟ่กŒ็ฎก้“ๆ—ถๅ‡บ้”™๏ผ่ฏฆ็ป†ไฟกๆฏ=ๆ— 

Please help me to fix this่ฏทๅธฎๆˆ‘ๆŠŠ่ฟ™ไธชไฟฎๅฅฝ

same issue