microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

Unknown workflow: entity_extraction when running graphrag\examples\entity_extraction\with_graph_intelligence\run.py #353

Closed thomasjlittle closed 1 month ago

thomasjlittle commented 3 months ago

Hi, sorry if I have overlooked a simple solution to this, but I am running into the following error when trying to run the example scripts for entity extraction with graph intelligence.

Example file: graphrag\examples\entity_extraction\with_graph_intelligence\run.py

Exception has occurred: UnknownWorkflowError
Unknown workflow: entity_extraction
  File "C:\Users\user\dev\MSGraphRag\graphrag\examples\entity_extraction\with_graph_intelligence\run.py", line 95, in run_python
    async for table in run_pipeline(dataset=dataset, workflows=workflows):
  File "C:\Users\user\dev\MSGraphRag\graphrag\examples\entity_extraction\with_graph_intelligence\run.py", line 107, in <module>
    asyncio.run(run_python())
graphrag.index.errors.UnknownWorkflowError: Unknown workflow: entity_extraction

It looks like the 'entity_extraction' workflow is not defined in the default_workflows.py file or in the workflows.v1 folder.

martinpenchev commented 3 months ago

I guess they changed the names. I got it working like this:

workflows: list[PipelineWorkflowReference] = [
    PipelineWorkflowReference(
        name="create_base_extracted_entities",
        config={
            "entity_extract": {
                "strategy": {
                    "type": "nltk",
                }
            }
        },
    ),
    PipelineWorkflowReference(
        name="create_base_entity_graph",
        config={
            "cluster_graph": {"strategy": {"type": "leiden"}},
            "embed_graph": {
                "strategy": {
                    "type": "node2vec",
                    "num_walks": 10,
                    "walk_length": 40,
                    "window_size": 2,
                    "iterations": 3,
                    "random_seed": 597832,
                }
            },
            "layout_graph": {
                "strategy": {
                    "type": "umap",
                },
            },
        },
    ),
]
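
To make the nesting of the config above easier to check, the following sketch walks a list of workflow references and reports which strategy type each config key selects. `WorkflowRef` is a hypothetical stand-in with only the `name` and `config` fields used in the snippet, not graphrag's `PipelineWorkflowReference`, and `strategy_types` is an illustrative helper, so this runs without graphrag installed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WorkflowRef:
    """Hypothetical stand-in for PipelineWorkflowReference (name and config only)."""

    name: str
    config: dict[str, Any] = field(default_factory=dict)


def strategy_types(workflows: list[WorkflowRef]) -> dict[tuple[str, str], str]:
    """Map (workflow name, config key) to the strategy type configured for it."""
    result: dict[tuple[str, str], str] = {}
    for wf in workflows:
        for key, settings in wf.config.items():
            # Skip entries that do not follow the {"strategy": {"type": ...}} shape.
            strategy = settings.get("strategy", {}) if isinstance(settings, dict) else {}
            if "type" in strategy:
                result[(wf.name, key)] = strategy["type"]
    return result


# Same names and strategy types as in the snippet above (node2vec parameters omitted).
workflows = [
    WorkflowRef(
        name="create_base_extracted_entities",
        config={"entity_extract": {"strategy": {"type": "nltk"}}},
    ),
    WorkflowRef(
        name="create_base_entity_graph",
        config={
            "cluster_graph": {"strategy": {"type": "leiden"}},
            "embed_graph": {"strategy": {"type": "node2vec"}},
            "layout_graph": {"strategy": {"type": "umap"}},
        },
    ),
]

if __name__ == "__main__":
    for (workflow, key), kind in strategy_types(workflows).items():
        print(f"{workflow}.{key}: {kind}")
```

A summary like this can help spot a misplaced key or a missing "strategy" level before the pipeline is started.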

github-actions[bot] commented 2 months ago

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

TudorAndrei commented 1 month ago

It doesn't work for me either, even with martinpenchev's code snippet.

vinayybhore commented 1 month ago

Me too!

xinzheng99 commented 1 month ago

If you have a suitable CSV file, could you provide a copy? I can't find the input CSV file for that program in the code. Thank you very much. @martinpenchev

github-actions[bot] commented 1 month ago

This issue has been closed after being marked as stale for five days. Please reopen if needed.