microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
17.11k stars 1.61k forks

[Bug]: Error when calling `run_local_search` from `cli.py` when the Streamlit app reruns. #947

Closed: dinhngoc267 closed this issue 4 days ago

dinhngoc267 commented 4 weeks ago

Do you need to file an issue?

Describe the bug

from graphrag.query.cli import run_local_search
result = run_local_search(root_dir="/home/nld/kgqa_graphrag/ragtest", query=prompt)
cli.py:

def run_local_search(
    query: str,
    data_dir: str = "/home/nld/kgqa_graphrag/ragtest/output/20240815-071015/artifacts",
    root_dir: str = "/home/nld/kgqa_graphrag/ragtest",
    config_dir: str = "/home/nld/kgqa_graphrag/ragtest/settings.yaml",
    community_level: int = 2,
    response_type: str = "Multiple Paragraphs",
):
    ...  # function body not included in this report

When Streamlit reruns the script, it raises this error:

  File "/HDD/.conda/envs/nld-kgqa-graphrag/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 85, in exec_func_with_error_handling
    result = func()
  File "/HDD/.conda/envs/nld-kgqa-graphrag/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 576, in code_to_exec
    exec(code, module.__dict__)
  File "/home/nld/kgqa_graphrag/app.py", line 6, in <module>
    from graphrag.query.cli import run_local_search
  File "/home/nld/kgqa_graphrag/graphrag/query/cli.py", line 17, in <module>
    from graphrag.index.progress import PrintProgressReporter
  File "/home/nld/kgqa_graphrag/graphrag/index/__init__.py", line 40, in <module>
    from .run import run_pipeline, run_pipeline_with_config
  File "/home/nld/kgqa_graphrag/graphrag/index/run.py", line 59, in <module>
    from .verbs import *  # noqa
  File "/home/nld/kgqa_graphrag/graphrag/index/verbs/__init__.py", line 6, in <module>
    from .covariates import extract_covariates
  File "/home/nld/kgqa_graphrag/graphrag/index/verbs/covariates/__init__.py", line 6, in <module>
    from .extract_covariates import extract_covariates
  File "/home/nld/kgqa_graphrag/graphrag/index/verbs/covariates/extract_covariates/__init__.py", line 6, in <module>
    from .extract_covariates import ExtractClaimsStrategyType, extract_covariates
  File "/home/nld/kgqa_graphrag/graphrag/index/verbs/covariates/extract_covariates/extract_covariates.py", line 41, in <module>
    async def extract_covariates(
  File "/HDD/.conda/envs/nld-kgqa-graphrag/lib/python3.10/site-packages/datashaper/engine/verbs/verbs_mapping.py", line 36, in inner
    VerbManager.get().register(verb, override_existing)
  File "/HDD/.conda/envs/nld-kgqa-graphrag/lib/python3.10/site-packages/datashaper/engine/verbs/verbs_mapping.py", line 66, in register
    raise VerbAlreadyRegisteredError(verb.name)
datashaper.errors.VerbAlreadyRegisteredError: Verb extract_covariates already registered.

Steps to reproduce

No response

Expected Behavior

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

No response

natoverse commented 4 weeks ago

Streamlit re-runs the script every time anything on the page changes, so your code may be getting executed too often. We generally use a "loaded" st.session_state variable to do run-once things, and then only execute methods like run_local_search in response to events such as clicked = st.button(...).
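
A minimal sketch of the run-once pattern described above, not a confirmed fix for this traceback. The helper `run_once`, the key `"search_fn"`, and the stand-in `load_search_fn` are hypothetical names introduced for illustration; a plain dict plays the role of st.session_state so the snippet runs without Streamlit installed. In a real app, the import of run_local_search would presumably go inside the run-once initializer so it is not repeated on each rerun.

```python
from typing import Any, Callable, MutableMapping


def run_once(state: MutableMapping[str, Any], key: str, init: Callable[[], Any]) -> Any:
    """Call init() the first time `key` is seen; afterwards return the cached value."""
    if key not in state:
        state[key] = init()
    return state[key]


# Hypothetical stand-in for an expensive setup step that must run only once
# (for example, importing graphrag and building a search function).
calls = []


def load_search_fn() -> Callable[[str], str]:
    calls.append(1)  # record each time setup actually runs
    return lambda query: f"answer to: {query}"


# A plain dict simulates st.session_state across three script reruns.
session_state: dict = {}
for _ in range(3):
    search = run_once(session_state, "search_fn", load_search_fn)

# In a Streamlit app the same helper could be wired up roughly like this
# (assumption: untested against the reporter's setup):
#   import streamlit as st
#   search = run_once(st.session_state, "search_fn", load_search_fn)
#   prompt = st.text_input("Question")
#   if st.button("Search") and prompt:
#       st.write(search(prompt))
```

With this shape, the setup runs on the first script execution only, and the search itself runs only when the button is clicked rather than on every rerun.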