Open zanderjiang opened 3 weeks ago
Totally agree. create_final_covariates.parquet is no longer available. The documentation should have been updated to reflect the change.
In the documentation at https://microsoft.github.io/graphrag/posts/config/env_vars/, you can see that GRAPHRAG_CLAIM_EXTRACTION_ENABLED defaults to False. I believe this is why the create_final_covariates.parquet file is not generated with the default settings.
I'm curious why this setting defaults to False. Is it to avoid too many LLM calls, or is there another reason?
I think there are some mistakes in the documentation. I ran local search through the CLI command and it works perfectly. Inspecting further, I found that you can simply ignore the covariates by setting them to an empty string, as in their code at https://github.com/microsoft/graphrag/blob/main/graphrag/query/cli.py
Covariates are optional because they typically take a lot of domain-specific prompt tuning. (We also call these "claims" since they are claimed statements of fact.) If they are not enabled in the config, the output parquet is not created, and local search should ignore the fact that it is missing.
This concerns the local search notebook in the documentation: https://microsoft.github.io/graphrag/posts/query/notebooks/local_search_nb/
The standard indexing pipeline no longer creates a create_final_covariates.parquet file.
The covariates section of the local search context builder needs to be set to None.
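For anyone adapting the notebook, one way to handle the missing file is to check for it first and fall back to None. This is only a minimal sketch using the standard library: the helper name find_covariates_file and the output_dir argument are hypothetical, and only the file name comes from this thread.

```python
from pathlib import Path
from typing import Optional

# Table name as mentioned in this thread; adjust if your pipeline names it differently.
COVARIATE_TABLE = "create_final_covariates"


def find_covariates_file(output_dir: str, table: str = COVARIATE_TABLE) -> Optional[Path]:
    """Return the covariates parquet path, or None if indexing did not create it.

    Hypothetical helper, not part of graphrag.
    """
    candidate = Path(output_dir) / f"{table}.parquet"
    return candidate if candidate.is_file() else None
```

If the helper returns None (the default case, since claim extraction is disabled), pass None for the covariates of the local search context builder. If it returns a path, the file can be read as in the notebook, for example with pandas.read_parquet.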