Closed John-Nuos closed 2 months ago
Please note: I have tried auto tuning using the two following commands, but it didn't work.
python -m graphrag.prompt_tune --root /path/to/project --domain "Microbiology" --method random --limit 10 --language Myanmar --max-tokens 2048 --chunk-size 256 --no-entity-types --output /path/to/output
python -m graphrag.prompt_tune --root /path/to/project --domain "Microbiology" --method random --limit 10 --language Burmese --max-tokens 2048 --chunk-size 256 --no-entity-types --output /path/to/output
It looks like the --language flag doesn't work; try running without it instead.
Hi @John-Nuos, what version were you using? We just released 0.2.0, which includes the --language parameter. If it was 0.1.1, that would've failed.
Thanks for your reply. I will check tomorrow whether I am using the latest version. I just pip installed it into my virtual environment.
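A quick way to check which version is installed in the virtual environment (a minimal sketch; it only reads package metadata and works whether or not graphrag is installed):

```python
# Check the installed graphrag version; the --language flag shipped in 0.2.0.
from importlib import metadata

try:
    version = metadata.version("graphrag")
except metadata.PackageNotFoundError:
    version = None

print(version)
# If this prints 0.1.x, upgrade with:
#   pip install --upgrade graphrag
```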
Hey, you can add the system prompt to your Local Search function, e.g.:
search_engine = LocalSearch(
    llm=llm,
    context_builder=context_builder,
    system_prompt=LOCAL_SEARCH_SYSTEM_PROMPT,
    token_encoder=token_encoder,
    llm_params=llm_params,
    context_builder_params=local_context_params,
    response_type="multiple paragraphs",
)
The default prompt looks like this, but you can simply alter it to produce answers in your preferred language:
`"""Local search system prompts."""
LOCAL_SEARCH_SYSTEM_PROMPT = """ ---Role---
You are a helpful assistant responding to questions about data in the tables provided.
---Goal---
Generate a response of the target length and format that responds to the user's question, summarizing all information in the input data tables appropriate for the response length and format, and incorporating any relevant general knowledge.
If you don't know the answer, just say so. Do not make anything up.
Points supported by data should list their data references as follows:
"This is an example sentence supported by multiple data references [Data: (record ids); (record ids)]."
Do not list more than 5 record ids in a single reference. Instead, list the top 5 most relevant record ids and add "+more" to indicate that there are more.
For example:
"Person X is the owner of Company Y and subject to many allegations of wrongdoing [Data: Sources (15, 16), Reports (1), Entities (5, 7); Relationships (23); Claims (2, 7, 34, 46, 64, +more)]."
where 15, 16, 1, 5, 7, 23, 2, 7, 34, 46, and 64 represent the id (not the index) of the relevant data record.
Do not include information where the supporting evidence for it is not provided.
---Target response length and format---
{response_type}
---Data tables---
{context_data}
---Goal---
Generate a response of the target length and format that responds to the user's question, summarizing all information in the input data tables appropriate for the response length and format, and incorporating any relevant general knowledge.
If you don't know the answer, just say so. Do not make anything up.
Points supported by data should list their data references as follows:
"This is an example sentence supported by multiple data references [Data: (record ids); (record ids)]."
Do not list more than 5 record ids in a single reference. Instead, list the top 5 most relevant record ids and add "+more" to indicate that there are more.
For example:
"Person X is the owner of Company Y and subject to many allegations of wrongdoing [Data: Sources (15, 16), Reports (1), Entities (5, 7); Relationships (23); Claims (2, 7, 34, 46, 64, +more)]."
where 15, 16, 1, 5, 7, 23, 2, 7, 34, 46, and 64 represent the id (not the index) of the relevant data record.
Do not include information where the supporting evidence for it is not provided.
---Target response length and format---
{response_type}
Add sections and commentary to the response as appropriate for the length and format. Style the response in markdown. """`
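Building on the snippet above, one way to get answers in a specific language (a minimal sketch; the short string below merely stands in for the full default prompt shown here) is to append a hard language instruction and pass the result as `system_prompt`:

```python
# Sketch: force the answer language by extending the default system prompt.
# LOCAL_SEARCH_SYSTEM_PROMPT is an abbreviated stand-in for the string that
# ships with graphrag; in practice you would copy or import the real one.
LOCAL_SEARCH_SYSTEM_PROMPT = (
    "---Role---\n"
    "You are a helpful assistant responding to questions about data in the "
    "tables provided.\n"
    "...\n"
    "Style the response in markdown."
)

# Append an explicit constraint so the model does not fall back to English.
BURMESE_SYSTEM_PROMPT = (
    LOCAL_SEARCH_SYSTEM_PROMPT
    + "\n\nAlways write the final response in Burmese (Myanmar language), "
    "regardless of the language of the question or of the source data."
)

# Then pass it in, e.g.:
# search_engine = LocalSearch(..., system_prompt=BURMESE_SYSTEM_PROMPT, ...)
```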
Does GraphRAG support adding a system prompt during the graphrag.index process? When I import documents from different companies, their information gets mixed up. For example, when querying employees or related information for Company A, it returns information about Company B. So I am wondering whether it's possible to add some metadata through a system prompt.
There are many different LLM calls in the indexing process, so it would take quite a bit of work to make an injectable system prompt for each.
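As a workaround, one option is to keep companies from mixing at the data level rather than the prompt level: write each company's documents into its own GraphRAG root and index them separately. This is a sketch under assumptions; the folder layout and the `(company, filename, text)` shape are illustrative, not a graphrag API.

```python
import os

def split_by_company(docs, out_root):
    """docs: iterable of (company, filename, text) triples.

    Writes one GraphRAG input folder per company, e.g.
    out_root/company_a/input/report.txt, so each company's graph is
    built and queried in isolation and entities never mix.
    """
    for company, name, text in docs:
        input_dir = os.path.join(out_root, company, "input")
        os.makedirs(input_dir, exist_ok=True)
        with open(os.path.join(input_dir, name), "w", encoding="utf-8") as f:
            f.write(text)

# Each root is then indexed on its own:
#   python -m graphrag.index --root out_root/company_a
```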
I have the same request. It would be nice to be able to modify LOCAL_SEARCH_SYSTEM_PROMPT, MAP_SYSTEM_PROMPT, and REDUCE_SYSTEM_PROMPT the same way I can modify the prompts in the prompts folder that appears after the --init command. This is necessary because my main prompts are not in English, and since the system prompts described above are in English, the LLM sometimes responds in English, which I don't need.
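Until those prompt strings are exposed as editable files, a possible stopgap (a sketch; check whether your installed version's search classes accept these prompt arguments, the same way LocalSearch accepts `system_prompt` above) is to pass translated copies of the strings directly, keeping the template placeholders intact:

```python
import string

# Illustrative translated prompt; replace the angle-bracket text with your
# own language. The {context_data} and {response_type} fields must survive
# translation because graphrag fills them in at query time.
MAP_SYSTEM_PROMPT_TRANSLATED = (
    "---Role---\n"
    "<your translated role text>\n"
    "---Data tables---\n"
    "{context_data}\n"
    "---Target response length and format---\n"
    "{response_type}\n"
)

def placeholders_kept(original: str, translated: str) -> bool:
    """True if every {field} in the original prompt also appears in the
    translated one, so str.format() will not break at query time."""
    def fields(s):
        return {f for _, f, _, _ in string.Formatter().parse(s) if f}
    return fields(original) <= fields(translated)

# e.g. GlobalSearch(..., map_system_prompt=MAP_SYSTEM_PROMPT_TRANSLATED, ...)
```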
This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.
This issue has been closed after being marked as stale for five days. Please reopen if needed.
Is there an existing issue for this?
Describe the issue
Where can I edit the overall system prompt? I want to customize GraphRAG's final output. For example, I want the final output to be only in the Burmese language.
Steps to reproduce
No response
GraphRAG Config Used
Logs and screenshots
No response
Additional Information