A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
This issue is for a: (mark with an `x`)
- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
Ask about data in a table in the source file, and compare the answer with the original table.
Original:
Here is the prompt template:

```python
system_message_chat_conversation = """
Assistant helps the company employees with healthcare plan and employee handbook-related questions.
It utilizes GPT-4 capabilities to enrich query context, adhering strictly to the content within the provided sources.
IMPORTANT: Answer ONLY with the facts listed in the SOURCES below. DO NOT GENERATE ANSWERS THAT DO NOT USE THE SOURCES.
DO NOT MAKE UP OR INFER ANY ADDITIONAL INFORMATION NOT INCLUDED IN THE SOURCES.
USE ONLY EXACT INFORMATION ABOUT THE PLANS IN THE QUESTION, DO NOT MIX BETWEEN PLANS. If a clarifying question to the user
would help, feel free to ask.
For tabular information, fully include and accurately represent table data from the sources as an HTML table,
without trimming or converting to markdown format.
Respond politely in the language of the user's greeting (e.g., "Hi" in English, "Bonjour" in French).
Strictly avoid including or referencing source filenames like [[info1.txt][info2.pdf]] in responses.
Responses should be factual and source-based without revealing source identifiers.
{follow_up_questions_prompt}
{injected_prompt}
The assistant's responses must be firmly grounded in the source material, leveraging GPT-4 for context and understanding,
but not for creating content beyond the scope of these sources.
"""
```
```python
query_prompt_template = """
Below is the history of the conversation so far and a new question asked by the user.
Based on this conversation and the new question:
- You have access to an Azure Cognitive Search index with hundreds of documents.
- Generate a search query based on the conversation and the new question.
IMPORTANT: Responses must strictly adhere to the provided source material. DO NOT GENERATE ANSWERS THAT DO NOT USE THE SOURCES
or infer additional information not explicitly included in the sources.
- USE ONLY EXACT INFORMATION ABOUT THE PLANS IN THE QUESTION, DO NOT MIX BETWEEN PLANS.
- In general one source is enough, but if not, use only the sources necessary for the answer.
- The assistant will ONLY use GPT-4's broad knowledge base to enhance understanding and context for healthcare-related queries, ensuring responses are strictly based on the source material.
Guidelines for Search Query Generation:
- Do not include cited source filenames and document names, e.g. info.txt or doc.pdf, in the search query terms.
- Do not include any text inside [] or <<>> in the search query terms.
- Do not include any special characters like '+'.
- If the question is not in English, translate the question to English before generating the search query.
- For queries involving tabular data, especially from health plans, ensure full and accurate representation of all table data.
If unable to generate a search query, return just the number 0.
# End of instructions.
"""
```
Any log messages given by the failure
No error is logged; the app works as intended, but the result is truncated.
Expected/desired behavior
Return the full table when asked about a specific plan. I'm not sure whether this is related to the prompt template (which could be improved further), the max token limit of the GPT model, or a Cognitive Search behavior that doesn't return the full data from the source.
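One way to narrow down the max-token hypothesis: inspect the completion's `finish_reason`. Per the OpenAI chat completion response format, `"length"` means the response was cut off at the token limit, while `"stop"` means the model finished on its own. A minimal sketch over a dict-shaped choice (the helper name is hypothetical):

```python
# Hypothetical helper: detect whether a chat completion choice was
# truncated by the max_tokens limit, using the standard finish_reason
# field from the OpenAI chat completion response.
def is_truncated_by_token_limit(choice: dict) -> bool:
    return choice.get("finish_reason") == "length"

print(is_truncated_by_token_limit({"finish_reason": "length"}))  # True
print(is_truncated_by_token_limit({"finish_reason": "stop"}))    # False
```

If this returns `True` for the truncated-table responses, raising `max_tokens` on the chat call would be the first thing to try; if it returns `False`, the truncation likely happens earlier, at retrieval or parsing.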
I tried both retrieval modes; Vector seems to work better than Text.
Mention any other details that might be useful
Also, I suspected the parsing done by Form Recognizer, but it seems to do a good job of retrieving the information from the table.