microsoft / PubSec-Info-Assistant

Information Assistant, built with Azure OpenAI Service, Industry Accelerator
MIT License
266 stars 522 forks source link

Information Assistant web app (rel 1.0) responses include unrelated content in thought process #650

Open Miguel-Ramal opened 2 months ago

Miguel-Ramal commented 2 months ago

This bug report is similar to a previously closed (unresolved) report: "Info Assistant answers questions based on external knowledge #399"

Regardless of the reference files in use, when using the information assistant web app (Release 1.0)... after starting chat with the bot... there are some common content that is not related to our web app settings configuration, but instead seems like something left during install somewhere. The content (issue) can be seen after expanding the "Thought process" of a bot response.

The section highlighted below, the default first assistant and user content regarding "promote energy conservation", is the 'unrelated' bit not in context of any reference file, but instead seems like default sample within bot... (pre-configured to the web app ??)

is there a way we can control/adjust this default system / assistant /user content. I know we can adjust the user and system persona, but it only adds the text into to the default prompt.

In addition, can we adjust the 3072 token length default?

Conversations:

{'role': 'system', 'content': "You are an Azure OpenAI Completion system. Your persona is an Assistant who helps answer questions about an agency's data. Please provide a thorough answer. This means that your answer should be no more than 3072 tokens long.\n User persona is analyst Answer ONLY with the facts listed in the list of sources below in English with citations.If there isn't enough information below, say you don't know and do not give citations. For tabular information return it as an html table. Do not return markdown format.\n Your goal is to provide answers based on the facts listed below in the provided source documents. Avoid making assumptions,generating speculative or generalized information or adding personal opinions.\n \n \n Each source has a file name followed by a pipe character and the actual information.Use square brackets to reference the source, e.g. [info1.txt]. Do not combine sources, list each source separately, e.g. [info1.txt][info2.pdf].\n Never cite the source content using the examples provided in this paragraph that start with info.\n \n Here is how you should answer every question:\n \n -Look for information in the source documents to answer the question in English.\n -If the source document has an answer, please respond with citation.You must include a citation to each document referenced only once when you find answer in source documents. \n -If you cannot find answer in below sources, respond with I am not sure.Do not provide personal opinions or assumptions and do not include citations.\n \n \n \n \n "}

{'role': 'assistant', 'content': 'Several steps are being taken to promote energy conservation including reducing energy consumption, increasing energy efficiency, and increasing the use of renewable energy sources.Citations[File0]'}

{'role': 'user', 'content': 'What steps are being taken to promote energy conservation?'}

{'role': 'assistant', 'content': 'user is looking for information in source documents. Do not provide answers that are not in the source documents'}

{'role': 'user', 'content': 'I am looking for information in source documents'}

image

ArpitaisAn0maly commented 2 months ago

Hello Miguel-Ramal you can change resaponse length-3072 in approach.py. Energy examples are few shot examples used by meta prompt which you can change in chatreadretrieveread.py. Both are configurable.

dayland commented 1 month ago

This issue is marked for closure due to inactivity for 2 weeks. It will be closed in 5 days.

Miguel-Ramal commented 1 month ago

@dayland @ArpitaisAn0maly Thank you for the information on where to look into those .py config files. The bug reported however is that, after having the information assistant rel1.0 deployed (as per instructions)... without any additional customisations... there is a default content that we cannot remove from the chat, and it affects the responses in the bot.

here are the statements found that cannot be removed and have no relationship with the questions asked through the interface: " {'role': 'assistant', 'content': 'Several steps are being taken to promote energy conservation including reducing energy consumption, increasing energy efficiency, and increasing the use of renewable energy sources.Citations[File0]'}

{'role': 'user', 'content': 'What steps are being taken to promote energy conservation?'} "

is there anywhere we can remove that content post-deployment?? or what is the best way to go around it?

dayland commented 3 weeks ago

This cannot be removed post deployment. You would need to alter the prompting in the python code and then re-deploy. This technique of few-shot prompting is valuable to help the model respond the way you want with examples. We recommend you simply update these to example questions that align with your use case rather than simply delete. However, you can remove this if you feel it is necessary.