Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

Reproducible output support to fix response inaccuracy and inconsistency in data retrieval #1344

Closed nits-aidev closed 7 months ago

nits-aidev commented 8 months ago

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Issue a query to retrieve specific data from a source (e.g., fund details or any other data set).
  2. Receive a "data not available" message when the query fails to retrieve the expected data.
  3. Retry the same query after a short period.
  4. Occasionally observe a successful retrieval, indicating an inconsistency in data access.
  5. Successfully retrieve the data only after multiple attempts, without changing the query.

AND

  1. Request information about a specific fund, including figures or personnel involved.
  2. Receive data that does not match the official documentation of the fund.
  3. At other times, issue a similar query and receive accurate information that aligns with the fund's official details.

Expected/desired behavior

  • Consistent and reliable retrieval of data upon query submission, without the need for multiple attempts to get an answer.
  • Accurate responses that do not misrepresent the fund's details.

Accuracy issues with samples screnshots .docx

Mention any other details that might be useful

I found this solution on the Microsoft website: using a seed should help reproduce the same response every time, but I am not sure whether it will fix these issues or not. https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reproducible-output?tabs=pyton



pamelafox commented 8 months ago

Can you clarify the "data not available" situation? Is that coming from the Azure AI Search results? You generally shouldn't see much variation from Azure AI Search: the same query should yield the same results each time. However, you can check the generated keyword query in the "Thought process" tab to see whether it changed between attempts.

Regardless, we agree that adding seed is a good idea to help with reproducibility, and hope to do that.
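As a rough illustration of what passing a seed looks like, here is a minimal sketch. It assumes `client` behaves like an `openai.AzureOpenAI` instance from the `openai` Python package (version 1.x); the helper name `ask_with_seed` and the default seed value are hypothetical and are not code from this repository. Per the Microsoft documentation linked above, determinism with a seed is best effort, and the returned `system_fingerprint` can be compared across calls to spot backend changes.

```python
def ask_with_seed(client, deployment, messages, seed=42):
    """Send one chat completion request with a fixed seed.

    `client` is assumed to expose `chat.completions.create(...)` the way
    `openai.AzureOpenAI` does; `deployment` is your model deployment name.
    """
    response = client.chat.completions.create(
        model=deployment,
        messages=messages,
        seed=seed,        # same seed + same parameters -> best-effort repeatability
        temperature=0,    # assumption: low temperature to reduce sampling variation
    )
    # Return the answer text together with the fingerprint so callers can
    # log both and compare them between runs.
    return response.choices[0].message.content, response.system_fingerprint
```

If two calls with the same seed return different answers, comparing the logged fingerprints is one way to check whether the backend configuration changed in between.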

nits-aidev commented 8 months ago

Hi Pamela, I have tried the seed implementation with GPT-3.5 v1106, and it helps with reproducibility.

To provide a clearer picture of the issue at hand, let me elaborate. We have uploaded 30 PDFs into the system. When querying the system with questions pertaining to individual PDFs, it accurately provides the correct answers about 95% of the time. However, the challenge arises when we pose questions that require aggregating data from multiple PDFs. In such scenarios, we occasionally encounter instances where the system reports that data for one or two points is unavailable.

For instance, assume we have three PDFs containing information on the performance of X, Y, and Z funds. When we inquire about the total assets of these funds individually, the system accurately furnishes the required information. But when we pose a question that necessitates a comparative analysis or aggregation of data from these PDFs (such as asking for the total assets of X, Y, and Z funds collectively), the response might look something like this:

X Fund: 2.5 Million
Y Fund: 4.7 Million
Z Fund: Data not available

This discrepancy occurs specifically during requests for combined data from multiple sources, leading to partial or incomplete information retrieval.

Here is the sample question asked and the thought process:

### Q- Compare fund performance of Vanguard U.S Growth Fund, Franklin Growth Fund and BlackRock Income Fund?

A- The fund performance for the mentioned funds is as follows:

### Thought Process

**Original user query**

Compare fund performance of Vanguard U.S Growth Fund, Franklin Growth Fund and BlackRock Income Fund?

**Generated search query**

Vanguard U.S Growth Fund, Franklin Growth Fund, BlackRock Income Fund performance

use_semantic_captions: false
has_vector: true

nits-aidev commented 8 months ago

Another problem is that when OpenAI's services attempt to send a request to AI Search, it sometimes fails. To mitigate the impact of this issue, can we implement retry logic in your code (maybe up to 3 times)? I hope this can help fix the "data not available" issue.
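For reference, the kind of retry being asked about could be sketched as follows. This is a generic helper with exponential backoff, not code from this repository; the name `call_with_retry` and its parameters are hypothetical, and in practice you would narrow `retry_on` to the specific transient exception types you actually observe.

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=1.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call `func()` and retry on failure, up to `max_attempts` total tries.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original exception
            sleep(base_delay * 2 ** (attempt - 1))
```

A search call could then be wrapped as `call_with_retry(lambda: run_search(query))`, where `run_search` stands in for whatever function issues the request.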

pamelafox commented 8 months ago

In the thought process, do you see any search results for Z Fund? If you don't see any search results for Z Fund, then you might try increasing the number of search results to see if that improves retrieval.

If you haven't already, I put together a doc here about improving answer quality: https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/customization.md#improving-answer-quality

That can help us identify where the actual issue is, whether it's at the search stage or the chat completion stage.
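One way to make that check concrete is sketched below. It assumes you have copied the text of the retrieved search results out of the "Thought process" tab into a list of strings; the function name `missing_entities` is hypothetical, and the plain substring match is a deliberate simplification.

```python
def missing_entities(entities, retrieved_chunks):
    """Return the entities that appear in none of the retrieved chunks.

    Matching is a case-insensitive substring test, so it will miss
    abbreviations or alternate spellings of a fund name.
    """
    lowered = [chunk.lower() for chunk in retrieved_chunks]
    return [
        entity for entity in entities
        if not any(entity.lower() in chunk for chunk in lowered)
    ]


funds = ["X Fund", "Y Fund", "Z Fund"]
chunks = [
    "X Fund total assets: 2.5 Million",
    "Performance summary for the y fund: total assets 4.7 Million",
]
print(missing_entities(funds, chunks))  # ['Z Fund']
```

If a fund is missing from the retrieved chunks, the problem is at the search stage and raising the number of search results is worth trying. If every fund is present but the answer still says "Data not available", the problem is more likely at the chat completion stage.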

The OpenAI SDK already has built-in retry logic. I haven't seen a situation where the AI Search calls require a retry. Can you describe exactly what error you're seeing? Please share the traceback.