Closed nits-aidev closed 7 months ago
Can you clarify the "data not available" situation? Is that coming from Azure AI search results? You generally shouldn't see much variation from Azure AI search- the same query should yield the same results each time. However, you can check to see what the generated keyword query is in the "thought process" tab, to see if that changed.
Regardless, we agree that adding seed is a good idea to help with reproducibility, and hope to do that.
Hi Pamela, I have tried the Seed implementation With GPT-3.5 v1106, This help with reproducibility.
To provide a clearer picture of the issue at hand, let me elaborate. We have uploaded 30 PDFs into the system. When querying the system with questions pertaining to individual PDFs, it accurately provides the correct answers about 95% of the time. However, the challenge arises when we pose questions that require aggregating data from multiple PDFs. In such scenarios, we occasionally encounter instances where the system reports that data for one or two points is unavailable.
For instance, assume we have three PDFs containing information on the performance of X, Y, and Z funds. When we inquire about the total assets of these funds individually, the system accurately furnishes the required information. But, when we pose a question that necessitates a comparative analysis or aggregation of data from these PDFs—such as asking for the total assets of X, Y, and Z funds collectively—the response might look something like this: X Fund: 2.5 Million Y Fund: 4.7 Million Z Fund: Data not available
This discrepancy occurs specifically during requests for combined data from multiple sources, leading to partial or incomplete information retrieval.
Here is the Sample Question Asked and Though Process ### Q- Compare fund performance of Vanguard U.S Growth Fund, Franklin Growth Fund and BlackRock Income Fund? A- The fund performance for the mentioned funds is as follows:
### Thought Process
**Original user query**
Compare fund performance of Vanguard U.S Growth Fund, Franklin Growth Fund and BlackRock Income Fund?
Generated search query Vanguard U.S Growth Fund, Franklin Growth Fund, BlackRock Income Fund performance use_semantic_captions: false has_vector: true
Also problem is when OpenAI's services attempt to send request to AI Search it fails sometimes. In order to mitigate the impact of this issue, Can we implement a retry logic in your code (may be for 3 times)?? I hope this can help to fix data not available issue.
In the thought process, do you see any search results for Z Fund? If you don't see any search results for Z Fund, then you might try increasing the number of search results to see if that improves retrieval.
If you haven't already, I put together a doc here about improving answer quality: https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/customization.md#improving-answer-quality
That can help us identify where the actual is, whether its at search stage or chatcompletion stage.
The OpenAI SDK already has built-in logic. I haven't seen a situation where the AI search calls require a retry, can you describe exactly what error you're seeing? Please share the traceback.
This issue is for a: (mark with an
x
)Minimal steps to reproduce
AND
Expected/desired behavior
Accuracy issues with samples screnshots .docx
Mention any other details that might be useful
find this solution on Microsoft website using seed will help to reproduce same response every time but not sure will this fix these issues of not. https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reproducible-output?tabs=pyton