Azure-Samples / chat-with-your-data-solution-accelerator

A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices.
https://azure.microsoft.com/products/search
MIT License
824 stars, 426 forks

Not able to implement hybrid search (semantic and vector). #517

Closed pranav-saji closed 5 months ago

pranav-saji commented 7 months ago

Describe the bug

Not able to implement hybrid search (semantic and vector). Only vector search is present. I believe that because of this, the chat is not fully functional. The chatbot is not giving precise answers, which I believe is due to the lack of hybrid search.

Screenshots

Screen Shot 2024-03-22 at 12 15 14 AM
cecheta commented 7 months ago

Hi @PFA23SCM89S, thank you for raising this bug. Could you please provide us with the steps to reproduce it, and tell us what you expect to see versus what actually happens? Thank you.

cherifbenham commented 7 months ago

The question answer tool is not dynamic regarding search type and top k.

pranav-saji commented 7 months ago

@cecheta To reproduce this error, just upload any document in the admin webpage and ask specific questions about the document, preferably with a longer prompt. I'm not sure how to fix this. @cherifbenham Any idea what to change in the code?

cherifbenham commented 7 months ago

@PFA23SCM89S In your case, if you deployed from the devcontainer without changing the code, then your search should already be hybrid, meaning text + vector, and it returns the 4 closest docs (or fewer).

The bug I am trying to fix relates to the implementation of semantic + hybrid. I found a way to do it using a semantic config:

semantic_config = SemanticConfiguration(name="semantic_config", prioritized_fields=SemanticPrioritizedFields(title_field=SemanticField(field_name="title"), content_fields=[SemanticField(field_name="content")], keywords_fields=[SemanticField(field_name="title")]))

and then

sources = self.vector_store.similarity_search(query=question, k=10, search_type = "semantic_hybrid", semanticConfiguration="semantic_config")

I am trying to redeploy like this and will let you know how it goes.
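
For reference, a semantic hybrid query sends the keyword text, a query vector, and a semantic configuration name in a single request. The following is a minimal sketch of such a request body, built with the standard library only. It assumes the field names of the Azure AI Search REST API (version 2023-11-01); the helper name `build_semantic_hybrid_request`, the vector field name `content_vector`, and the three-element example embedding are hypothetical and not taken from the accelerator's code.

```python
import json


def build_semantic_hybrid_request(question, embedding,
                                  semantic_config="semantic_config",
                                  vector_field="content_vector", k=10):
    """Build a search request body that combines keyword search,
    vector search, and semantic reranking (hypothetical helper)."""
    return {
        # Keyword (full-text) part of the hybrid query.
        "search": question,
        # Vector part of the hybrid query.
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(embedding),
                "fields": vector_field,  # assumed vector field name
                "k": k,
            }
        ],
        # Semantic reranking applied on top of the hybrid results.
        "queryType": "semantic",
        "semanticConfiguration": semantic_config,
        "top": k,
    }


# Example embedding is a placeholder; real embeddings are much longer.
body = build_semantic_hybrid_request("What is the refund policy?", [0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

Leaving out `queryType` and `semanticConfiguration` would give a plain text + vector hybrid query, which is the difference between the two modes discussed in this thread.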

pranav-saji commented 7 months ago

@cherifbenham Thanks for the update. I tried to update my OpenAI model from GPT-3.5 Turbo to GPT-4. The responses I get are still very poor. I'm not sure whether the model was updated properly. Can you help me with how to update the model properly?

cherifbenham commented 7 months ago

You can try to update the model version from gpt35 to gpt4 in the environment variables of the function apps and in the environment variables of the web app. In your case, you'll have more grounded and more developed responses.

You can also modify the system prompt in the admin configuration tab so that it develops its answers more. You can also increase the max output tokens from 1000 to 2000. Finally, you can increase your chunk size to get more context from relevant docs. I believe you should also investigate the quality of the chunks in your index via the REST API or the Python SDK.
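
One simple way to start investigating chunk quality is to look at chunk lengths once the documents have been fetched from the index. The sketch below is only an illustration: it assumes each chunk is a dict with `id` and `content` keys, and the helper name `chunk_length_report`, the 200-character threshold, and the sample data are all hypothetical.

```python
from statistics import mean


def chunk_length_report(chunks, short_threshold=200):
    """Summarise chunk lengths and list chunks that look too short
    to carry useful context (hypothetical helper)."""
    lengths = [len(chunk.get("content", "")) for chunk in chunks]
    short_ids = [
        chunk.get("id")
        for chunk in chunks
        if len(chunk.get("content", "")) < short_threshold
    ]
    return {
        "count": len(chunks),
        "mean_chars": mean(lengths) if lengths else 0,
        "min_chars": min(lengths, default=0),
        "max_chars": max(lengths, default=0),
        "short_ids": short_ids,
    }


# Placeholder data standing in for documents read from the index.
sample_chunks = [
    {"id": "a", "content": "x" * 500},
    {"id": "b", "content": "short"},
    {"id": "c", "content": "y" * 300},
]
report = chunk_length_report(sample_chunks)
print(report)
```

Many very short chunks can be a sign that the chunk size is too small for the model to receive enough context per retrieved document.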

Hope this helps

pranav-saji commented 7 months ago

@cherifbenham I had actually already updated the model version like this. Maybe the poor responses are because of semantic + hybrid. Any update on your deployment?

pranav-saji commented 7 months ago

@cherifbenham Can you please mention exactly which file you updated? Can you provide a screenshot of this if possible?

cecheta commented 6 months ago

Hello @PFA23SCM89S, for the /api/conversation/custom endpoint, the application currently supports either vectorSimpleHybrid or vectorSemanticHybrid search, depending on whether a semantic configuration has been supplied. For the /api/conversation/azure_byod endpoint, the application currently supports either simple or semantic.

May I know which endpoint you are currently using? If you are using the custom endpoint, you should be able to use semantic + vector search by configuring the AZURE_SEARCH_USE_SEMANTIC_SEARCH and AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG environment variables. If you are using the azure_byod endpoint, we plan to add more search configurations in https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/issues/295
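
The selection described above for the custom endpoint can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's actual code: the function name `select_query_type` is hypothetical, and so is the assumption that the flag is read as the string "true". Only the two environment variable names and the two search mode names come from this thread.

```python
def select_query_type(env):
    """Pick the search mode from environment-style settings
    (hypothetical helper; parsing rules are assumptions)."""
    use_semantic = (
        env.get("AZURE_SEARCH_USE_SEMANTIC_SEARCH", "false").strip().lower() == "true"
    )
    semantic_config = env.get("AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG", "").strip()
    # Semantic + vector only when the flag is on AND a config name is supplied.
    if use_semantic and semantic_config:
        return "vectorSemanticHybrid"
    return "vectorSimpleHybrid"


settings = {
    "AZURE_SEARCH_USE_SEMANTIC_SEARCH": "true",
    "AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG": "semantic_config",
}
mode = select_query_type(settings)
print(mode)
```

In a deployment, `os.environ` could be passed in place of the `settings` dict.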

ross-p-smith commented 6 months ago

@PFA23SCM89S - Do you have any response to the comment above? We will have to close this bug soon.

superhindupur commented 5 months ago

@gaurarpit and I investigated this and verified that hybrid search is being used for the /api/conversation/custom endpoint. As there is no further response from the user, we will be closing this issue.