Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License
6.02k stars · 4.12k forks

Documentation Request: Clarify Backend API Calls for Chat Implementation #2059

Open sam-h-long opened 14 hours ago

sam-h-long commented 14 hours ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Similar to #836, I am curious about the backend API calls that are needed to create a chat experience. I started by running the azd command steps in the deploy section of the README.md to copy the code locally and set credentials.

Next, within the azure-search-openai-demo/app/backend/ directory, I created a Python virtual environment:

backend % /Users/sl/.pyenv/versions/3.11.10/bin/python -m venv .venv_azure_search_demo_v1
backend % source .venv_azure_search_demo_v1/bin/activate
backend % pip install -r requirements.txt
backend % pip install gunicorn
backend % pip install ddtrace

Lastly, I ran quart to start the backend locally:

backend % quart --app main:app run --port "50505" --host "localhost" --reload

Any log messages given by the failure

NA

Expected/desired behavior

Calling the /chat endpoint the following way (I am unsure whether my overrides are redundant):

import httpx
import asyncio

async def call_chat_endpoint():
    url = 'http://127.0.0.1:50505/chat'
    data = {
        "context": {"overrides": {"retrieval_mode": "text"}
                    # Include any necessary context here
                    },
        "messages": [
            {"role": "user", "content": "Where is Paris?"}
        ],
        "session_state": {}  # Include any session state if needed
    }

    headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer YOUR_AUTH_TOKEN',  # If you need to include a token
    }

    async with httpx.AsyncClient() as client:
        response = await client.post(url, json=data, headers=headers)
        print(response.json())  # Process the response

# Run the async function
asyncio.run(call_chat_endpoint())

Tracing the application with Datadog, I am fairly sure the format of the response comes from run_without_streaming():

        chat_app_response = {
            "message": {"content": "Paris is in France and known for its rich history.", "role": "assistant"},
            "context": extra_info,
            "session_state": session_state,
        }
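If that is the shape, the assistant's reply can be pulled straight out of the response dict. A tiny sketch (the sample response below is my own illustration of that shape, not actual server output):

```python
def extract_assistant_message(chat_app_response: dict) -> dict:
    """Return the assistant "message" dict from a /chat response.

    Assumes the response shape sketched above:
    {"message": {...}, "context": ..., "session_state": ...}
    """
    return chat_app_response["message"]


# Illustrative response (hand-written, not real server output)
sample_response = {
    "message": {
        "content": "Paris is in France and known for its rich history.",
        "role": "assistant",
    },
    "context": {},
    "session_state": None,
}

assistant_message = extract_assistant_message(sample_response)
print(assistant_message["content"])
```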

If so, my guess is that the frontend implementation simply passes the previous "message" back in the next /chat call along with the next question? For example:

    data = {
        "context": {"overrides": {"retrieval_mode": "text"}},
        "messages": [
            {"role": "user", "content": "Where is Paris?"},
            {"content": "Paris is in France and known for its rich history.", "role": "assistant"},
            {"role": "user", "content": "What history are you referring to?"},
        ],
        "session_state": {}  # Include any session state if needed
    }

In other words, I am trying to understand whether the "context" or "session_state" output from run_without_streaming() should also be passed to the next /chat call. Any further documentation of the APIs needed to simulate a chat experience would be great!
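For what it's worth, here is how I imagine a follow-up request would be assembled from the previous turn: carry the "context" overrides forward, append the returned assistant "message" plus the next user question to "messages", and echo "session_state" back. This is only my guess at the contract, sketched as a pure helper (all names here are mine, not from the repo):

```python
def build_followup_request(prev_request: dict, prev_response: dict, next_question: str) -> dict:
    """Build the next /chat payload from the previous request/response pair.

    Guessed contract: keep "context" unchanged, append the assistant
    "message" and the new user question to "messages", and echo back
    the "session_state" returned by the server.
    """
    return {
        "context": prev_request.get("context", {}),
        "messages": [
            *prev_request.get("messages", []),
            prev_response["message"],
            {"role": "user", "content": next_question},
        ],
        "session_state": prev_response.get("session_state"),
    }


# Illustrative previous turn (hand-written, not real server output)
prev_request = {
    "context": {"overrides": {"retrieval_mode": "text"}},
    "messages": [{"role": "user", "content": "Where is Paris?"}],
    "session_state": None,
}
prev_response = {
    "message": {
        "content": "Paris is in France and known for its rich history.",
        "role": "assistant",
    },
    "context": {},
    "session_state": None,
}

next_request = build_followup_request(
    prev_request, prev_response, "What history are you referring to?"
)
print(next_request["messages"])
```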

OS and Version?

macOS Sonoma

azd version?


azd version 1.10.3 (commit 0595f33fe948ee6df3da492567e3e7943cb9a733)

Versions

NA

Mention any other details that might be useful

While some additional documentation on the prompt sequences would be useful, overall this is a really great repository. Compared to sample-app-aoai-chatGPT, I have found the calls made to the Search Client & Azure OpenAI APIs here much easier to follow. Great work and thank you 🙌 🙌

pamelafox commented 4 hours ago

The backend follows the protocol described here: https://github.com/microsoft/ai-chat-protocol/tree/main/spec#microsoft-ai-chat-protocol-api-specification-version-2024-05-29

Let me know if you have additional questions after reading that.