weaviate / Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
BSD 3-Clause "New" or "Revised" License

Verba generates an incorrect Azure URL. After correction, it cannot pass authorization #301

Open ayurk opened 1 week ago

ayurk commented 1 week ago

Description

Verba generates an incorrect URL when sending requests to Azure, resulting in a 404 Not Found error. When the URL is manually corrected to the appropriate format, the response changes to a 401 Unauthorized error. Despite having the correct URL and authorization data in the headers, Verba is unable to pass authorization when connecting to Azure OpenAI.

The default URL generated by Verba is https://resource_name.openai.azure.com/chat/completions; the response is a 404 Not Found error.

After manually correcting the URL to the appropriate format, https://resource_name.openai.azure.com/openai/deployments/openai_model/chat/completions?api-version=2023-05-15, the response changes to a 401 Unauthorized error.
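For reference, Azure OpenAI's key-based authentication uses an api-key header rather than Authorization: Bearer, which may be why the corrected URL still returns 401. A minimal request sketch (resource_name, openai_model, and api_key are the placeholders from the .env below):

import httpx

# Placeholders: substitute your real resource name, deployment name, and key.
resource_name = "resource_name"
deployment = "openai_model"
api_key = "api_key"

url = (
    f"https://{resource_name}.openai.azure.com/openai/deployments/"
    f"{deployment}/chat/completions?api-version=2023-05-15"
)
# Azure expects the key in an "api-key" header, not "Authorization: Bearer ...".
headers = {"Content-Type": "application/json", "api-key": api_key}
data = {"messages": [{"role": "user", "content": "ping"}]}

response = httpx.post(url, json=data, headers=headers)
print(response.status_code, response.text)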

.env file:

OPENAI_API_TYPE='azure'
OPENAI_API_KEY='api_key'
OPENAI_BASE_URL='https://resource_name.openai.azure.com'
AZURE_OPENAI_RESOURCE_NAME='resource_name'
AZURE_OPENAI_EMBEDDING_MODEL='text-embedding-3-large'
OPENAI_MODEL='openai_model'
HuggingFace_Api_Key='api_key_hf'
WEAVIATE_URL_VERBA='http://localhost:8080/'

Below is the manually changed function in components -> generation -> OpenAIGenerator.py with a hardcoded model and URL (the changed lines are marked with # NOTE comments):


async def generate_stream(
    self,
    config: dict,
    query: str,
    context: str,
    conversation: list[dict] = [],
):
    system_message = config.get("System Message").value
    model = config.get("Model", {"value": "gpt-3.5-turbo"}).value
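    # NOTE: changed line - override the configured model with the Azure deployment name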
    model = 'openai_model'
    openai_key = get_environment(
        config, "API Key", "OPENAI_API_KEY", "No OpenAI API Key found"
    )
    openai_url = get_environment(
        config, "URL", "OPENAI_BASE_URL", "https://api.openai.com/v1"
    )

    messages = self.prepare_messages(query, context, conversation, system_message)

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {openai_key}",
    }
    data = {
        "messages": messages,
        "model": model,
        "stream": True,
    }

    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
             f"https://resource_name.openai.azure.com/openai/deployments/openai_model/chat/completions?api-version=2023-05-15", 
            json=data,
            headers=headers,
            timeout=None,
        ) as response:

In the httpx source below, I have a breakpoint on the response = await self.send(...) line, where I receive the responses, 404 or 401. In both cases (the default URL, and the manually prepared URL that returns the 401), the auth parameter is empty, even though the authorization data is already in the headers ("Bearer {openai_key}"). It seems to me that the auth parameter should not be empty (see also the note after the snippet).

httpx -> _client.py -> AsyncClient lines 1577-1626


async def stream(
    self,
    method: str,
    url: URLTypes,
    *,
    content: RequestContent | None = None,
    data: RequestData | None = None,
    files: RequestFiles | None = None,
    json: typing.Any | None = None,
    params: QueryParamTypes | None = None,
    headers: HeaderTypes | None = None,
    cookies: CookieTypes | None = None,
    auth: AuthTypes | UseClientDefault = USE_CLIENT_DEFAULT,
    follow_redirects: bool | UseClientDefault = USE_CLIENT_DEFAULT,
    timeout: TimeoutTypes | UseClientDefault = USE_CLIENT_DEFAULT,
    extensions: RequestExtensions | None = None,
) -> typing.AsyncIterator[Response]:
    """
    Alternative to `httpx.request()` that streams the response body
    instead of loading it into memory at once.

    **Parameters**: See `httpx.request`.

    See also: [Streaming Responses][0]

    [0]: /quickstart#streaming-responses
    """
    request = self.build_request(
        method=method,
        url=url,
        content=content,
        data=data,
        files=files,
        json=json,
        params=params,
        headers=headers,
        cookies=cookies,
        timeout=timeout,
        extensions=extensions,
    )
    response = await self.send(
        request=request,
        auth=auth,
        follow_redirects=follow_redirects,
        stream=True,
    ) 
    try:
        yield response
    finally:
        await response.aclose()
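For what it's worth, auth staying at USE_CLIENT_DEFAULT appears to be normal httpx behavior when the Authorization header is supplied directly via headers=; the two approaches below should be equivalent for bearer tokens (a sketch, not code from Verba):

import httpx

token = "api_key"  # placeholder

# Option 1: pass the header explicitly (what Verba does). httpx leaves the
# `auth` parameter empty in this case and sends the header as-is.
r1 = httpx.get("https://example.com", headers={"Authorization": f"Bearer {token}"})

# Option 2: a custom Auth class, which goes through the `auth` parameter instead.
class BearerAuth(httpx.Auth):
    def __init__(self, token: str):
        self.token = token

    def auth_flow(self, request: httpx.Request):
        request.headers["Authorization"] = f"Bearer {self.token}"
        yield request

r2 = httpx.get("https://example.com", auth=BearerAuth(token))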

Installation

Installed via pip: goldenverba v2.0.0

Weaviate Deployment

Configuration

Reader: Default
Chunker: Token
Embedder: sentence-transformers/all-MiniLM-L6-v2
Retriever:
Generator:

Steps to Reproduce

  1. Use the default URL for requests: https://resource_name.openai.azure.com/chat/completions.
  2. Observe the 404 Not Found error.
  3. Manually change the URL to the appropriate Azure format: https://resource_name.openai.azure.com/openai/deployments/openai_model/chat/completions?api-version=2023-05-15.
  4. Observe the 401 Unauthorized error.

Note: In components -> embedding -> OpenAIEmbedder.py, lines 25-44, I commented out the code because self.get_models() also operates on the wrong URL. I generally don't need the Azure embedder, as I use an embedding model from sentence-transformers. However, Verba still checked whether it could connect to the embedding model on Azure, ignoring the configuration in the .env file. Because of this check, the Verba interface (homepage) didn't even start. After commenting out these lines, Verba started and I was able to import some documents and retrieve chunks, but without the Generator, due to the connection problem described above. A less invasive workaround than commenting the block out is sketched below.
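A hypothetical guard around the startup connectivity check, so a failing Azure URL doesn't block the UI (I don't have the exact lines 25-44 at hand, so the names here are illustrative, not Verba's):

# Hypothetical sketch: fetch_models_from_api stands in for whatever
# self.get_models() calls internally in OpenAIEmbedder.py.
def get_models(self):
    try:
        return self.fetch_models_from_api()  # illustrative name
    except Exception as e:
        print(f"Could not reach the embedding endpoint, using fallback: {e}")
        return ["text-embedding-3-large"]  # default from the .env above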

Additional context

Note: goldenverba v1.0.4 can connect to Azure with the same .env file and configuration (that version used the openai library to communicate with Azure; as far as I can see, version 2.0.0 uses different logic).
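For comparison, the openai-library route handles the Azure URL construction and the api-key header internally; roughly something like this (a sketch based on the openai Python package >= 1.0, not on Verba's v1.0.4 source):

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="api_key",  # placeholder
    api_version="2023-05-15",
    azure_endpoint="https://resource_name.openai.azure.com",
)

# The library builds the /openai/deployments/<deployment>/... URL itself.
response = client.chat.completions.create(
    model="openai_model",  # the Azure deployment name
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)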

bw-Deejee commented 1 week ago

I am also struggling to understand how to get the Azure OpenAI API to work in 2.0.0.

bw-Deejee commented 6 days ago

I actually got Azure OpenAI to work 🎉. Not the cleanest solution, but here it is for anyone who might find it interesting:

1. Create a new generator file in "goldenverba > components > generation", called AzureOpenAIGenerator.py, containing the following code (replace placeholders with your URL/key):

import os
from dotenv import load_dotenv
from goldenverba.components.interfaces import Generator
from goldenverba.components.types import InputConfig
from goldenverba.components.util import get_environment
import httpx
import json

load_dotenv()

class AzureOpenAIGenerator(Generator):
    """
    Azure OpenAI Generator.
    """

    def __init__(self):
        super().__init__()
        self.name = "AzureOpenAI"
        self.description = "Using Azure OpenAI LLM models to generate answers to queries"
        self.context_window = 10000

        models = ["gpt-4o", "gpt-3.5-turbo"]

        self.config["Model"] = InputConfig(
            type="dropdown",
            value=models[0],
            description="Select an Azure OpenAI Model",
            values=models,
        )

        if os.getenv("AZURE_OPENAI_API_KEY") is None:
            self.config["API Key"] = InputConfig(
                type="password",
                value="<ADD YOUR AZURE API KEY HERE>_",
                description="You can set your Azure OpenAI API Key here or set it as environment variable `AZURE_OPENAI_API_KEY`",
                values=[],
            )
        if os.getenv("AZURE_OPENAI_BASE_URL") is None:
            self.config["URL"] = InputConfig(
                type="text",
                value="https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME",
                description="You can change the Base URL here if needed",
                values=[],
            )

    async def generate_stream(
        self,
        config: dict,
        query: str,
        context: str,
        conversation: list[dict] = [],
    ):
        system_message = config.get("System Message").value
        print(system_message)

        model = config.get("Model", {"value": "gpt-3.5-turbo"}).value
        print(model)

        azure_key = get_environment(
            config, "API Key", "AZURE_OPENAI_API_KEY", "No Azure OpenAI API Key found"
        )
        print(azure_key)

        azure_url = get_environment(
            config, "URL", "AZURE_OPENAI_BASE_URL", "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME"
        )
        print(azure_url)

        messages = self.prepare_messages(query, context, conversation, system_message)
        print(messages)

        headers = {
            "Content-Type": "application/json",
            "api-key": azure_key,
        }
        data = {
            "messages": messages,
            "model": model,
            "stream": True,
        }

        async with httpx.AsyncClient() as client2:
            async with client2.stream(
                "POST",
                f"{azure_url}/chat/completions?api-version=2023-03-15-preview",
                json=data,
                headers=headers,
                timeout=None,
            ) as response:
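                # Azure streams Server-Sent Events; each payload line is prefixed with "data: "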
                async for line in response.aiter_lines():
                    if line.startswith("data: "):
                        if line.strip() == "data: [DONE]":
                            break
                        json_line = json.loads(line[6:])
                        choice = json_line["choices"][0]
                        if "delta" in choice and "content" in choice["delta"]:
                            yield {
                                "message": choice["delta"]["content"],
                                "finish_reason": choice.get("finish_reason"),
                            }
                        elif "finish_reason" in choice:
                            yield {
                                "message": "",
                                "finish_reason": choice["finish_reason"],
                            }
                    else:
                        print(response)

    def prepare_messages(
        self, query: str, context: str, conversation: list[dict], system_message: str
    ) -> list[dict]:
        messages = [
            {
                "role": "system",
                "content": system_message,
            }
        ]

        for message in conversation:
            messages.append({"role": message.type, "content": message.content})

        messages.append(
            {
                "role": "user",
                "content": f"Answer this query: '{query}' with this provided context: {context}",
            }
        )

        return messages

2. Add the new generator to the manager in "goldenverba > components > managers.py":

add to line 68: from goldenverba.components.generation.AzureOpenAIGenerator import AzureOpenAIGenerator

also to line 110:

    generators = [
        OllamaGenerator(),
        OpenAIGenerator(),
        AnthropicGenerator(),
        CohereGenerator(),
        AzureOpenAIGenerator(),
    ]

and to line 137:

    generators = [
        OpenAIGenerator(),
        AnthropicGenerator(),
        CohereGenerator(),
        AzureOpenAIGenerator(),
    ]

3. If you are (like me) using this behind a proxy, modify the code in "goldenverba > components > managers.py" around line 1215:

        import httpx

        async with httpx.AsyncClient(proxy="http://YOUR_PROXYSERVER:YOUR_PORT") as client:
            async for result in self.generators[generator].generate_stream(
                generator_config, query, context, conversation
            ):
                yield result

Don't ask me how exactly this works. I tried adding the proxy to the AsyncClient() in AzureOpenAIGenerator.py, but only the solution above finally worked for me.
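If you'd rather not patch managers.py, httpx also honors the standard proxy environment variables when trust_env is enabled (the default), so setting them before Verba starts might achieve the same effect; a sketch, untested on my side:

import os

# Set these before any httpx.AsyncClient is created; httpx picks them up
# automatically because trust_env defaults to True.
os.environ["HTTPS_PROXY"] = "http://YOUR_PROXYSERVER:YOUR_PORT"
os.environ["HTTP_PROXY"] = "http://YOUR_PROXYSERVER:YOUR_PORT"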