danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://docs.danswer.dev/

Support Azure GPT-4 Turbo endpoint for image generation and maximum number of tokens #1623

Open plopezamaya opened 3 weeks ago

plopezamaya commented 3 weeks ago

Image generation is supported only for "openai" providers with a model name equal to gpt-4-turbo.

Some examples of this can be seen in v3.0.79:

backend/danswer/chat/process_message.py:

elif tool_cls.__name__ == ImageGenerationTool.__name__:
    # Only "openai" providers are ever considered as a DALL-E key source.
    dalle_key = None
    if llm and llm.config.api_key and llm.config.model_provider == "openai":
        dalle_key = llm.config.api_key
    else:
        llm_providers = fetch_existing_llm_providers(db_session)
        openai_provider = next(
            iter(
                [
                    llm_provider
                    for llm_provider in llm_providers
                    if llm_provider.provider == "openai"
                ]
            ),
            None,
        )
        if not openai_provider or not openai_provider.api_key:
            raise ValueError(
                "Image generation tool requires an OpenAI API key"
            )
        dalle_key = openai_provider.api_key
    tools.append(ImageGenerationTool(api_key=dalle_key))
and in the web UI's model check:

function checkLLMSupportsImageGeneration(provider: string, model: string) {
  return provider === "openai" && model === "gpt-4-turbo";
}
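A minimal sketch of the kind of relaxed gate this issue argues for, in Python for symmetry with the backend (the helper name and the capable-model set are assumptions for illustration, not the project's API):

# Assumed set of base models that can drive image generation; the real
# list would live in configuration.
IMAGE_GEN_CAPABLE_BASE_MODELS = {"gpt-4-turbo"}

def llm_supports_image_generation(provider: str, model_base_name: str) -> bool:
    # Gate on an explicitly configured base model rather than the raw
    # model name, so Azure deployment names can follow any convention.
    return (
        provider in ("openai", "azure")
        and model_base_name in IMAGE_GEN_CAPABLE_BASE_MODELS
    )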

Azure deployments do not always use gpt-4-turbo as the model name, i.e. they can follow site-specific deployment naming conventions. Such a model will therefore be recognized neither for image generation nor for the maximum number of tokens looked up via the litellm.model_cost function:

def get_llm_max_tokens(
    model_map: dict,
    model_name: str,
    model_provider: str = GEN_AI_MODEL_PROVIDER,
) -> int:
    """Best effort attempt to get the max tokens for the LLM"""
    if GEN_AI_MAX_TOKENS:
        # This is an override, so always return this
        return GEN_AI_MAX_TOKENS

    try:
        model_obj = model_map.get(f"{model_provider}/{model_name}")
        if not model_obj:
            # Raises KeyError when the (Azure deployment) name is not a
            # litellm catalog key.
            model_obj = model_map[model_name]

        if "max_input_tokens" in model_obj:
            return model_obj["max_input_tokens"]

        if "max_tokens" in model_obj:
            return model_obj["max_tokens"]

        raise RuntimeError("No max tokens found for LLM")
    # (the except/fallback clause is truncated in this excerpt)
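To make the failure mode concrete, a quick check against litellm's model catalog (the deployment name is illustrative, and exact catalog keys depend on the litellm version):

import litellm

# The catalog knows the base model...
print("gpt-4-turbo" in litellm.model_cost)  # True on recent litellm versions

# ...but not an arbitrary Azure deployment name, so both lookups in
# get_llm_max_tokens above miss and the error path is taken.
deployment = "prd-myprojectapi-gpt4-turbo-eu-west-3"
print(f"azure/{deployment}" in litellm.model_cost)  # False
print(deployment in litellm.model_cost)             # False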

One idea could be to have a model type set in the UI (e.g. gpt-4-turbo) and leave the model name free for the deployment. Using this model type you can then check for image generation support and for the cost, and use the deployment name to query the endpoint, as in the sketch below.
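A sketch of that split, assuming a new model_type field (the function and field names are hypothetical):

import litellm

def get_max_tokens_for(model_type: str, model_provider: str) -> int:
    # Limits and capabilities are resolved from the configured model type...
    model_obj = litellm.model_cost.get(
        f"{model_provider}/{model_type}"
    ) or litellm.model_cost.get(model_type)
    if not model_obj:
        raise RuntimeError("No max tokens found for LLM")
    return model_obj.get("max_input_tokens") or model_obj["max_tokens"]

# ...while the deployment name is only ever sent to the endpoint itself, e.g.
# litellm.completion(model="azure/prd-myprojectapi-gpt4-turbo-eu-west-3", ...)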

plopezamaya commented 1 week ago

Also note that when deploying models on Azure the model name will not always be gpt-4-turbo-2024-04-09 or one of the litellm model names. This means that for Azure or custom providers there should be a separate "model base name" field. For example, a deployment named prd-myprojectapi-gpt35-turbo-eu-west-3 on Azure still serves a standard base model (here gpt-35-turbo) and should resolve to that base model's properties.

This would allow using the Azure deployments/endpoints while resolving the maximum number of tokens dynamically from the base model, instead of pinning a single global maximum via GEN_AI_MAX_TOKENS.
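Concretely, the provider record could carry both names; the model_base_name field below is hypothetical:

azure_provider = {
    "provider": "azure",
    "model_name": "prd-myprojectapi-gpt35-turbo-eu-west-3",  # what the endpoint expects
    "model_base_name": "gpt-35-turbo",                       # what litellm lookups use
}

max_tokens = get_max_tokens_for(
    azure_provider["model_base_name"], azure_provider["provider"]
)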