danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://docs.danswer.dev/

Support Azure GPT-4 Turbo endpoint for image generation and maximum number of tokens #1623

Open plopezamaya opened 3 weeks ago

plopezamaya commented 3 weeks ago

Image generation is supported only for "openai" providers with a model name equal to gpt-4-turbo.

Some examples of this can be seen in v3.0.79:

backend/danswer/chat/process_message.py:

elif tool_cls.__name__ == ImageGenerationTool.__name__:
    # Only "openai" providers are ever considered as a DALL-E key source.
    dalle_key = None
    if llm and llm.config.api_key and llm.config.model_provider == "openai":
        dalle_key = llm.config.api_key
    else:
        llm_providers = fetch_existing_llm_providers(db_session)
        openai_provider = next(
            iter(
                [
                    llm_provider
                    for llm_provider in llm_providers
                    if llm_provider.provider == "openai"
                ]
            ),
            None,
        )
        if not openai_provider or not openai_provider.api_key:
            raise ValueError(
                "Image generation tool requires an OpenAI API key"
            )
        dalle_key = openai_provider.api_key
    tools.append(ImageGenerationTool(api_key=dalle_key))
and in the web UI's model check:

function checkLLMSupportsImageGeneration(provider: string, model: string) {
  return provider === "openai" && model === "gpt-4-turbo";
}
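A minimal sketch of the kind of relaxed gate this issue argues for, in Python for symmetry with the backend (the helper name and the capable-model set are assumptions for illustration, not the project's API):

# Assumed set of base models that can drive image generation; the real
# list would live in configuration.
IMAGE_GEN_CAPABLE_BASE_MODELS = {"gpt-4-turbo"}

def llm_supports_image_generation(provider: str, model_base_name: str) -> bool:
    # Gate on an explicitly configured base model rather than the raw
    # model name, so Azure deployment names can follow any convention.
    return (
        provider in ("openai", "azure")
        and model_base_name in IMAGE_GEN_CAPABLE_BASE_MODELS
    )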

Azure deployments do not always use gpt-4-turbo as the model name, i.e. they can follow site-specific deployment naming conventions. Such a model will therefore be recognized neither for image generation nor for the maximum number of tokens looked up via the litellm.model_cost function:

def get_llm_max_tokens(
    model_map: dict,
    model_name: str,
    model_provider: str = GEN_AI_MODEL_PROVIDER,
) -> int:
    """Best effort attempt to get the max tokens for the LLM"""
    if GEN_AI_MAX_TOKENS:
        # This is an override, so always return this
        return GEN_AI_MAX_TOKENS

    try:
        model_obj = model_map.get(f"{model_provider}/{model_name}")
        if not model_obj:
            # Raises KeyError when the (Azure deployment) name is not a
            # litellm catalog key.
            model_obj = model_map[model_name]

        if "max_input_tokens" in model_obj:
            return model_obj["max_input_tokens"]

        if "max_tokens" in model_obj:
            return model_obj["max_tokens"]

        raise RuntimeError("No max tokens found for LLM")
    # (the except/fallback clause is truncated in this excerpt)
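To make the failure mode concrete, a quick check against litellm's model catalog (the deployment name is illustrative, and exact catalog keys depend on the litellm version):

import litellm

# The catalog knows the base model...
print("gpt-4-turbo" in litellm.model_cost)  # True on recent litellm versions

# ...but not an arbitrary Azure deployment name, so both lookups in
# get_llm_max_tokens above miss and the error path is taken.
deployment = "prd-myprojectapi-gpt4-turbo-eu-west-3"
print(f"azure/{deployment}" in litellm.model_cost)  # False
print(deployment in litellm.model_cost)             # False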

One idea could be to have a model type set in the UI (e.g. gpt-4-turbo) and leave the model name free for the deployment. Using this model type you can then check for image generation support and for the cost, and use the deployment name to query the endpoint, as in the sketch below.
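A sketch of that split, assuming a new model_type field (the function and field names are hypothetical):

import litellm

def get_max_tokens_for(model_type: str, model_provider: str) -> int:
    # Limits and capabilities are resolved from the configured model type...
    model_obj = litellm.model_cost.get(
        f"{model_provider}/{model_type}"
    ) or litellm.model_cost.get(model_type)
    if not model_obj:
        raise RuntimeError("No max tokens found for LLM")
    return model_obj.get("max_input_tokens") or model_obj["max_tokens"]

# ...while the deployment name is only ever sent to the endpoint itself, e.g.
# litellm.completion(model="azure/prd-myprojectapi-gpt4-turbo-eu-west-3", ...)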

plopezamaya commented 1 week ago

Also note that when deploying models on Azure the model name will not always be gpt-4-turbo-2024-04-09 or one of the litellm model names. This means that for Azure or custom providers there should be a separate "model base name" field. For example, a deployment named prd-myprojectapi-gpt35-turbo-eu-west-3 on Azure still serves a standard base model (here gpt-35-turbo) and should resolve to that base model's properties.

This would allow using the Azure deployments/endpoints while resolving the maximum number of tokens dynamically from the base model, instead of pinning a single global maximum via GEN_AI_MAX_TOKENS.
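Concretely, the provider record could carry both names; the model_base_name field below is hypothetical:

azure_provider = {
    "provider": "azure",
    "model_name": "prd-myprojectapi-gpt35-turbo-eu-west-3",  # what the endpoint expects
    "model_base_name": "gpt-35-turbo",                       # what litellm lookups use
}

max_tokens = get_max_tokens_for(
    azure_provider["model_base_name"], azure_provider["provider"]
)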