deepset-ai / haystack

:mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Add support for `AzureOpenAI` in `LLMEvaluator` #7946

Open EdoardoAbatiTR opened 1 week ago

EdoardoAbatiTR commented 1 week ago

Is your feature request related to a problem? Please describe.

`LLMEvaluator` currently only supports OpenAI; it would be nice if we could also use it with OpenAI models served via Azure.

Describe the solution you'd like

I'd like to use the evaluators with Azure OpenAI (e.g. `ContextRelevanceEvaluator(api='azure-openai')`).

In addition, I propose slightly changing the design of `LLMEvaluator` to allow more flexibility. Currently the parameter `api_key=Secret.from_env_var("OPENAI_API_KEY")` forces the user to provide an environment variable that is specific to OpenAI and would not be used by other generators.

What about having something like the following?

from typing import Any, Dict, Optional

@component
class LLMEvaluator:
    def __init__(
        self,
        instructions: str,
        ...
        api: str = "openai",
        generator_kwargs: Optional[Dict[str, Any]] = None,  # instead of api_key
    ):
        ...
        self.generator = OpenAIGenerator(**(generator_kwargs or {}))


This wouldn't force the user to pass anything generator-specific to `LLMEvaluator`. It gives the flexibility to pass anything the generator accepts (e.g. API keys, the API version, or `azure_deployment` in the case of Azure) via `generator_kwargs`. At the same time, if the user doesn't pass anything, the generator would still look for its required environment variables during instantiation.
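For illustration, under the proposed design an Azure-backed evaluator could be configured roughly like this. This is a hypothetical sketch: `generator_kwargs` does not exist yet, and the deployment name and API version are placeholders (the keys follow `AzureOpenAIGenerator`'s parameters):

```python
from haystack.components.evaluators import ContextRelevanceEvaluator

# Hypothetical usage under the proposed design: everything Azure-specific
# goes through generator_kwargs; no OpenAI-specific env var is required.
evaluator = ContextRelevanceEvaluator(
    api="azure-openai",
    generator_kwargs={
        "azure_deployment": "gpt-4o-eval",  # placeholder deployment name
        "api_version": "2024-02-01",        # placeholder API version
    },
)
```

If `generator_kwargs` is omitted, the Azure generator would still pick up its own environment variables (e.g. `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT`) during instantiation.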

I guess `api_key` needs to go through the deprecation cycle before being removed. Maybe we could just change it to `api_key=Secret.from_env_var("OPENAI_API_KEY", strict=False)` until then, so that the variable is not required when another generator is used.
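As a quick sanity check of that behavior: with `strict=False`, resolving a missing variable returns `None` instead of raising (a minimal sketch based on Haystack's `Secret` utility):

```python
from haystack.utils import Secret

# With strict=False, resolving a missing env var yields None instead of
# raising, so the OpenAI-specific variable wouldn't block other generators.
api_key = Secret.from_env_var("OPENAI_API_KEY", strict=False)
print(api_key.resolve_value())  # None when OPENAI_API_KEY is unset
```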

Describe alternatives you've considered

Subclassing `LLMEvaluator` (and all of its child classes) into custom components; see the rough sketch below.
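For completeness, that workaround would look roughly like this (a sketch only; it assumes the parent stores its generator as `self.generator`, as in the proposal above, and it would have to be repeated for every child evaluator):

```python
from haystack.components.evaluators import LLMEvaluator
from haystack.components.generators import AzureOpenAIGenerator
from haystack.utils import Secret


class AzureLLMEvaluator(LLMEvaluator):
    """Sketch of the subclassing workaround."""

    def __init__(self, *args, **kwargs):
        # Dummy token so the parent's OpenAIGenerator doesn't insist on
        # OPENAI_API_KEY being set at instantiation time.
        kwargs.setdefault("api_key", Secret.from_token("unused"))
        super().__init__(*args, **kwargs)
        # Swap in the Azure generator (reads its own env vars).
        self.generator = AzureOpenAIGenerator()
```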

Additional context

Happy to hear your thoughts, especially if there are better solutions I haven't considered. :)

I'm currently a bit busy with other things, but I may be able to raise a PR with the proposal in the next few days.

lbux commented 1 week ago

I solved this in my PR for local evaluation support but decided not to proceed with it: https://github.com/deepset-ai/haystack/pull/7745

You can take what I built, strip the llama.cpp bits, and keep the generation_kwargs sections.

lbux commented 3 days ago

After looking into it more, Azure has its own `AzureOpenAI` class in the `openai` package. While some other services expose an OpenAI-compatible API and let us redirect the client to their endpoint (local or hosted), that no longer seems to be possible for Azure via `base_url`: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/migration
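To illustrate the difference with the `openai` package's two client classes (the endpoint, version, and key values below are placeholders):

```python
from openai import AzureOpenAI, OpenAI

# OpenAI-compatible services can be reached by redirecting the standard
# client via base_url (a hypothetical local server here):
client = OpenAI(base_url="http://localhost:8080/v1", api_key="placeholder")

# Azure requires its dedicated client class with Azure-specific settings:
azure_client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_version="2024-02-01",                               # placeholder
    api_key="placeholder",
)
```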

However, I was able to extend the LLM evaluators to support generator parameters for OpenAI, and if any other providers are added in the future, the same mechanism would work for them as well.