braintrustdata / autoevals

AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
MIT License

General Question about the Evaluator LLM #51

Open lalehsg opened 9 months ago

lalehsg commented 9 months ago

Hi! I am wondering if it's possible to use open-source or self-deployed LLMs (and not only OpenAI) as the judge or evaluator. If yes, could you please point me to an example or the part of the docs explaining that? Thanks!

ankrgyl commented 9 months ago

Hi there! The best way to use a third-party LLM today is to provide a `base_url` or set `OPENAI_BASE_URL` to point to a proxy that supports other LLM providers through the OpenAI protocol.

In fact, autoevals by default uses the Braintrust Proxy, which supports a very wide variety of LLMs and providers. If you sign up for a Braintrust account, you can plug in API keys for various inference providers, like Together, and access a wide variety of other models that way.
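
For example, here is a minimal sketch of pointing autoevals at an OpenAI-compatible endpoint via environment variables. The proxy URL and key are placeholders for your own deployment; the `Factuality` usage follows the pattern from the README:

```python
import os

# Point the OpenAI client that autoevals uses at an OpenAI-compatible
# proxy (the URL and key here are placeholders for your own deployment).
os.environ["OPENAI_BASE_URL"] = "http://localhost:8000/v1"
os.environ["OPENAI_API_KEY"] = "your-proxy-api-key"

from autoevals.llm import Factuality

# The evaluator now routes its judge calls through the proxy above.
evaluator = Factuality()
result = evaluator(
    output="People's Republic of China",
    expected="China",
    input="Which country has the highest population?",
)
print(result.score, result.metadata)
```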

If you have other ideas in mind, we welcome contributions!

mongodben commented 3 months ago

> If you have other ideas in mind, we welcome contributions!

I think it'd be nice if you could pass an LLM client as a property to the evaluators, rather than just the OpenAI URL and API key. This would give folks more control over where the evaluator interface meets the LLM interface, and would allow for things like mocking. Perhaps you could let users pass an object that implements the OpenAI client interface, rather than passing the URL and API key directly.
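
For illustration, a hypothetical sketch of what that could look like; the `client` parameter below is the proposal, not an existing autoevals argument:

```python
from openai import OpenAI
from autoevals.llm import Factuality

# Any object implementing the OpenAI client interface could be passed in,
# including a mock for testing. The `client` kwarg below is hypothetical;
# it illustrates the proposal and is not part of the current autoevals API.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="your-api-key")

evaluator = Factuality(client=client)  # hypothetical parameter
result = evaluator(
    output="Paris",
    expected="Paris, France",
    input="What is the capital of France?",
)
```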

ankrgyl commented 3 months ago

We're open to contributions!