braintrustdata / autoevals

AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.

Add customizable embedding model to `AnswerRelevancy` metric #94

Closed mongodben closed 1 month ago

mongodben commented 1 month ago

Currently, the embedding model used by the Ragas `AnswerRelevancy` metric isn't configurable. It falls back to the `EmbeddingSimilarity` default, which is currently `"text-embedding-ada-002"`.

This PR makes the model configurable.

This is useful for any user who wants to use a model other than `"text-embedding-ada-002"`, and for Azure OpenAI users who use that model but deploy it under a different name (deployment names are configurable in the Azure OpenAI service).
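The change follows a common configurable-default pattern: thread an optional constructor argument through to the inner scorer instead of hard-coding its default. A minimal sketch of that pattern is below; the class and parameter names (`embedding_model`, etc.) are illustrative assumptions, not the actual autoevals API surface.

```python
# Hedged sketch of the configurable-default pattern this PR applies.
# Names here are hypothetical; see the autoevals source for the real API.

DEFAULT_EMBEDDING_MODEL = "text-embedding-ada-002"  # the current hard-coded default


class EmbeddingSimilarity:
    def __init__(self, model: str = DEFAULT_EMBEDDING_MODEL):
        self.model = model


class AnswerRelevancy:
    # Before the PR: EmbeddingSimilarity() was constructed with no model argument.
    # After the PR: the caller's choice is passed through.
    def __init__(self, embedding_model: str = DEFAULT_EMBEDDING_MODEL):
        self.similarity = EmbeddingSimilarity(model=embedding_model)


# Default behavior is unchanged (backward compatible):
assert AnswerRelevancy().similarity.model == "text-embedding-ada-002"
# Azure OpenAI users can pass their own deployment name:
assert AnswerRelevancy(embedding_model="my-ada-deployment").similarity.model == "my-ada-deployment"
```

Keeping the old model as the default value means existing callers see no behavior change while new callers gain the override.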

mongodben commented 1 month ago

@ankrgyl back to you

github-actions[bot] commented 1 month ago

Braintrust eval report

Autoevals (main-1725908744)

| Score | Average | Improvements | Regressions |
|---|---|---|---|
| NumericDiff | 73.5% (0pp) | 5 🟢 | 7 🔴 |
| Duration | 2.53s (+1s) | - | 100 🔴 |
| Llm_duration | 1.96s | - | - |
| Prompt_tokens | 279.25 (+0) | - | - |
| Completion_tokens | 16.9 (+0.04) | 13 🟢 | 14 🔴 |
| Total_tokens | 296.14 (+0.04) | 13 🟢 | 14 🔴 |
| Estimated_cost | $0 | - | - |