TonicAI / tonic_validate

Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
https://docs.tonic.ai/validate/
MIT License

Request for OpenAI Assistant Examples #91

Closed: williamzebrowskI closed this issue 3 months ago

williamzebrowskI commented 3 months ago

This is not an issue but a request for further examples associated with testing OpenAI Assistants.

Currently, there is one example that compares a CustomGPT to an Assistant, but I was looking for more if possible. That example shows the use of AnswerSimilarityMetric, and I'm curious whether there are any more examples you can share on validating an Assistant's performance using other metrics.

Thanks!

akamor commented 3 months ago

Hey @williamzebrowskI. Thanks for filing. What other metrics would you like to use? Our README lists 6 metrics (including AnswerSimilarityMetric), and in addition to those we have roughly 15 more.

Go here for the README.

From the root of the repo, go to tonic_validate/metrics to see the additional metrics. Most of them are self-explanatory, but I'm happy to provide clarifications if you need them.

When you know which metrics you wish to use just pass them in as a list to the ValidateScorer constructor, e.g.:

from tonic_validate import ValidateScorer
from tonic_validate.metrics import AnswerConsistencyMetric, AugmentationAccuracyMetric

# Pass the metrics you want as a list; model_evaluator picks the LLM used to judge.
scorer = ValidateScorer([
    AnswerConsistencyMetric(),
    AugmentationAccuracyMetric()
], model_evaluator="gpt-3.5-turbo")
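For reference, here is a minimal sketch of how a scorer configured this way might be run end to end, following the Benchmark/callback pattern from the project's README; the sample question, answer, and the get_llm_response stub below are placeholders for illustration, not part of this thread:

from tonic_validate import Benchmark, ValidateScorer
from tonic_validate.metrics import AnswerConsistencyMetric, AugmentationAccuracyMetric

# A tiny benchmark: one question paired with its reference answer (placeholder data).
benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"]
)

# Placeholder callback: in a real setup this would query your Assistant and
# return its answer along with the retrieved context passages.
def get_llm_response(question):
    return {
        "llm_answer": "Paris",
        "llm_context_list": ["Paris is the capital of France."]
    }

scorer = ValidateScorer([
    AnswerConsistencyMetric(),
    AugmentationAccuracyMetric()
], model_evaluator="gpt-3.5-turbo")

# Runs every metric in the list against each benchmark question.
run = scorer.score(benchmark, get_llm_response)
print(run.overall_scores)
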
akamor commented 3 months ago

If all of this makes sense and you are good, just let me know so we can close out the issue.

akamor commented 3 months ago

Closing for now, but happy to re-open if you still have questions @williamzebrowskI.