empirical-run / empirical

Test and evaluate LLMs and model configurations, across all the scenarios that matter for your application
https://docs.empirical.run
MIT License
141 stars, 10 forks

feat: add support for grouping inputs #215

Status: Closed (saikatmitra91 closed this 2 months ago)

changeset-bot[bot] commented 2 months ago

🦋 Changeset detected

Latest commit: c48c9dee51f7856f1992fe1fbc519486809473cf

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages

| Name                 | Type  |
| -------------------- | ----- |
| @empiricalrun/scorer | Minor |
| @empiricalrun/types  | Minor |
| @empiricalrun/cli    | Minor |
| web                  | Minor |
| @empiricalrun/core   | Patch |

github-actions[bot] commented 2 months ago

Empirical Run Summary

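|             | Run #5a93: gpt-3.5-turbo | Run #e5d3: gpt-4-turbo-preview |
| ----------- | ------------------------ | ------------------------------ |
| Outputs     | 100%                     | 100%                           |
| Scores      |                          |                                |
| json-syntax | 100%                     | 100%                           |
| Avg latency | 850ms                    | 1800ms                         |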

Total dataset samples: 2