chore: nicer message string for sql-syntax scorer

empirical-run / empirical

Test and evaluate LLMs and model configurations, across all the scenarios that matter for your application

https://docs.empirical.run

MIT License

143 stars, 11 forks

Closed: arjunattam closed 3 months ago

arjunattam commented 3 months ago

drive-by: spider example changes to avoid rate-limited logs

changeset-bot[bot] commented 3 months ago

Latest commit: 01e5402939139375dc017f9f16801250dbd7958e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

| Name | Type |
| -------------------- | ----- |
| @empiricalrun/scorer | Patch |
| @empiricalrun/core | Patch |
| @empiricalrun/cli | Patch |

github-actions[bot] commented 3 months ago

| Stats | Run #b295: gpt-3.5-turbo | Run #9338: gpt-4-turbo-preview |
| ------- | ------------------------ | ------------------------------ |
| outputs | 100% | 100% |
| is-json | 100% | 100% |

Total dataset samples: 2