Open simonw opened 2 months ago
Running an eval from a YAML file (or a URL to a YAML file) will save a copy of that eval in the database, so anything you've run once can be run again using only the database its results were saved to.
This will also help with running evals over time, e.g. to see if the API version of a model gets different results compared to a few months ago.
Could be something like this:
Then later:
Without the `-d` option the default SQLite database for LLM would be used.
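To make the storage side of this concrete, here is a minimal sketch of saving a copy of an eval definition next to its results and reading it back later. It uses only Python's `sqlite3` and `hashlib`. The `evals` table and the `save_eval` / `load_eval` helpers are hypothetical names for illustration, not an existing LLM schema or API, and the YAML is stored as raw text rather than parsed.

```python
import hashlib
import sqlite3


def save_eval(db, name, yaml_text):
    """Store a copy of the eval definition and return its content-hash ID.

    Hypothetical schema: one row per distinct YAML document.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS evals ("
        "id TEXT PRIMARY KEY, name TEXT, yaml TEXT, "
        "created TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    # Hashing the content means re-running an unchanged eval adds no new row.
    eval_id = hashlib.sha256(yaml_text.encode("utf-8")).hexdigest()
    db.execute(
        "INSERT OR IGNORE INTO evals (id, name, yaml) VALUES (?, ?, ?)",
        (eval_id, name, yaml_text),
    )
    db.commit()
    return eval_id


def load_eval(db, name):
    """Return the most recently saved YAML for this name, or None."""
    row = db.execute(
        "SELECT yaml FROM evals WHERE name = ? ORDER BY rowid DESC LIMIT 1",
        (name,),
    ).fetchone()
    return row[0] if row else None


# Usage: an in-memory database stands in for the file passed with -d.
db = sqlite3.connect(":memory:")
save_eval(db, "demo", "prompt: hi\n")
stored = load_eval(db, "demo")  # the YAML needed to run the eval again
```

Keying rows on a hash of the YAML is one possible design choice: it keeps every distinct version of an eval, which would help when comparing results from runs that are months apart.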