mozilla-ai / lm-buddy

Your buddy in the (L)LM space.
Apache License 2.0

[prometheus] improve logging capabilities #81

Closed aittalam closed 1 month ago

aittalam commented 7 months ago

The current version of the Prometheus entrypoint mimics kaistai's eval: it saves the eval outputs to a JSON file together with the input data, which makes comparison easier (e.g. all questions, model responses, and GPT-4 + Prometheus scores sit side by side, ready to be compared).
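For illustration, a minimal sketch of that kind of output: one JSON record per example, pairing the input with the model response and both scores. The function name `save_eval_records` and the field names are hypothetical, not the entrypoint's actual schema.

```python
import json
from pathlib import Path


def save_eval_records(questions, responses, gpt4_scores, prometheus_scores, out_path):
    """Write one JSON record per example, pairing inputs with outputs and scores.

    All names and keys here are illustrative assumptions, not lm-buddy's schema.
    """
    lengths = {len(questions), len(responses), len(gpt4_scores), len(prometheus_scores)}
    if len(lengths) != 1:
        raise ValueError("all input sequences must have the same length")

    # Keep input data and eval outputs in the same record for easy comparison.
    records = [
        {
            "question": question,
            "response": response,
            "gpt4_score": gpt4,
            "prometheus_score": prometheus,
        }
        for question, response, gpt4, prometheus in zip(
            questions, responses, gpt4_scores, prometheus_scores
        )
    ]

    Path(out_path).write_text(
        json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return records
```

Storing each example as a single record means the file can be loaded into a table later without re-joining inputs and scores.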

This can be improved, e.g.:

aittalam commented 1 month ago

Closed because this is no longer relevant. We might pick it up again if we decide to add LLM-as-judge back into our evals, but it would likely be part of a larger effort.