mozilla-ai / lm-buddy

Your buddy in the (L)LM space.
Apache License 2.0

New huggingface eval for the summarization use case with rouge, meteor, and bertscore #100

Closed aittalam closed 5 months ago

aittalam commented 5 months ago

What's changing

Added a new evaluate huggingface entrypoint that supports evaluating local and remote models (seq2seq, causal, OpenAI, vLLM, llamafile), as well as loading datasets from and saving results to S3.
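For reviewers less familiar with the metrics named in the title, here is a minimal sketch of what a ROUGE-1 style score measures: unigram overlap between a generated summary and a reference. This is a simplified stand-in written for illustration only. It is not the code in this PR and not the Hugging Face implementation (which handles tokenization and stemming differently). The function name rouge1_sketch and the example strings are hypothetical.

```python
from collections import Counter


def rouge1_sketch(prediction: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram precision, recall and F1.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    # Clipped overlap: a token counts at most as often as it appears
    # in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical summary pair, for illustration only.
    scores = rouge1_sketch("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

METEOR and BERTScore follow the same prediction-versus-reference pattern but match on stems/synonyms and on contextual embeddings respectively, so they are not reproduced here.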

How to test it

lm-buddy evaluate huggingface --config examples/configs/evaluation/hf_evaluate_config.yaml

Related Jira Ticket

https://mzai.atlassian.net/browse/MZPLATFORM-78

Additional notes for reviewers

I know we discussed sending messages to mzai-platform's backend directly from lm-buddy jobs. I am 100% in favor of that, but I wanted to keep this PR independent of messaging. I will open a separate PR for it, to be tested together with the updated mzai-platform code.