
πŸ€— Evaluate: A library for easily evaluating machine learning models and datasets.
Documentation: https://huggingface.co/docs/evaluate
License: Apache License 2.0

Tip: For more recent evaluation approaches, for example for evaluating LLMs, we recommend our newer and more actively maintained library LightEval.

πŸ€— Evaluate is a library that makes evaluating and comparing models and reporting their performance easier and more standardized.

It currently contains:

implementations of dozens of popular metrics, covering tasks from NLP to Computer Vision, ready to use with any framework (NumPy/Pandas/PyTorch/TensorFlow/JAX)

comparisons, which quantify the difference between the predictions of two models, and measurements, which are tools for evaluating datasets

an easy way of adding new evaluation modules and sharing them on the πŸ€— Hub

πŸŽ“ Documentation

πŸ”Ž Find a metric, comparison, measurement on the Hub

🌟 Add a new evaluation module

πŸ€— Evaluate also has lots of useful features like:

Type checking: input types are checked to make sure that you are using the right input format for each metric

Metric cards: each metric comes with a card that describes its values, limitations and ranges, as well as examples of its use

Community modules: metrics, comparisons and measurements live on the Hugging Face Hub, so you can easily add your own or collaborate with others
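To make this concrete, here is a minimal sketch that loads one module of each type. The module names ("accuracy", "exact_match", "word_length") are illustrative and assume those modules are available on the Hub; the measurement additionally relies on NLTK data being installed.

```python
import evaluate

# A metric scores model predictions against references
accuracy = evaluate.load("accuracy")
print(accuracy.compute(references=[0, 1, 0, 1], predictions=[1, 0, 0, 1]))

# A comparison contrasts the predictions of two models
exact_match = evaluate.load("exact_match", module_type="comparison")
print(exact_match.compute(predictions1=[0, 1, 1], predictions2=[0, 1, 0]))

# A measurement describes a dataset rather than a model
word_length = evaluate.load("word_length", module_type="measurement")
print(word_length.compute(data=["hello world", "this is a longer sentence"]))

# Each module also ships with its metric card
print(accuracy.description)
```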

Installation

With pip

πŸ€— Evaluate can be installed from PyPI and should be installed in a virtual environment (e.g. venv or conda)

pip install evaluate
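As a quick sanity check of the installation, you can load a small module and run it end to end; a minimal sketch, assuming the "exact_match" metric is available on the Hub:

```python
import evaluate

# Should print {'exact_match': 1.0} if everything is set up correctly
print(evaluate.load("exact_match").compute(references=["hello"], predictions=["hello"]))
```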

Usage

πŸ€— Evaluate's main methods are:

evaluate.list_evaluation_modules() to list the available metrics, comparisons and measurements

evaluate.load(module_name, **kwargs) to instantiate an evaluation module

module.compute(**kwargs) to compute the result of an evaluation module
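Put together, a typical flow looks roughly like the sketch below, using the "accuracy" metric as an example (the printed values assume standard accuracy):

```python
import evaluate

# Discover which metrics are available on the Hub
print(evaluate.list_evaluation_modules(module_type="metric")[:5])

# Instantiate a module and compute a result directly
accuracy = evaluate.load("accuracy")
print(accuracy.compute(references=[0, 1, 0, 1], predictions=[1, 0, 0, 1]))
# {'accuracy': 0.5}

# Or accumulate batches (e.g. inside an evaluation loop) and compute at the end
for refs, preds in [([0, 1], [0, 1]), ([0, 1], [1, 1])]:
    accuracy.add_batch(references=refs, predictions=preds)
print(accuracy.compute())
# {'accuracy': 0.75}
```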

Adding a new evaluation module

First install the necessary dependencies to create a new metric with the following command:

pip install evaluate[template]

Then you can get started with the following command, which will create a new folder for your metric and display the necessary steps:

evaluate-cli create "Awesome Metric"

See the step-by-step guide in the documentation for detailed instructions.
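For orientation, a new module boils down to a subclass of evaluate.Metric that declares its inputs in _info and implements _compute. The sketch below is a simplified, hypothetical example rather than the exact generated template:

```python
import datasets
import evaluate


class AwesomeMetric(evaluate.Metric):
    def _info(self):
        # Describes the module and declares the type-checked input schema
        return evaluate.MetricInfo(
            description="Fraction of predictions that exactly match the references.",
            citation="",
            inputs_description="Args: predictions and references, lists of integers.",
            features=datasets.Features(
                {
                    "predictions": datasets.Value("int64"),
                    "references": datasets.Value("int64"),
                }
            ),
        )

    def _compute(self, predictions, references):
        # Called by compute() with the accumulated inputs
        score = sum(p == r for p, r in zip(predictions, references)) / len(references)
        return {"awesome_score": score}
```

Once pushed to a Space on the Hub (or kept as a local script), the module should be loadable with evaluate.load like any built-in one; the guide covers the exact steps.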

Credits

Thanks to @marella for letting us use the evaluate namespace on PyPI, previously used by his library.