Mozilla Public License 2.0

PRESC: Performance and Robustness Evaluation for Statistical Classifiers


PRESC is a toolkit for the evaluation of machine learning classification models. Its goal is to provide insights into model performance that extend beyond standard scalar accuracy-based measures, into areas which tend to be underexplored in application, including:

More details about the specific features we are considering are presented in the project roadmap. We believe that these evaluations are essential for developing confidence in the selection and tuning of machine learning models intended to address user needs, and are important prerequisites towards building trustworthy AI.
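As a generic illustration of why scalar accuracy alone can be insufficient (this uses plain scikit-learn, not the PRESC API), a single overall accuracy score can mask large per-class differences that finer-grained evaluation surfaces:

```python
# Illustration only (not the PRESC API): on an imbalanced problem,
# overall accuracy can look good while one class is served poorly.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary classification data (90% / 10% classes).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
y_pred = model.predict(X_te)

overall = accuracy_score(y_te, y_pred)
per_class = recall_score(y_te, y_pred, average=None)  # recall for each class
```

Comparing `overall` against the entries of `per_class` shows the kind of gap that per-class or per-region evaluation is designed to expose.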

It also includes a package for creating copies of machine learning classifiers.
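The idea behind classifier copying can be sketched as follows (a minimal scikit-learn illustration of the concept, not the package's API): a copy model is trained on synthetic points labeled by the original classifier, so that the copy mimics the original's decision behavior.

```python
# Conceptual sketch of classifier copying (not the PRESC API).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
original = SVC().fit(X, y)

# Sample synthetic points from the feature space and label them
# with the original classifier's predictions.
rng = np.random.default_rng(0)
X_synth = rng.uniform(X.min(axis=0), X.max(axis=0), size=(2000, X.shape[1]))
y_synth = original.predict(X_synth)

# The copy learns to reproduce the original's decision function.
copy = DecisionTreeClassifier(random_state=0).fit(X_synth, y_synth)
agreement = (copy.predict(X) == original.predict(X)).mean()
```

The fraction `agreement` measures how closely the copy reproduces the original's predictions on the real data.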

As a tool, PRESC is intended for use by ML engineers to assist in the development and updating of models. It is usable in the following ways:

A further goal is to use PRESC:

For the time being, the following are considered out of scope:

There is a considerable body of recent academic research addressing these topics, as well as a number of open-source projects solving related problems. Where possible, we plan to offer integration with existing tools which align with our vision and goals.

Documentation

Project documentation is available here and provides much more detail, including:

Examples

An example script demonstrating how to run a report is available here.

There are a number of notebooks and explorations in the examples/ dir, but they are not guaranteed to run or be up-to-date, as the package has recently undergone major changes and we have not yet finished updating them.

Some well-known datasets are provided in CSV format in the datasets/ dir for exploration purposes.
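These CSV files can be loaded with pandas in the usual way; the snippet below uses an inline stand-in for one of the files (the column names here are hypothetical, not those of any actual file in datasets/):

```python
# Pattern for exploring a CSV dataset: load it, then split into
# features and label. The inline text stands in for a real file path.
import io

import pandas as pd

csv_text = "feature_a,feature_b,label\n1.0,2.0,yes\n0.5,1.5,no\n"
df = pd.read_csv(io.StringIO(csv_text))  # for a real file: pd.read_csv("datasets/<name>.csv")

X = df.drop(columns="label")  # feature columns
y = df["label"]               # target column
```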

Notes for contributors

Contributions are welcome. We are using the repo issues to manage project tasks in alignment with the roadmap, as well as for hosting discussions. You can also reach out on Gitter (https://gitter.im/PRESC-outreachy/community).

We recommend that submissions for new feature implementations include a Jupyter notebook demonstrating their application to a real-world dataset and model.

This repo adheres to the Python black code style, which is enforced by a pre-commit hook (see below).

Along with code contributions, we welcome general feedback:

The development of the ML Classifier Copies package is being carried out in the branch model-copying.

Setting up a dev environment

Make sure you have conda (e.g. Miniconda) installed. conda init should be run during installation to set the PATH properly.

Set up and activate the environment. This will also enable a pre-commit hook to verify that code conforms to flake8 and black formatting rules. On Windows, these commands should be run from the Anaconda command prompt.

$ conda env create -f environment.yml
$ conda activate presc
$ python -m pip install -e .
$ pre-commit install

To run tests:

$ pytest

Acknowledgements

This project is maintained by Mozilla's Data Science team. We have also received code contributions from participants in the following programs, and we are grateful for their support:

The ML Classifier Copying package is being funded through the NGI0 Discovery Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825322.