kboyd / Roc

Everything ROC and Precision-Recall curves.

All About Roc

Description

Roc is software for generating and working with ROC (Receiver Operating Characteristic) and PR (Precision-Recall) curves. These curves are typically used to evaluate classification approaches in areas such as Machine Learning, Statistics, Medicine, and Epidemiology. Students, scientists, and researchers are the target audience of this software.

The goal of this project is to provide software for evaluating ROC and PR curves that correctly implements both traditional and recent approaches, in languages that fit each investigator's environment.
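
For readers new to these curves, the following is a minimal sketch of how ROC and PR points can be derived from a ranking of labels (labels ordered from highest to lowest score). It is not the Roc library's API: the class and method names (CurvePointsSketch, rocPoints, prPoints) are hypothetical, and the sketch assumes there are no tied scores and that both classes are present.

```java
/** Illustrative sketch only; these names are hypothetical, not the Roc API. */
class CurvePointsSketch {

    /** Counts the positive labels and checks that both classes are present. */
    private static int countPositives(boolean[] rankedLabels) {
        int positives = 0;
        for (boolean label : rankedLabels) {
            if (label) {
                positives++;
            }
        }
        if (positives == 0 || positives == rankedLabels.length) {
            throw new IllegalArgumentException(
                "need at least one positive and one negative label");
        }
        return positives;
    }

    /**
     * ROC points as {false positive rate, true positive rate}.
     * Labels must be ordered from highest to lowest score (no ties assumed).
     * The first point is (0, 0), the threshold above every score.
     */
    static double[][] rocPoints(boolean[] rankedLabels) {
        int totalPos = countPositives(rankedLabels);
        int totalNeg = rankedLabels.length - totalPos;
        double[][] points = new double[rankedLabels.length + 1][2];
        int tp = 0;
        int fp = 0;
        for (int i = 0; i < rankedLabels.length; i++) {
            // Lowering the threshold past example i predicts it positive.
            if (rankedLabels[i]) {
                tp++;
            } else {
                fp++;
            }
            points[i + 1][0] = (double) fp / totalNeg;
            points[i + 1][1] = (double) tp / totalPos;
        }
        return points;
    }

    /**
     * PR points as {recall, precision}, one per threshold that predicts
     * at least one example positive (precision is undefined otherwise).
     */
    static double[][] prPoints(boolean[] rankedLabels) {
        int totalPos = countPositives(rankedLabels);
        double[][] points = new double[rankedLabels.length][2];
        int tp = 0;
        for (int i = 0; i < rankedLabels.length; i++) {
            if (rankedLabels[i]) {
                tp++;
            }
            points[i][0] = (double) tp / totalPos; // recall
            points[i][1] = (double) tp / (i + 1);  // precision over i + 1 predictions
        }
        return points;
    }

    public static void main(String[] args) {
        boolean[] ranked = {true, false, true, false};
        for (double[] p : rocPoints(ranked)) {
            System.out.println("ROC fpr=" + p[0] + " tpr=" + p[1]);
        }
        for (double[] p : prPoints(ranked)) {
            System.out.println("PR recall=" + p[0] + " precision=" + p[1]);
        }
    }
}
```

Handling tied scores, interpolation between PR points, and the other features listed in this README are deliberately left out of the sketch.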

Roc, the name of the software, is pronounced "rock" like its namesake, roc, an enormous, legendary bird of prey.

Quick Links

Downloads

License

Roc is free, open source software. It is released under the BSD 2-Clause License (also known as the FreeBSD License). See the file LICENSE.txt in your distribution (or on GitHub) for details.

Features and Project Maturity

Roc is version 0.1.0.

This software is still in the alpha stages of design and development. However, it is mature enough for the authors to use it as part of their everyday workflow.

It is released as a library and as a command-line interface (CLI) front-end for the library. The table below contains a summary of features.

Status codes: S=stable, T=tested, I=implemented, P=planned, NP=not planned, ?=undecided, NA=not applicable. Language codes: J=Java, P2=Python 2.x.

Feature Description           Library Status  CLI Status
-------------------           --------------  ----------
ROC curves
. Points                      J:T  P2:P       J:T  P2:P
. Area                        J:T  P2:P       J:T  P2:P
. Maximum area (convex hull)  J:T  P2:P       J:P  P2:P
. Aggregation (averaging)     J:P  P2:P       J:?  P2:?
. Confidence bounds           J:P  P2:P       J:?  P2:?
. Clipping                    J:P  P2:P       J:?  P2:?
PR curves
. Points                      J:T  P2:P       J:T  P2:P
. Area                        J:T  P2:P       J:T  P2:P
. Maximum area (convex hull)  J:I  P2:P       J:P  P2:P
. Aggregation (averaging)     J:P  P2:P       J:?  P2:?
. Confidence bounds           J:P  P2:P       J:?  P2:?
. Clipping                    J:P  P2:P       J:?  P2:?
. Minimum awareness           J:P  P2:P       J:?  P2:?
Plotting                      J:NP P2:P       J:NP P2:P
Inputs
. Ranking of labels           J:T  P2:P       J:T  P2:P
. Scores, labels              J:T  P2:P       J:T  P2:P
. Score-label pairs           J:P  P2:P       J:T  P2:P
. Example weights             J:P  P2:P       J:P  P2:P
Convenience
. File I/O                    J:P  P2:P       NA
Ranking Statistics
. Mann-Whitney-U              J:T  P2:P       J:?  P2:?

This software is designed and tested to support 1 million total examples. It probably works on many more, but performance and accuracy have not been tested at larger scales.
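
The "Area" and "Mann-Whitney-U" rows in the table above are closely related: the area under the ROC curve equals the Mann-Whitney U statistic of the positive class divided by the number of positive-negative pairs. The sketch below illustrates that relationship from scores and labels. It is not the library's implementation; the names (AucSketch, mannWhitneyU, rocArea) are hypothetical, and the pairwise loop is quadratic, so it is meant for small inputs only. A rank-based computation would be needed at the scale mentioned above.

```java
/** Illustrative sketch only; these names are hypothetical, not the Roc API. */
class AucSketch {

    /**
     * Mann-Whitney U statistic for the positive class: the number of
     * (positive, negative) pairs in which the positive example has the
     * higher score, with tied pairs counted as one half.
     */
    static double mannWhitneyU(double[] scores, boolean[] labels) {
        if (scores.length != labels.length) {
            throw new IllegalArgumentException("scores and labels differ in length");
        }
        double u = 0.0;
        for (int i = 0; i < scores.length; i++) {
            if (!labels[i]) {
                continue; // outer loop visits positives only
            }
            for (int j = 0; j < scores.length; j++) {
                if (labels[j]) {
                    continue; // inner loop visits negatives only
                }
                if (scores[i] > scores[j]) {
                    u += 1.0;
                } else if (scores[i] == scores[j]) {
                    u += 0.5;
                }
            }
        }
        return u;
    }

    /** Area under the ROC curve, computed as U / (positives * negatives). */
    static double rocArea(double[] scores, boolean[] labels) {
        int positives = 0;
        for (boolean label : labels) {
            if (label) {
                positives++;
            }
        }
        int negatives = labels.length - positives;
        if (positives == 0 || negatives == 0) {
            throw new IllegalArgumentException(
                "need at least one positive and one negative label");
        }
        // Cast before multiplying so large counts cannot overflow int.
        return mannWhitneyU(scores, labels) / ((double) positives * negatives);
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.7, 0.6};
        boolean[] labels = {true, false, true, false};
        System.out.println("U = " + mannWhitneyU(scores, labels));
        System.out.println("ROC area = " + rocArea(scores, labels));
    }
}
```

For the four examples in main, three of the four positive-negative pairs are ordered correctly, so U is 3 and the area is 0.75.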

Requirements

Development Requirements

If you want to develop this software, there are some additional requirements.

Note that certain JUnit versions contain some Hamcrest classes and so may conflict with (override) those from Hamcrest. If you encounter missing Hamcrest symbols, try placing Hamcrest ahead of JUnit on the class path or updating JUnit.

Java Library, JAR, CLI

The Java library provides an API for working with ROC and PR curves in your Java programs. It is distributed as a Java archive (JAR) containing source code, bytecode, and documentation. The JAR can be obtained on the releases page. To include the library in your Java project, just place the JAR in a convenient location and include it in your classpath. You can browse the documentation by extracting it from the JAR or by viewing the latest version on GitHub.

The JAR also contains the command-line interface, which can be run like this:

java -jar roc-0.1.0.jar --help

Contact

Please search the existing documentation before contacting us: this README, the Javadoc, the wiki, and the existing issues. If that does not answer your question, open an issue to report a bug or ask a question.

Copyright (c) 2014 Roc Project. This is free software. See LICENSE.txt for details.