kboyd / Roc

Everything ROC and Precision-Recall curves.
BSD 2-Clause "Simplified" License

Points for plotting versus all points #23

Open afbarnard opened 10 years ago

afbarnard commented 10 years ago

I think there should be two methods for each type of curve, one for computing all the points and one for computing the points useful for plotting. In both ROC and PR space, many points are collinear, so the difference would be that "plotting" points would only include those that are at a corner (that is, intended for plotting with line segments). The idea is then that "plotting" points would be the default but that there would still be a way to get all the points. I propose that rocPoints be renamed to allRocPoints and the "plotting" version be named rocPoints with analogous changes to the PR methods.
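As a rough illustration of the "plotting" points idea, here is a minimal sketch in Python (not the project's own code). The function name `plotting_points` and the tolerance parameter `tol` are hypothetical. It assumes the input points are already ordered along the curve, and it keeps only the endpoints and the points where the direction changes.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def plotting_points(points: List[Point], tol: float = 1e-12) -> List[Point]:
    """Keep the endpoints and the corners; drop interior collinear points.

    Hypothetical helper: assumes `points` is ordered along the curve.
    """
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        (ax, ay), (bx, by), (cx, cy) = kept[-1], points[i], points[i + 1]
        # Cross product of (b - a) and (c - a); zero means a, b, c are collinear.
        cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        if abs(cross) > tol:
            kept.append(points[i])
    kept.append(points[-1])
    return kept
```

For example, a perfect-ranking ROC curve with five points reduces to its three corners: `plotting_points([(0, 0), (0, 0.5), (0, 1), (0.5, 1), (1, 1)])` gives `[(0, 0), (0, 1), (1, 1)]`.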

kboyd commented 10 years ago

That sounds good for rocPoints. Everything is a bit fuzzier for prPoints because of the interpolation. For now, continuing to use the lower trapezoid estimator for PR is probably fine, so we can do the same as for ROC and probably reuse the collinear detection.

afbarnard commented 10 years ago

Yes, the points depend on the estimator for both PR and ROC. Maybe we can think a little bit about what it would be like to have points based on various estimators, but for now I see no harm in going ahead with the current estimators (rectangle/trapezoid/MWU for ROC and lower trapezoid for PR). Indeed, with the design of having a settable member parameter to control the estimator for each of ROC and PR, rocPoints and prPoints function the same with any estimator ("give me a good set of points for plotting with line segments") and all*Points would always give the empirical estimator, that is, points at all classification thresholds.
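To make "points at all classification thresholds" concrete, here is a minimal Python sketch of the empirical ROC points (again not the project's code). The name `all_roc_points` and the `(scores, labels)` input format are assumptions made for illustration. Examples with tied scores are treated as sharing one threshold, so they contribute a single point.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def all_roc_points(
    scores: Sequence[float], labels: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) at every distinct score threshold, from (0, 0) to (1, 1).

    Hypothetical helper: higher scores are assumed to mean "more positive".
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    # Sort by descending score so that lowering the threshold walks the list.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, group in groupby(ranked, key=lambda pair: pair[0]):
        # All examples with the same score cross the threshold together.
        for _, y in group:
            if y:
                tp += 1
            else:
                fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points
```

With four untied examples this yields five points, several of which are collinear, which is exactly the redundancy a "plotting" variant would remove.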