Open lukauskas opened 6 years ago
It is not easy to use orig_points to retrieve the corresponding scores, but you can use mode = "basic" for that purpose. For instance, the following snippet shows how to get the original scores when precision is greater than or equal to 0.75.
library("precrec")
# Dataset with 10 positives and 10 negatives
data(P10N10)
# Calculate basic evaluation measures
sspoints <- evalmod(mode = "basic", scores = P10N10$scores, labels = P10N10$labels)
# Convert sspoints to data.frame
df <- data.frame(sspoints)
# Get normalized threshold values for precision >= 0.75
xs <- df[df$type == "precision" & df$y >= 0.75, "x"]
# Show scores and precision values corresponding to xs
df[df$x %in% xs & df$type %in% c("score", "precision"), ]
In the data frame of the example above, the x column contains the normalized threshold values with range [0, 1], and the y column contains the values specified in the type column.
Unlike ROC curves, precision-recall curves are not monotonic, so the points with precision >= 0.75 may not form one contiguous range of thresholds. In some cases you may need to add one more condition, such as 'recall is greater than 0.5'.
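A minimal sketch of such a combined condition, reusing the same P10N10 example. It assumes that recall appears in the type column under the name "sensitivity"; that name and the helper variables prec, rec, both and keep are assumptions made for illustration, so check unique(df$type) before relying on it.

```r
library("precrec")

# Dataset with 10 positives and 10 negatives
data(P10N10)

# Basic evaluation measures, converted to a data frame as before
sspoints <- evalmod(mode = "basic", scores = P10N10$scores, labels = P10N10$labels)
df <- data.frame(sspoints)

# Assumption: recall is stored under type == "sensitivity"
prec <- df[df$type == "precision", c("x", "y")]
rec <- df[df$type == "sensitivity", c("x", "y")]

# Put precision and recall side by side, matched on the normalized threshold x
both <- merge(prec, rec, by = "x", suffixes = c("_precision", "_recall"))

# which() drops rows where either measure is NA
keep <- both[which(both$y_precision >= 0.75 & both$y_recall > 0.5), ]

# Scores, precision and recall at the thresholds that satisfy both conditions
df[df$x %in% keep$x & df$type %in% c("score", "precision", "sensitivity"), ]
```

Matching on x works here because both measures come from the same set of normalized thresholds for a single model and dataset. With several models or datasets, the modname and dsid columns would likely need to be part of the merge key as well.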
Currently the PRC curve returns essentially a DataFrame with three columns: x, y, and a boolean column orig_points.
Is it possible to somehow map the non-interpolated points (orig_points = 1) to the actual score thresholds for the resulting precision/recall measurements? Somewhat along the lines of how sklearn handles it. This is sometimes needed to ask questions like 'what is the minimum threshold at which precision is >= 75%?' or similar.
I assume a sorted increasing list of unique scores should map 1:1 to the orig_points, but this seems a bit hacky. Maybe there is a way to get it out of precrec directly?
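For reference, the kind of threshold-to-precision mapping being asked about can be sketched in base R, without precrec. This is only an illustration under stated assumptions: the scores and labels are made-up toy data, an observation is predicted positive when its score is at or above the threshold, and the names thresholds, stats and min_threshold are hypothetical. It is not how precrec computes its curves, and precrec's handling of ties may differ.

```r
# Hypothetical toy data (not from precrec): 4 positives and 4 negatives
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3)
labels <- c(1, 1, 0, 1, 1, 0, 0, 0)

# Candidate thresholds: the unique scores, sorted from high to low
thresholds <- sort(unique(scores), decreasing = TRUE)

# Precision and recall at each threshold,
# assuming score >= threshold means "predicted positive"
stats <- do.call(rbind, lapply(thresholds, function(th) {
  pred <- scores >= th
  tp <- sum(pred & labels == 1)
  fp <- sum(pred & labels == 0)
  data.frame(threshold = th,
             precision = tp / (tp + fp),
             recall = tp / sum(labels == 1))
}))

# Minimum threshold at which precision is >= 0.75
min_threshold <- min(stats$threshold[stats$precision >= 0.75])
print(stats)
print(min_threshold)
```

In this toy data precision drops below 0.75 at a threshold of 0.7 and rises back above it at 0.6 and 0.55, so the qualifying thresholds are not contiguous.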