I have created a patch dataset and used it to train the HardNet model from scratch. Strangely, the FPR95 stays very high: it only drops from an initial value of 0.82 to 0.68. At the same time, the AP and AUC metrics are also high, reaching up to 0.84. I don't know how to interpret these metric values. Additionally, I used the sklearn package to calculate the AP and AUC. The code is as follows:
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

def ErrorRateAt95Recall(labels, scores):
    # Treat higher scores as more similar; convert to pseudo-distances.
    distances = 1.0 / (scores + 1e-8)
    recall_point = 0.95
    # Sort labels by ascending distance (i.e., descending score).
    labels = labels[np.argsort(distances)]
    # Sliding threshold: get first index where recall >= recall_point.
    # This is the index where the number of elements with label==1 below the
    # threshold reaches a fraction of 'recall_point' of the total number of
    # elements with label==1.
    # (np.argmax returns the first occurrence of True in a bool array.)
    threshold_index = np.argmax(np.cumsum(labels) >= recall_point * np.sum(labels))
    FP = np.sum(labels[:threshold_index] == 0)  # Below threshold (i.e., labelled positive), but should be negative
    TN = np.sum(labels[threshold_index:] == 0)  # Above threshold (i.e., labelled negative), and should be negative
    # FP + TN is the total number of negative samples.
    return float(FP) / float(FP + TN)

def Average_Precision_Score(labels, scores):
    return average_precision_score(labels, scores)

def Roc_Auc_Score(labels, scores):
    return roc_auc_score(labels, scores)
Can you help me interpret these results? @ducha-aiki @DagnyT