How to measure accuracy in %age?

ma-siddiqui commented 1 year ago

Based on the results after matching two faces, how can I calculate the matching accuracy out of 100?

Thanks,

leonid-leshukov commented 1 year ago

Hi @ma-siddiqui,

Thank you for your question.

The library provides a function that calculates the distance between two face templates. The distance is a float value greater than or equal to 0. The library compares this distance against a threshold to decide whether two faces belong to the same person (the default threshold is 1.306). Let's say that a distance equal to the threshold should correspond to 85%. Based on this, we can interpolate the distance to get a percentage (see example below).

...
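#include <algorithm>  // std::max

// Lagrange interpolation through the points (x[i], y[i]):
// distance 0.0 maps to 100%, the default threshold 1.306 maps to 85%.
template<typename T>
T Lagrange(T X)
{
    static const int n = 2;
    static const T y[n] = {100, 85};
    static const T x[n] = {0.0, 1.306};
    T L = 0;
    for (int i = 0; i < n; ++i)
    {
        T l = 1;  // i-th Lagrange basis polynomial evaluated at X
        for (int j = 0; j < n; ++j)
            if (i != j)
                l *= (X - x[j]) / (x[i] - x[j]);
        L += y[i] * l;
    }

    // Clamp at 0 so that large distances do not give a negative percentage.
    return std::max(L, T(0));
}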

int main (int argc, char *argv[])
{
    ...
    double distance = matcherData["verification"]["result"]["distance"].getDouble();
    double result_percent = Lagrange(distance);
    ...
}

Let us know if you would like more details about this.
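With only two interpolation points, the Lagrange polynomial reduces to a straight line through (0, 100) and (1.306, 85), that is, percent = 100 - (15 / 1.306) * distance. A minimal sketch of that closed form follows. The function name `distance_to_percent` and its parameters are hypothetical and not part of the SDK, and the upper clamp at 100 is an addition of this sketch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of the SDK): maps a distance to a percentage
// using the straight line through (0, 100) and (threshold, threshold_percent).
double distance_to_percent(double distance,
                           double threshold = 1.306,
                           double threshold_percent = 85.0)
{
    assert(threshold > 0.0);
    // Slope is negative: the percentage drops as the distance grows.
    const double slope = (threshold_percent - 100.0) / threshold;
    const double percent = 100.0 + slope * distance;
    // Keep the result inside [0, 100] (std::clamp requires C++17).
    return std::clamp(percent, 0.0, 100.0);
}
```

Under these assumptions a distance of 0 gives 100%, a distance of 1.306 gives 85%, and a distance of 2.612 gives 70%. The line reaches 0% at a distance of about 8.71, after which the result stays clamped at 0.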

ma-siddiqui commented 1 year ago

Thank you for answering my question. May I know, is this a standard way of calculating similarity matching accuracy in %age? Or is this method only valid for the 3DiVi Face SDK?

leonid-leshukov commented 1 year ago

@ma-siddiqui There is no standard method for calculating a face similarity percentage, because the comparison is based on feature vectors and the distance between two vectors. The distance is smaller between similar faces and larger between different faces. In our case, we define the similarity of identical vectors as 100%, and the similarity of vectors from different photos of the same face, at the threshold distance, as 85%. To estimate intermediate points we use interpolation.

Additional comment from our AI team:

Each trained neural network (even when trained on the same dataset) has a different distance distribution for impostor (negative) and genuine (positive) pairs (see example in the figure). It follows that we need to compute the ROC curve for each NN and choose the threshold relative to it. This is also the reason why we cannot simply normalize the distance to the range 0...1 (as we know, the squared Euclidean distance between unit-length vectors lies in the range [0, 4]).
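As a rough illustration of taking a threshold from measured distance distributions, the sketch below picks an operating point from a set of impostor distances at a chosen false accept rate and then measures the false reject rate on genuine distances. This is an assumption-laden sketch, not the procedure used to obtain the SDK's default threshold: the function names are hypothetical, and a pair is assumed to be accepted when its distance is strictly below the threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: largest threshold whose false accept rate (FAR) on the
// given impostor distances does not exceed max_far.
// Assumed decision rule: accept a pair when distance < threshold.
double threshold_at_far(std::vector<double> impostor_distances, double max_far)
{
    assert(!impostor_distances.empty());
    assert(max_far >= 0.0 && max_far <= 1.0);
    std::sort(impostor_distances.begin(), impostor_distances.end());
    // Number of impostor pairs that may be accepted (rounded down).
    const std::size_t allowed =
        static_cast<std::size_t>(max_far * impostor_distances.size());
    if (allowed >= impostor_distances.size())
        return std::numeric_limits<double>::infinity();  // every pair may pass
    // Only distances strictly below this value are accepted, which is at most
    // `allowed` impostor pairs.
    return impostor_distances[allowed];
}

// Hypothetical helper: fraction of genuine pairs rejected at this threshold.
double false_reject_rate(const std::vector<double>& genuine_distances,
                         double threshold)
{
    assert(!genuine_distances.empty());
    const auto rejected = std::count_if(
        genuine_distances.begin(), genuine_distances.end(),
        [threshold](double d) { return d >= threshold; });
    return static_cast<double>(rejected) / genuine_distances.size();
}
```

Sweeping `max_far` over a range of values and recording the resulting false reject rate traces out the trade-off that a ROC curve shows. Because the distance distributions differ from one network to another, the threshold obtained this way is specific to the network that produced the distances.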

Feel free to ask for additional details.

ma-siddiqui commented 1 year ago

Thank you. It's a great explanation.

3DiVi / open-source-face-sdk