serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
https://www.youtube.com/watch?v=WnUVYQP4h44&list=PLsS_1RYmYQQFdWqxQggXHynP1rqaYXv_E&index=1
MIT License

Add Pearson and Jaccard distances for distance metrics #1065

Closed serengil closed 7 months ago

serengil commented 7 months ago
import numpy as np


def pearson_distance(vector1, vector2):
    # Ensure vectors have the same length
    assert len(vector1) == len(vector2), "Vectors must have the same length."

    # Work on float arrays so the arithmetic below is vectorized
    vector1 = np.asarray(vector1, dtype=float)
    vector2 = np.asarray(vector2, dtype=float)

    # Center each vector on its mean
    centered1 = vector1 - np.mean(vector1)
    centered2 = vector2 - np.mean(vector2)

    # Calculate numerator and denominator for Pearson correlation
    numerator = np.sum(centered1 * centered2)
    denominator = np.sqrt(np.sum(centered1**2) * np.sum(centered2**2))

    # The correlation is undefined when either vector is constant (zero variance)
    if denominator == 0:
        raise ValueError("Pearson correlation is undefined for a constant vector.")

    # Calculate Pearson correlation coefficient
    correlation_coefficient = numerator / denominator

    # Pearson distance is 1 minus the coefficient, so it lies between 0 and 2
    return 1 - correlation_coefficient

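def jaccard_distance(vector1, vector2):
    # Ensure vectors have the same length
    assert len(vector1) == len(vector2), "Vectors must have the same length."

    # Convert vectors to sets of their distinct values
    # (order and duplicates are discarded; floats match only when exactly equal)
    set1 = set(vector1)
    set2 = set(vector2)

    # Two empty vectors have an empty union; treat them as identical
    union = set1.union(set2)
    if not union:
        return 0.0

    # Jaccard distance is 1 minus the ratio of shared to total distinct values
    return 1 - len(set1.intersection(set2)) / len(union)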
serengil commented 7 months ago

Pearson and Jaccard distances are typically not used directly in facial recognition systems, which rely on techniques from computer vision and machine learning to analyze and compare facial features. Some background on the two measures:

Pearson Correlation Coefficient:

The Pearson correlation coefficient measures the strength of the linear relationship between two variables. It ranges from -1 to 1, where -1 indicates a perfect negative correlation, 1 indicates a perfect positive correlation, and 0 indicates no linear correlation. The Pearson distance, defined as 1 minus the coefficient, therefore ranges from 0 to 2. While Pearson correlation can be useful in certain similarity measures, it may not be the best choice for facial recognition because it focuses on linear relationships and may not capture the complex non-linear relationships present in facial features.
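As a rough illustration of the range described above, the following sketch computes a Pearson distance with NumPy's `np.corrcoef` and checks it on three toy vectors. The helper name and the sample values are made up for this example and are not part of deepface.

```python
import numpy as np


def pearson_distance(a, b):
    """Return 1 - r, where r is the Pearson correlation of a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; entry [0, 1] is r(a, b)
    r = np.corrcoef(a, b)[0, 1]
    return 1.0 - r


# Toy data (hypothetical values, chosen only to hit the three reference cases)
x = np.array([1.0, 2.0, 3.0, 4.0])

# A scaled and shifted copy is perfectly correlated: distance close to 0
d_same = pearson_distance(x, 2 * x + 5)

# A negated copy is perfectly anti-correlated: distance close to 2
d_opposite = pearson_distance(x, -x)

# This vector has zero covariance with x: distance close to 1
d_uncorrelated = pearson_distance(x, np.array([1.0, -1.0, -1.0, 1.0]))
```

Note that the distance ignores scale and offset: `x` and `2 * x + 5` are treated as identical, which may or may not be desirable when comparing feature vectors.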

Jaccard Distance:

The Jaccard distance is commonly used for comparing two sets. It is defined as 1 minus the Jaccard index, where the index is the size of the intersection divided by the size of the union of the sets. While Jaccard distance has applications in measuring set similarity, it is not typically used for comparing facial features in the context of facial recognition: face representations are real-valued feature vectors, and converting them to sets discards element order and matches only exactly equal values.

In facial recognition, more common approaches use deep learning techniques, such as Convolutional Neural Networks (CNNs), to extract facial features. These models are trained on large datasets to learn representations that are effective for facial recognition tasks. The feature vectors obtained from these models are then usually compared with measures such as Euclidean distance or cosine similarity.
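To make the contrast concrete, here is a minimal sketch that applies Jaccard distance to genuine sets and then to two nearly identical real-valued vectors, next to Euclidean and cosine distance. The function names and the four-element "embeddings" are hypothetical; real face embeddings come from a trained model and have far more dimensions.

```python
import numpy as np


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| over the distinct values of a and b."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty inputs are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def euclidean_distance(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))


def cosine_distance(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - float(similarity)


# Jaccard on actual sets: 2 shared items out of 4 distinct items
tags_a = {"cat", "dog", "bird"}
tags_b = {"dog", "bird", "fish"}
d_sets = jaccard_distance(tags_a, tags_b)  # 0.5

# Made-up "embeddings" that differ by 0.01 in every component
emb_a = [0.10, 0.20, 0.30, 0.40]
emb_b = [0.11, 0.21, 0.31, 0.41]

d_jaccard = jaccard_distance(emb_a, emb_b)   # 1.0: no value matches exactly
d_euclid = euclidean_distance(emb_a, emb_b)  # small, about 0.02
d_cosine = cosine_distance(emb_a, emb_b)     # very close to 0
```

In this toy case Jaccard reports the two vectors as maximally different, while Euclidean and cosine distance both report them as close, which matches the reasoning above about why set-based measures fit real-valued feature vectors poorly.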

In summary, while Pearson and Jaccard distances have their applications in other domains, they are not commonly used as direct metrics for facial recognition. Facial recognition systems typically rely on more sophisticated methods, particularly those rooted in computer vision and deep learning.