paidiver / paidiverpy

Overlap removal based on feature matching #18

Open soutobias opened 1 month ago

soutobias commented 1 month ago

What:

Overlap removal based on feature matching involves detecting and analyzing key features in images to identify and eliminate duplicate or overlapping images. This technique uses feature detection algorithms to find and compare key points across images, determining which images capture similar or overlapping scenes.

Why:

Removing duplicate or overlapping images is crucial for reducing redundancy and ensuring efficient use of resources. This process helps to avoid biases in ecological analyses and ensures a more accurate representation of the captured data. In datasets with a large number of images, manual detection of overlaps can be impractical and time-consuming. Automated feature matching allows for efficient and accurate identification of such overlaps.

How:

Feature matching is achieved through several steps:

  1. Feature Detection: Detect key points and features in images using algorithms like SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), or ORB (Oriented FAST and Rotated BRIEF). These algorithms identify distinctive points in each image that can be used for comparison.

  2. Feature Matching: Compare the detected features between image pairs. If a significant number of key points from one image match with another, it indicates that the images are capturing overlapping or similar scenes.

  3. Validation: Further validate these matches using methods such as RANSAC (Random Sample Consensus) to ensure that the matched features are consistent and reliable, thus confirming significant overlap (a sketch is shown in code example 3 below).

Python Code Examples:

  1. Feature Detection and Matching Using SIFT:

    import cv2
    import numpy as np
    
    def detect_and_match_features(image1_path, image2_path):
        # Load images
        img1 = cv2.imread(image1_path, cv2.IMREAD_GRAYSCALE)
        img2 = cv2.imread(image2_path, cv2.IMREAD_GRAYSCALE)
    
        # Initialize SIFT detector
        sift = cv2.SIFT_create()
    
        # Detect keypoints and descriptors
        kp1, des1 = sift.detectAndCompute(img1, None)
        kp2, des2 = sift.detectAndCompute(img2, None)
    
        # Create BFMatcher object
        bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    
        # Match descriptors
        matches = bf.match(des1, des2)
    
        # Sort matches by distance
        matches = sorted(matches, key=lambda x: x.distance)
    
        # Draw matches
        img_matches = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
    
        cv2.imshow('Feature Matches', img_matches)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    
    # Example usage
    detect_and_match_features('image1.jpg', 'image2.jpg')

  2. Feature Detection and Matching Using ORB:

    import cv2
    import numpy as np
    
    def detect_and_match_features_orb(image1_path, image2_path):
        # Load images
        img1 = cv2.imread(image1_path, cv2.IMREAD_GRAYSCALE)
        img2 = cv2.imread(image2_path, cv2.IMREAD_GRAYSCALE)
    
        # Initialize ORB detector
        orb = cv2.ORB_create()
    
        # Detect keypoints and descriptors
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
    
        # Create BFMatcher object
        bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    
        # Match descriptors
        matches = bf.match(des1, des2)
    
        # Sort matches by distance
        matches = sorted(matches, key=lambda x: x.distance)
    
        # Draw matches
        img_matches = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
    
        cv2.imshow('Feature Matches', img_matches)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    
    # Example usage
    detect_and_match_features_orb('image1.jpg', 'image2.jpg')
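
  3. Validating Matched Features Using RANSAC:

A minimal sketch of the validation step (step 3 above), not an existing paidiverpy function: it assumes SIFT descriptors, a Lowe ratio test of 0.75, and an illustrative inlier threshold (min_inliers=30) that should be tuned to the dataset.

    import cv2
    import numpy as np
    
    def validate_overlap_ransac(image1_path, image2_path, min_inliers=30):
        # Load images in grayscale
        img1 = cv2.imread(image1_path, cv2.IMREAD_GRAYSCALE)
        img2 = cv2.imread(image2_path, cv2.IMREAD_GRAYSCALE)
    
        # Detect keypoints and descriptors with SIFT
        sift = cv2.SIFT_create()
        kp1, des1 = sift.detectAndCompute(img1, None)
        kp2, des2 = sift.detectAndCompute(img2, None)
    
        # Match with k-nearest neighbours and keep matches that pass Lowe's ratio test
        bf = cv2.BFMatcher(cv2.NORM_L2)
        knn_matches = bf.knnMatch(des1, des2, k=2)
        good = [pair[0] for pair in knn_matches
                if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]
        if len(good) < 4:
            return False  # Too few matches to estimate a homography
    
        # Estimate a homography with RANSAC and count the surviving inliers
        src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        _, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
        inliers = int(mask.sum()) if mask is not None else 0
    
        # Declare significant overlap only when enough matches are geometrically consistent
        return inliers >= min_inliers
    
    # Example usage
    print(validate_overlap_ransac('image1.jpg', 'image2.jpg'))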

Choosing an appropriate pairing method can speed up image matching and reduce the likelihood of false positives. Below, we detail several pairing methods based on the capabilities of the COLMAP software (v3.11).
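
As an illustration of why the pairing strategy matters, the sketch below generates candidate pairs either exhaustively or only within a sliding window over a time-ordered sequence, loosely analogous to COLMAP's exhaustive and sequential matching modes; the window size and file names are arbitrary example values.

    from itertools import combinations
    
    def exhaustive_pairs(image_names):
        # Compare every image with every other image: O(n^2) pairs
        return list(combinations(image_names, 2))
    
    def sequential_pairs(image_names, window=5):
        # Only compare images captured close together in the sequence,
        # which is much cheaper for ordered survey transects
        pairs = []
        for i in range(len(image_names)):
            for j in range(i + 1, min(i + 1 + window, len(image_names))):
                pairs.append((image_names[i], image_names[j]))
        return pairs
    
    # Example usage
    names = [f'image{i}.jpg' for i in range(1, 7)]
    print(len(exhaustive_pairs(names)))            # 15 pairs
    print(len(sequential_pairs(names, window=2)))  # 9 pairs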

What to expect:

The output will be a visual representation of matched features between image pairs, allowing for easy identification of overlapping scenes. This process generates a list of image pairs with significant feature overlap, which can then be reviewed to decide on the removal or consolidation of duplicate images.
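
A short sketch of how such a review list could be assembled, combining the illustrative helpers above (sequential_pairs and validate_overlap_ransac, which are example sketches rather than existing paidiverpy functions):

    def find_overlapping_pairs(image_names, window=5):
        # Flag pairs whose feature matches survive RANSAC validation
        overlapping = []
        for img_a, img_b in sequential_pairs(image_names, window=window):
            if validate_overlap_ransac(img_a, img_b):
                overlapping.append((img_a, img_b))
        return overlapping
    
    # Example usage: flagged pairs would be reviewed before removal or consolidation
    print(find_overlapping_pairs(['image1.jpg', 'image2.jpg', 'image3.jpg']))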

What makes it difficult:

Success Metrics:

LoicVA commented 1 month ago

Ok, that seems great. The only thing that will need some text is the way images are matched together; I have added a paragraph on that.