opencv / opencv_contrib

Repository for OpenCV's extra modules
Apache License 2.0

Linemod may miss valid template matches #1522

Open Chungzuwalla opened 6 years ago

Chungzuwalla commented 6 years ago
System information (version)

There is a bug in the linemod implementation which causes linemod to miss some valid template matches. The bug may even cause a newly created template to not match the source images it was created from.

The culprit is the code block at line 1295 in similarityLocal():

    // Discard feature if out of bounds, possibly due to applying the offset
    if (f.x < 0 || f.y < 0 || f.x >= size.width || f.y >= size.height)
      continue;

similarityLocal() takes a template match found at a coarse resolution, and verifies and localizes the match using a finer resolution. It does this by scanning for matches within a 16x16 subwindow centred on the coarse match position. The initial position for scanning is thus 8 blocks above and 8 blocks to the left of the coarse match position. If the coarse match position is within 8 blocks (8T pixels) of the top or left edge of the source images, the initial position for a feature may be outside the source images. In this case, as the code above shows, the feature is discarded and not scanned across the subwindow at all. That is a problem, because during scanning the feature would move into the source images and should make a contribution to the similarity image within the overlap region. The values in the similarity image then fail to reach the threshold for detection, so template matches are missed.
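
To illustrate why a few discarded features are enough to lose a match, here is a minimal sketch. It assumes that each feature contributes at most 4 to the raw score at any position and that the total is normalized by 4 times the template's feature count to give a percentage. `maxSimilarity` is a hypothetical helper for illustration, not an OpenCV function.

```cpp
// Hypothetical helper (not part of OpenCV): upper bound on the similarity
// percentage when `discarded` of a template's `num_features` features are
// skipped by the bounds check.
// Assumption: each feature adds at most 4 to the raw score, and the raw
// score is normalized by 4 * num_features.
double maxSimilarity(int num_features, int discarded)
{
    const int max_per_feature = 4;
    const int best_raw = max_per_feature * (num_features - discarded);
    return 100.0 * best_raw / (max_per_feature * num_features);
}
```

Under these assumptions, a 63-feature template that loses 7 features to the bounds check can score at most 100 * 56 / 63, roughly 88.9%, so it could never reach a 90% detection threshold even on its own source image.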

To fix this, features that only partially overlap the source images must still be evaluated within the overlap region. This means adding just a subrectangle of the 16x16 subwindow onto the similarity image. Linemod gets its speed from using SSE to process 16 blocks (a whole row of the 16x16 subwindow) at a time, so any fix risks slowing it down and must be considered carefully. Possible solutions include:
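
The overlap region can be worked out one axis at a time. The following is a minimal sketch of the index arithmetic only, assuming the scan visits `steps` positions spaced `T` pixels apart, starting from the feature's initial coordinate. `StepRange`, `ceilDiv`, and `validSteps` are hypothetical names, and the sketch does not address how to keep the SSE accumulation fast.

```cpp
#include <algorithm>

// Half-open range [begin, end) of scan steps for which a feature lies
// inside the image along one axis. Hypothetical type for illustration.
struct StepRange
{
    int begin;
    int end;
};

// Ceiling of a / b for b > 0, valid for negative a as well.
static int ceilDiv(int a, int b)
{
    return (a >= 0) ? (a + b - 1) / b : -((-a) / b);
}

// start:  initial coordinate of the feature (may be negative), in pixels
// T:      spacing between scan positions, in pixels
// extent: image width or height, in pixels
// steps:  number of scan positions along this axis (16 for a 16x16 subwindow)
StepRange validSteps(int start, int T, int extent, int steps = 16)
{
    // First step i with start + i * T >= 0.
    int begin = std::min(steps, std::max(0, ceilDiv(-start, T)));
    // First step i with start + i * T >= extent.
    int end = std::max(begin, std::min(steps, ceilDiv(extent - start, T)));
    return { begin, end };
}
```

For instance, with T = 5 and a 320-pixel-wide image, a feature whose initial x is -12 would contribute to columns 3 through 15 of the subwindow rather than being dropped entirely.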

On my machine (Xeon W3670 CPU) I can't measure a timing difference between aligned and unaligned 16-byte SSE instructions. But as the saying goes, we aren't shipping my machine. This probably needs attention from someone with good SSE experience.

Steps to reproduce

The following repro code matches each template against the depth map it was created from. In my case, only 355 of the 450 templates created match their own source 100%.

This code also does some timing tests, since any bugfix will inevitably have to be performance tested against the current implementation.
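
The timing loop in the repro code uses `clock()`, which on some platforms reports processor time rather than elapsed time. As a hedged alternative for anyone comparing a fix against the current implementation, here is a minimal sketch of a wall-clock timer built on `std::chrono::steady_clock`. `secondsTaken` is a hypothetical helper, not something the repro code or OpenCV provides.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename F>
double secondsTaken(F &&work)
{
    const auto t0 = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

It could wrap the call being measured, for example `secondsTaken([&] { detector.match(sources, 90.0f, matches); })`, leaving the drawing code outside the timed region as the repro code does.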

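#include <stdio.h>
#include <time.h>
#include <cmath>    // sqrt, cos, sin, tan
#include <cstdint>  // uint16_t
#include <utility>  // std::swap
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/core/affine.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/rgbd/linemod.hpp>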

#define BACKGROUND  500     // Depth map background value
#define DISTANCE    225     // Distance to model

// Quick & dirty way to create a very poor & non-uniform sampling over part of SO(3).
std::vector<cv::Affine3f> fibonacci(int n_axes, int n_angles)
{
    const float PI = 3.14159265359f;
    const float full_circle = 2.0 * PI;
    const float increment = PI * (3.0f - sqrt(5.0f));
    const float offset = 2.0f / n_axes;
    const int half_n_axes = n_axes / 2;
    cv::Vec3f translate(0.0, 0.0, DISTANCE);
    std::vector<cv::Affine3f> poses;
    for (int i = 0; i < half_n_axes; i++)
    {
        cv::Affine3f pose;
        cv::Vec3f axis;
        axis(1) = ((i * offset) - 1) + (offset / 2);
        float phi = ((i + half_n_axes) % n_axes) * increment;
        float r = sqrt(1 - axis(1) * axis(1));
        axis(0) = cos(phi) * r;
        axis(2) = sin(phi) * r;
        for (int j = 1; j < n_angles; j++)
        {
            pose.rotation(axis * (full_circle * j / n_angles));
            pose.translation(translate);
            poses.push_back(pose);
        }
    }
    return poses;
}

// Quick & dirty way to draw a fairly dense point cloud on a depth map.
void draw(std::vector<cv::Vec3f> const &verts, cv::Affine3f const &pose, cv::Matx33f proj, cv::Mat &depth_map)
{
    depth_map.setTo(BACKGROUND);
    for (auto &vert : verts)
    {
        cv::Point3f pt = proj * (pose * vert);
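        int x = (int)((pt.x / pt.z) + 0.5);
        int y = (int)((pt.y / pt.z) + 0.5);
        // Skip points that project outside the depth map
        if (x < 0 || y < 0 || x >= depth_map.cols || y >= depth_map.rows)
            continue;
        // depth_map is CV_16U, so access it as uint16_t
        if (depth_map.at<uint16_t>(y, x) > (uint16_t)pt.z)
            depth_map.at<uint16_t>(y, x) = (uint16_t)pt.z;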
    }
    cv::imshow("Depth map", depth_map * (65000 / BACKGROUND));
    cv::waitKey(1);
}

int main()
{
    // Quick & dirty way to read the vertex data from Armadillo.ply.
    // Get Armadillo.ply.gz from http://graphics.stanford.edu/data/3Dscanrep/
    std::vector<cv::Vec3f> verts(172974);
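    FILE *fp = fopen("Armadillo.ply", "rb");
    if (!fp)
    {
        printf("Could not open Armadillo.ply\n");
        return 1;
    }
    fseek(fp, 264, SEEK_SET); // Skip to the vertex data (offset assumed to be 264 bytes)
    size_t n_read = fread(&verts[0], 12 * verts.size(), 1, fp);
    fclose(fp);
    if (n_read != 1)
    {
        printf("Could not read vertex data from Armadillo.ply\n");
        return 1;
    }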

    // Big endian to little endian.
    char *data = (char*)&verts[0];
    for (size_t i = 0; i < 12 * verts.size(); i += 4)
    {
        std::swap(data[i+0], data[i+3]);
        std::swap(data[i+1], data[i+2]);
    }

    // Move model to origin
    cv::Vec3f centroid(0,0,0);
    for (auto &vert : verts)
        centroid += vert;
    centroid /= (float)verts.size();
    for (auto &vert : verts)
        vert -= centroid;

    cv::Mat depth_map(240, 320, CV_16U);

    float focal_length = (depth_map.rows / 2) / tan(0.5f);
    cv::Matx33f proj(focal_length, 0, depth_map.cols / 2.0f, 0, focal_length, depth_map.rows / 2.0f, 0, 0, 1);

    std::vector<cv::Ptr<cv::linemod::Modality>> modalities;
    modalities.push_back(cv::makePtr<cv::linemod::DepthNormal>());
    std::vector<int> T_pyramid { 5, 8 }; // This is the default for T_pyramid
    cv::linemod::Detector detector(modalities, T_pyramid);

    std::vector<cv::Mat> sources(1, depth_map);
    std::vector<cv::linemod::Match> matches;
    std::vector<cv::Affine3f> poses = fibonacci(100, 10);
    int correct = 0;
    for (size_t p = 0; p < poses.size(); p++)
    {
        printf("Adding template for pose %d / %d...\n", (int)p, (int)poses.size());
        draw(verts, poses[p], proj, depth_map);
        int template_id = detector.addTemplate(sources, "", depth_map != BACKGROUND);
        CV_Assert(template_id == (int)p); // Assert if template was not added
        // Check if we found match for just-added template, and report similarity
        detector.match(sources, 50.0f, matches);
        int best = -1;
        for (size_t m = 0; m < matches.size(); m++)
            if (matches[m].template_id == template_id && (best == -1 || matches[m].similarity > matches[best].similarity))
                best = m;
        if (best >= 0 && matches[best].similarity == 100.0f)
            correct++;
        else if (best >= 0)
            printf(" Template %d matched its own sources to only %.1f%%!\n", template_id, matches[best].similarity);
        else
            printf(" Template %d did not match its own sources!\n", template_id);
    }
    printf("%d of %d templates matched their own sources 100%%\n", correct, detector.numTemplates());

    // Timing evaluation
    poses = fibonacci(200, 15); // Use a different set of poses for testing
    float total_seconds = 0.0f;
    for (size_t p = 0; p < poses.size(); p++)
    {
        draw(verts, poses[p], proj, depth_map);
        // Time only the call to detector.match()
        clock_t t0 = clock();
        detector.match(sources, 90.0f, matches);
        clock_t t1 = clock();
        float seconds = (float)(t1 - t0) / CLOCKS_PER_SEC;
        printf(" Pose %d / %d : %d matches found in %.5g seconds\n", (int)p, (int)poses.size(), (int)matches.size(), seconds);
        total_seconds += seconds;
    }
    printf(" Average time in detector.match() is %.5g seconds\n", total_seconds / poses.size());
    getchar();
}

Chungzuwalla commented 6 years ago

Attached Armadillo.ply, which the repro code needs. It is from the Stanford 3D Scanning Repository at http://graphics.stanford.edu/data/3Dscanrep/. Their page states "Please be sure to acknowledge the source of the data and models you take from this repository. You are welcome to use the data and models for research purposes. You are also welcome to mirror or redistribute them for free."

Armadillo.ply.gz