Celebrandil / CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)
MIT License

descriptor of keypoint seems not correct #41

Open fortuneko opened 5 years ago

fortuneko commented 5 years ago

Hi, I used two images to test feature matching, and the matching result seems wrong: the number of matches is far from the result of the OpenCV version. The two images are below. I get 0 matches from CudaSift's mainSift.cpp but more than 100 matches from the OpenCV version.

cslayers commented 5 years ago

There are differences between these two SIFT implementations, so the descriptors they generate differ as well. It's hard to say which one is correct.

fortuneko commented 5 years ago

Though there are differences between the two implementations, their behavior should still be reasonably similar. But testing with the two images above shows that their performance varies far too much, which makes no sense if this implementation is fine.

cslayers commented 5 years ago

SIFT has many parameters. Have you checked them and set them consistently between the two implementations?

fortuneko commented 5 years ago

@colinlin1982 @cslayers Thanks, the parameters you provided are almost the same as in mainSift.cpp. I found something strange: when I run mainSift.cpp on the images I uploaded above, I get 1648 and 2273 keypoints for the two images and 452 final matches.

But when I use the original images to run mainSift, I still get a bad result: 2166 and 477 keypoints and 0 final matches, while the OpenCV version is fine with this input. I have uploaded the original images to Dropbox: https://www.dropbox.com/s/nurqwg0rm9b0634/x_046.jpg
https://www.dropbox.com/s/iz0ofy37ycdkqob/y_045.jpg

Open the links above and click the download button at the top right of the page to download the original images. Because the original images are too big, I resized them to a smaller size (longest side 1024) using the code below:

// Read an image and shrink it so its longest side is at most max_size,
// preserving the aspect ratio.
cv::Mat read_image(std::string strPath, int max_size = 1024) {
    cv::Mat img = cv::imread(strPath);
    std::cout << "original size: " << img.cols << "," << img.rows << std::endl;

    int w = img.cols;
    int h = img.rows;
    int max_wh = cv::max(w, h);
    if (max_wh > max_size) {
        int sw = w * max_size / max_wh;
        int sh = h * max_size / max_wh;
        cv::resize(img, img, cv::Size(sw, sh));
    }
    std::cout << "resized size: " << img.cols << "," << img.rows << std::endl;
    return img;
}
colinlin1982 commented 5 years ago

@fortuneko Dropbox is banned in China. Maybe you can write the result image to disk and see whether it's warped.

fortuneko commented 5 years ago

@colinlin1982 We are both in China ~~ you need a VPN to get through the GFW. What do you mean by whether it's warped?

Celebrandil commented 5 years ago

I will take a look at it. I have one direct comment. Lowe uses an ambiguity measure for matching to remove features that have more than one good match in the other image. However, since repetitive patterns often exist in man-made environments, in particular in printed material, you might run into situations where almost all features are ignored and the subsequent model fitting (e.g. homography) collapses. It's a fundamental problem that is not easily solved. One could try to ignore the ambiguity score for matching and increase the number of RANSAC trials for model fitting.

Celebrandil commented 5 years ago

I think I found the problem. First, you need to read the image in grayscale. Add 0 as an argument to imread, i.e. cv::imread(strPath, 0). The code doesn't recognise that the pixels are in RGB, so feature extraction appears to work anyway. After all, to keep the extraction code independent of OpenCV, it just uses float* pointers. Second, the current mainSift.cpp assumed that both images were of the same size, which doesn't need to be the case. I simply used cv::transpose and cv::flip to reorient one of the images. With thresh=4.0 and initBlur=1.0 I then get:

Number of original features: 2269 1933
Number of matching features: 351 526 18.1583% 1 4

fortuneko commented 5 years ago

Thanks @Celebrandil, I have converted the input images to grayscale using the code below; the input to the SIFT implementation is the same as the GitHub version except for the image size:

  cv::Mat img_src1 = read_image(strImg1);
  cv::Mat img_src2 = read_image(strImg2);  
  cv::Mat gray1,gray2;
  cv::cvtColor(img_src1,gray1,cv::COLOR_BGR2GRAY);
  cv::cvtColor(img_src2,gray2,cv::COLOR_BGR2GRAY);
  gray1.convertTo(limg,CV_32FC1);
  gray2.convertTo(rimg,CV_32FC1);

I have tested by rotating img2 by 90 degrees before running the code, and I get the correct result, as you mentioned above. You said that "the current mainSift.cpp assumed that both images were of the same size". Why is this restriction required? Is the buffer layout allocated inside the implementation for the first image also used for the second image, so that inconsistent sizes would cause bad behavior?

Celebrandil commented 5 years ago

Since memory allocation is so costly, I added a function called AllocSiftTempMemory that preallocates memory. If you have a real-time system with images coming from a video camera, you might save a lot of time by doing this allocation once and for all and reusing the same preallocated memory. You can still call ExtractSift without a pointer to preallocated memory, but then memory is allocated every time you run ExtractSift. The amount of memory needed depends on the image dimensions, and in your case the two images had different dimensions.

fortuneko commented 5 years ago

Hi @Celebrandil, sorry for the late reply. I have tested the images displayed in the first post of this issue: the two images have different sizes, yet mainSift.cpp works fine on them. If different dimensions can cause problems, why do these two images work fine?

Celebrandil commented 5 years ago

I think it depends on the pitch of the allocated images. I believe the width in pixels has to be divisible by 128, so during allocation on the GPU the width of each image is extended with a dummy area to make that true.