Celebrandil / CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)
MIT License

Non-Deterministic result of CudaSift #40

Open Eniac-Xie opened 6 years ago

Eniac-Xie commented 6 years ago

Thank you for this implementation!

When I test CudaSift, I find the result is non-deterministic; that is, for the same image, the program outputs different features. Is that abnormal?

For example, I compile and run mainSift.cpp with the max SIFT count set to 3, and the results are:

Run 1 (xpos, ypos, score, descriptor[0], descriptor[1], descriptor[2]):
1663.008179 215.823730 149.000000 0.001161 0.047922 0.060676
1235.375366 263.208405 111.000000 0.000000 0.059338 0.047047
1590.166992 1039.981323 201.000000 0.007714 0.001538 0.000000

Run 2 (xpos, ypos, score, descriptor[0], descriptor[1], descriptor[2]):
1663.008179 215.823730 149.000000 0.001161 0.047922 0.060676
1660.701904 254.537323 111.000000 0.001048 0.023971 0.046532
1590.166992 1039.981323 201.000000 0.007714 0.001538 0.000000

colinlin1982 commented 6 years ago

Set the max sift num to 2048 or bigger.

Eniac-Xie commented 6 years ago

CudaSift does NOT sort the descriptors by score. So do you mean that when I set max_sift_num to less than 2048, CudaSift randomly selects max_sift_num points and outputs them? @colinlin1982

colinlin1982 commented 6 years ago

Right. The required max sift num is determined by your images: a bigger size or more complex texture means more points.

Celebrandil commented 6 years ago

Is it just the order of features that is different, or is it the features themselves? Since features are extracted in parallel, you never know in which order they will end up. It's true, though, that if you haven't allocated enough GPU memory for storing features, you might miss some strong ones that have to be ignored during extraction. By the way, I could add a post-processing function to sort features based on DoG strength, so that you could allocate enough features on the GPU side but only read back the strongest ones to the CPU side, a process that is typically quite costly.
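One way to check whether only the order differs is to read the features back to the host and compare the two runs as sets. A minimal sketch, assuming the SiftData/SiftPoint layout from cudaSift.h (numPts, h_data, xpos, ypos) and that InitSiftData was called with host storage enabled; not part of the library itself:

```cpp
#include <algorithm>
#include <cstdio>
#include "cudaSift.h"

// Sort host-side keypoints by image position so that two extractions of the
// same image can be compared independently of the order the GPU produced them in.
static void SortByPosition(SiftData &data) {
  std::sort(data.h_data, data.h_data + data.numPts,
            [](const SiftPoint &a, const SiftPoint &b) {
              return (a.xpos != b.xpos) ? a.xpos < b.xpos : a.ypos < b.ypos;
            });
}

// Report keypoints whose positions differ between two runs on the same image.
static void CompareRuns(SiftData &run1, SiftData &run2) {
  SortByPosition(run1);
  SortByPosition(run2);
  if (run1.numPts != run2.numPts)
    printf("Different feature counts: %d vs %d\n", run1.numPts, run2.numPts);
  int n = std::min(run1.numPts, run2.numPts);
  for (int i = 0; i < n; i++) {
    const SiftPoint &p = run1.h_data[i], &q = run2.h_data[i];
    if (p.xpos != q.xpos || p.ypos != q.ypos)
      printf("Mismatch at %d: (%.3f, %.3f) vs (%.3f, %.3f)\n",
             i, p.xpos, p.ypos, q.xpos, q.ypos);
  }
}
```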

Eniac-Xie commented 6 years ago

It seems that the descriptors are unordered. I tried to sort them by SiftPoint.score, but the score of a point/descriptor differs between extractions. For example, the score of the keypoint (x=298.352, y=35.8827, scale=3.978, orientation=269.656) could be 0.99986 in one execution but 0.999999 in another.

Celebrandil commented 6 years ago

SiftPoint.score is only used for feature matching. If you match features and it still changes like that, then something is definitely wrong. If you don't match, then you should instead use SiftPoint.sharpness, and I should add some comments to cudaSift.h.
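A minimal host-side sketch of that kind of post-processing, assuming SiftPoint.sharpness holds the DoG response and that features were read back into h_data (i.e. InitSiftData was called with host storage enabled); the field choice follows the comment above and may need adjusting to your version of cudaSift.h:

```cpp
#include <algorithm>
#include "cudaSift.h"

// Sort host-side features by DoG strength (SiftPoint.sharpness), strongest
// first, and truncate to the N strongest. This makes the top-N selection
// deterministic even though the GPU writes features in an arbitrary order.
static void KeepStrongest(SiftData &data, int maxKeep) {
  std::sort(data.h_data, data.h_data + data.numPts,
            [](const SiftPoint &a, const SiftPoint &b) {
              return a.sharpness > b.sharpness;
            });
  if (data.numPts > maxKeep)
    data.numPts = maxKeep;
}
```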

Eniac-Xie commented 6 years ago

Thank you. Another question: how do I allocate enough GPU memory for storing features?

I build the mainSift.cpp demo and run ./build/cudasift, but the number of SIFT points differs between executions.

Celebrandil commented 6 years ago

It's done in InitSiftData. The reason for not doing that as part of ExtractSift is that cudaMalloc is so terribly costly; it easily takes more time than the extraction itself. With InitSiftData you reserve memory that you might or might not use fully. If you extract features from multiple images, you can reuse an already allocated SiftData structure to save allocation time.
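A sketch of that usage pattern, loosely following the repo's mainSift.cpp example (InitCuda, InitSiftData, ExtractSift, FreeSiftData); the exact ExtractSift parameter list and the 32768/initBlur/thresh values are illustrative and may differ between versions:

```cpp
#include "cudaImage.h"
#include "cudaSift.h"

// Sketch: allocate SiftData once with generous headroom, then reuse it for
// several images so that cudaMalloc is not paid on every extraction.
void ExtractFromImages(CudaImage *imgs, int numImages) {
  InitCuda(0);
  SiftData siftData;
  // Reserve room for up to 32768 features on both host and device; the
  // allocation happens once here, not inside ExtractSift.
  InitSiftData(siftData, 32768, true, true);
  for (int i = 0; i < numImages; i++) {
    float initBlur = 1.0f;   // initial Gaussian blur
    float thresh = 3.5f;     // DoG threshold
    // Reuses the buffers reserved by InitSiftData; only numPts changes per image.
    ExtractSift(siftData, imgs[i], 5, initBlur, thresh, 0.0f);
    // ... use siftData.h_data[0 .. siftData.numPts-1] here ...
  }
  FreeSiftData(siftData);
}
```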

jairedc commented 3 years ago

> Is it just the order of features that is different, or is it the features themselves? Since features are extracted in parallel, you never know in which order they will end up. It's true, though, that if you haven't allocated enough GPU memory for storing features, you might miss some strong ones that have to be ignored during extraction. By the way, I could add a post-processing function to sort features based on DoG strength, so that you could allocate enough features on the GPU side but only read back the strongest ones to the CPU side, a process that is typically quite costly.

Any updates on adding this sorting function?