Celebrandil / CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)
MIT License

Suggestions for future versions? #39

Open Celebrandil opened 5 years ago

Celebrandil commented 5 years ago

After the latest commits, I'm running out of ideas for what to improve and would like to hear whether anyone has suggestions for future versions. For further speed improvements, I can see the possibility of adding support for uploading images that are not necessarily stored as floats, using half-precision floats for storage and matching of SIFT vectors, and projecting vectors to a lower dimension, similar to PCA-SIFT. In most practical scenarios, though, gaining a fraction of a millisecond doesn't help much, since much of what surrounds the feature extraction matters more. Thus the nature of the end application becomes more important than the actual feature extraction code.
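To make the dimensionality-reduction idea concrete, here is a minimal host-side sketch of projecting a 128-dimensional descriptor onto a smaller set of basis vectors. It is not part of CudaSift: `projectDescriptor`, `kSiftDim`, the `mean` vector and the `basis` matrix are all hypothetical names, and the basis is assumed to have been computed offline (for example by PCA over a training set of descriptors).

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Dimension of a standard SIFT descriptor.
constexpr std::size_t kSiftDim = 128;

// Hypothetical helper: subtract the mean, then take the dot product of the
// centred descriptor with each basis vector. The output has basis.size()
// components, so a basis of 32 rows yields a 32-d vector.
std::vector<float> projectDescriptor(
        const std::array<float, kSiftDim>& descriptor,
        const std::array<float, kSiftDim>& mean,
        const std::vector<std::array<float, kSiftDim>>& basis) {
    std::vector<float> projected(basis.size(), 0.0f);
    for (std::size_t i = 0; i < basis.size(); ++i) {
        float sum = 0.0f;
        for (std::size_t j = 0; j < kSiftDim; ++j) {
            sum += basis[i][j] * (descriptor[j] - mean[j]);
        }
        projected[i] = sum;
    }
    return projected;
}
```

Matching would then compare the shorter vectors, which reduces both memory traffic and arithmetic per comparison. Whether matching quality holds up at a given output dimension would have to be measured.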

colinlin1982 commented 5 years ago

Maybe CUDA affine SIFT (ASIFT)?

Celebrandil commented 5 years ago

You mean [Yu & Morel, 2011], not [Mikolajczyk & Schmid, 2002], right? Do you have a feeling for how many angles you need to test in practice?

colinlin1982 commented 5 years ago

First question: yes. Second question:

    int ml = 6; // max; 3 works fine for most pictures
    for (int tl = 1; tl < ml; tl++) {
        double t = pow(2, 0.5 * tl);
        for (int phi = 0; phi < 180; phi += 72.0 / t) {
            // extract SIFT points...
        }
    }

That gives 41 angles if ml = 6. I restrict the total number of SIFT points to 16000; otherwise matching costs too much time.
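As a rough check on how many simulated views this sampling produces, the loop can be written as a small standalone function that only enumerates the (tilt, rotation) pairs. The names `AffineView` and `enumerateAffineViews` are made up for this sketch, and each angle is computed as `k * step` in double precision, so the angles are not truncated to integers.

```cpp
#include <cmath>
#include <vector>

// One simulated view: a tilt factor and an in-plane rotation in degrees.
struct AffineView {
    double tilt;
    double phiDegrees;
};

// Enumerate the (tilt, rotation) pairs for tilt levels 1 .. maxLevel - 1,
// with tilt = 2^(level / 2) and a rotation step of 72 / tilt degrees,
// covering the half-open range [0, 180).
std::vector<AffineView> enumerateAffineViews(int maxLevel) {
    std::vector<AffineView> views;
    for (int level = 1; level < maxLevel; ++level) {
        const double tilt = std::pow(2.0, 0.5 * level);
        const double step = 72.0 / tilt;
        for (int k = 0; k * step < 180.0; ++k) {
            views.push_back({tilt, k * step});
        }
    }
    return views;
}
```

With `maxLevel = 6` this enumeration yields 4 + 5 + 8 + 10 + 15 = 42 views, and with `maxLevel = 3` it yields 9. That is one more than the 41 quoted above; the difference presumably comes from details not shown in the comment.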

Celebrandil commented 5 years ago

It's really a brute force method and I'm not sure it's the most effective one. The DoG responses might be more invariant to rotations than the descriptors, which means that just to detect points, you could use a sparser sampling of angles. Once you have detected a DoG point, you then do some further sampling of angles for the descriptor. How much overlap do you have of points from different angles, but at exactly the same location in the original image? Do you prune these points or do you keep them all through the matching and then possibly prune afterwards? On the other hand, even with the current version, a brute force method might be sufficient in most cases, in particular if some optimisations are done for image rotation and buffering of feature data. Getting fast enough matching should become a greater concern and for that some kind of pruning might become important.
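One simple way to prune points that land on the same location from different angles is a greedy suppression pass, sketched below. This is an assumption about how it could be done, not what either code base does: `BackProjectedPoint` and `pruneCoincidentPoints` are hypothetical, the points are assumed to have already been mapped back to original-image coordinates, and `score` stands in for whatever strength measure is available (for example the DoG response).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical keypoint record in original-image coordinates.
struct BackProjectedPoint {
    float x;
    float y;
    float score;  // higher means stronger
};

// Greedy pruning: visit points from strongest to weakest and keep a point
// only if no already-kept point lies within `radius` pixels of it.
std::vector<BackProjectedPoint> pruneCoincidentPoints(
        std::vector<BackProjectedPoint> points, float radius) {
    std::sort(points.begin(), points.end(),
              [](const BackProjectedPoint& a, const BackProjectedPoint& b) {
                  return a.score > b.score;
              });
    const float radiusSq = radius * radius;
    std::vector<BackProjectedPoint> kept;
    for (const BackProjectedPoint& p : points) {
        bool duplicate = false;
        for (const BackProjectedPoint& q : kept) {
            const float dx = p.x - q.x;
            const float dy = p.y - q.y;
            if (dx * dx + dy * dy <= radiusSq) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) {
            kept.push_back(p);
        }
    }
    return kept;
}
```

The inner loop makes this quadratic in the worst case, so for tens of thousands of points a grid or other spatial index would likely be needed. The sketch also ignores scale, so points at the same position but clearly different scales would have to be treated separately in practice.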

smremde commented 5 years ago

Cuda Stream support?

isgursoy commented 5 years ago

What about OpenCV interoperability?