sadit / SimilaritySearch.jl

A nearest neighbor search library with exact and approximate algorithms
https://sadit.github.io/SimilaritySearch.jl/
MIT License

Are there any benchmarks? #9

Closed zsz00 closed 2 years ago

zsz00 commented 3 years ago

I ran a search using cosine distance, and for the same vector the cosine comes out as 0.93, not 1.0.

Are there any benchmarks of accuracy and speed?

sadit commented 3 years ago

Hi, the benchmarks are reported in the paper cited below. In general, you can match your dataset's characteristics against one of the benchmarks presented there to estimate how the index will perform.

Tellez, E. S., Ruiz, G., Chavez, E., & Graff, M. A scalable solution to the nearest neighbor search problem through local-search methods on neighbor graphs. Pattern Analysis and Applications, 1-15.

I assume that you are using the SearchGraph index. Its performance depends on the entire collection and on the parameters used to build the index. If you give me more information, I can work out whether this is the expected performance or whether there is a bug somewhere.

zsz00 commented 3 years ago

using SimilaritySearch
feats = [Float32[0.03297681, -0.057883024, 0.033558328, 0.027213005, 0.019179631, -0.016161758, 0.06313992, 0.029769272, 0.0027594056, 0.13871056, -0.03642421, 0.035063893, 0.052731786, -0.096194535, 0.025367929, 0.09652164, 0.03355077, -0.14058569, -0.024607437, 0.0385051, -0.015709603, -0.00084702333, 0.010407126, 0.07308541, -0.14650826, 0.034957312, -0.036341447, 0.036387708, 0.07455429, -0.029971035, 0.009414315, -0.031715095, 0.026172841, 0.012265139, 0.07626581, 0.042138305, -0.032860383, -0.046434045, 0.016199734, -0.016743844, -0.019707583, 0.070256956, 0.018621296, -0.018117145, 0.04519324, -0.008631748, 0.031046357, -0.08320625, 0.03610247, 0.028911201, 0.07596006, 0.042562727, -0.021158278, 0.043864492, -0.099275984, -0.082358874, -0.053784493, -0.0019726513, -0.0131031005, -0.017219594, -0.019278266, 0.040323827, 0.014632566, 0.002580932, 0.06679814, -0.04631771, 0.0027479252, -0.07761896, -0.09728193, 0.06973371, -0.047187775, 0.0016626044, 0.0042865197, -0.051342975, -0.044983022, -0.03356067, 0.018128594, -0.019438025, 0.07738148, 0.054820396, 0.020017399, 0.064808, 0.045719944, -0.021417692, 0.049125448, 0.11132416, 0.09583403, -0.0326005, 0.0237151, -0.05385886, -0.09413928, -0.057225835, 0.023238016, 0.018155275, -0.05529222, -0.0524575, -0.035687137, -0.034902997, 0.0004564388, 0.016682647, 0.009229477, 0.05417781, 0.022336923, 0.04452578, -0.038591694, -0.0116358865, -0.030728593, -0.082244925, -0.048479147, -0.08643628, -0.15386823, -0.046966404, 0.030156834, 0.035275202, -0.026208974, -0.008127876, -0.05740735, -0.07329228, -0.050228886, -0.078125425, -0.03463436, -0.036234498, -0.030027932, -0.015998986, -0.028561026, 0.03689839, 0.028624682, -0.033904932, 0.029430186, 0.018231826, 0.0131820375, -0.031087432, -0.07719094, 0.020103004, -0.03539716, 0.027130332, -0.07232081, 0.08984109, 0.042997424, 0.081979744, 0.014138508, 0.006337732, 0.06327166, 0.038113058, -0.018700251, 0.06819902, -0.015662037, 0.049290992, -0.0262096, -0.01933094, 
-0.019636787, -0.11263639, 0.064283244, 0.02150052, 0.06362041, 0.12017935, 0.04358417, 0.04641462, -0.031646453, -0.12216171, -0.07153132, -0.01625553, 0.04021505, -0.13993458, -0.016416347, -0.036144312, 0.07803476, -0.02478821, 0.030736346, 0.06713965, 0.017525254, 0.04214318, -0.038028713, -0.024363687, 0.010754908, -0.05552812, -0.019086815, 0.053397045, 0.042078003, -0.054524723, 0.083423615, -0.088865966, -0.05497311, -0.06738962, -0.016410915, 0.007872382, -0.019986248, 0.06106768, -0.004407266, 0.008798337, -0.033300046, 0.0010995292, -0.023627475, -0.06087968, 0.05924407, 0.1484687, 0.0231129, -0.02084738, 0.03419504, -0.026282031, 0.022702813, 0.03565465, 0.08763111, -0.05291037, -0.04500102, -0.10894275, -0.0631577, 0.08069645, 0.00038371963, 0.042618483, 0.009306141, 0.06220005, -0.0014686136, 0.082849026, -0.015267986, -0.06427518, 0.03388651, 0.043900434, -0.0424782, -0.011266299, 0.010770118, -0.019211786, -0.036401853, -0.04233852, -0.0038164672, 0.0042537525, -0.070206665, 0.013972397, 0.021965098, -0.017734686, 0.0040993933, 0.067220226, -0.00787797, 0.09254696, -0.009852315, -0.010696956, 0.022217065, 0.08539462, -0.02249066, -0.0017671093, -0.0123101855, -0.007119201, 0.01978003, -0.014046708, -0.060128625, 0.001505266, -0.13438992, -0.06156853, 0.02358169, -0.005255983, -0.0021809016, -0.056115717, 0.1302595, 0.062802166, -0.012108129, -0.0083336765, -0.039311938, 0.017205114, 0.030251442, -0.010780153, -0.06744799, -0.025359083, 0.0033256032, -0.008411589, -0.009328704, 0.008136924, -0.024580734, 0.022388328, -0.09883228, 0.036954563, -0.04340898, 0.03953776, 0.007724884, 0.0023085447, 0.018347722, 0.03419157, -0.10366755, 0.03269607, 0.020940358, -0.019155366, 0.008019562, 0.0606313, -0.11178268, 0.044918273, -0.05629753, 0.11620583, 0.0018178058, -0.01571131, -0.048517283, -0.007962609, 0.011047748, 0.04671232, 0.014219825, 0.05203198, -0.052725866, 0.05879014, -0.040847775, 0.004481263, -0.0085637085, 0.016696256, 0.0019668574, 
-0.06586882, 0.062369715, -0.034729548, 0.030646628, -0.094832376, 0.028972266, -0.02190803, -0.07594067, 0.058789678, -0.023300344, 0.06303179, 0.02682792, 0.0058673997, -0.05280468, 0.031843856, -0.0305178, 0.045582585, -0.017856808, -0.0002522993, 0.028839096, -0.05136405, -0.05138706, -0.064909846, -0.075800344, -0.04568114, -0.0077366666, 0.078780785, -0.07766363, 0.028032925, -0.061632436, 0.019672375, 0.022018587, -0.058356024, 0.024950681, -0.089778826, -0.023793645, -0.033216294, 0.023592308, -0.094127625, 0.00080671115, 0.05131781, 0.02985638, 0.055004388, 0.0031268941, 0.018598935, 0.056460783, 0.07279418, 0.009878396, -0.07972289, -0.04427835, -0.037287485, -0.028881252, -0.06880523, -0.039759204, -0.0487844, -0.02592025, 0.00093433907, -0.03126142, -0.05605427, 0.054715004, 0.058923576, -0.038416088, -0.008110212, 0.0911419, 0.019715032, -0.020466564, 0.047065876, -0.07039487, 0.0030142511, -0.03585896, -0.035483643, 0.04384703, 0.052533075, 0.008147111, 0.046549536, 0.025659567, 0.056723017, 0.031602047, 0.017201405, 0.06583447, -0.05072189, 0.03619238, -0.044715263], Float32[0.015765615, -0.06544561, 0.05014449, 0.04103299, 0.0047535934, -0.009305971, -0.002321235, 0.0145526845, -0.037340507, 0.14821921, -0.014751336, 0.0031405403, 0.002831885, -0.067847036, 0.025207063, 0.07481268, 0.06578598, -0.09919201, -0.00502795, 0.022660354, -0.036307026, 0.020476377, 0.053380474, 0.066640936, -0.102308415, 0.03921583, -0.059073623, -0.0016290378, 0.033615492, -0.047188908, -0.0045820763, 0.0011260095, 0.011906111, -0.031214219, 0.020265859, 0.016510773, -0.048125308, -0.017855493, -0.0021458862, -0.04628227, -0.028676137, 0.094524674, 0.025453577, -0.030119222, 0.029343037, -0.04922619, 0.021317739, -0.037603788, 0.049407225, -0.0015379465, 0.064156614, 0.0739506, -0.0077899154, -0.037546925, -0.099937566, -0.048427913, -0.012340679, 0.023650056, 0.043242346, 0.00556433, -0.05344209, 0.07786066, 0.026834713, 0.05571254, 0.013766646, -0.028954756, 
-0.019103905, -0.059881136, -0.08982095, 0.06856636, -0.06434762, -0.054534607, 0.005254839, 0.036782146, -0.015208425, -0.01209943, -0.0016069838, -0.026163498, 0.034333445, 0.037850086, 0.03327922, 0.023703406, 0.013869692, -0.07379843, 0.050091438, 0.05173519, 0.09041091, 0.021897063, 0.014986686, -0.00832175, -0.0585019, -0.0056225946, 0.08275092, 0.07330131, -0.08055604, -0.022590358, -0.0038006448, -0.056285255, 0.007956371, 0.04973228, -0.0032143977, 0.06277225, 0.011632446, 0.023843158, -0.0018425008, 0.0036203007, -0.010520928, -0.07859694, -0.023705892, -0.08433195, -0.16023763, -0.0033853492, 0.067396864, 0.1136283, 0.006980624, 0.009815873, -0.094278716, -0.057675224, -0.107671216, -0.037254527, -0.026330106, -0.03513074, 0.0023579758, -0.0044905934, -0.079859845, 0.04994282, 0.022468375, -0.012306701, -0.02101435, 0.057674374, 0.012345393, -0.080476895, -0.036807306, -0.018524695, -0.09245408, 0.04253927, -0.040490404, 0.15053166, -0.0033400222, 0.09582514, 0.0022741992, -0.0027939542, 0.034451082, 0.06663384, -0.018416302, 0.06232047, 0.011507966, 0.045887873, 0.011271796, -0.040688872, -0.04439182, -0.077464744, 0.03192745, 0.062089425, 0.064341895, 0.10484491, 0.06565623, 0.033359732, -0.05501805, -0.12348311, -0.018805075, -0.051640473, 0.0446357, -0.09156433, 0.022881309, -0.0011171047, 0.03112962, -0.0411782, 0.031896006, 0.03610315, 0.05030521, 0.004917727, -0.06809024, -0.07147824, 0.05921659, -0.09241513, 0.02456306, 0.06292988, 0.040832236, -0.02400271, 0.11244889, -0.06261039, -0.007906211, -0.034351792, -0.005431736, -0.0031198682, -0.010489706, 0.07297288, -0.0007951786, 0.021849453, -0.03419571, -0.01750842, -0.05495765, -0.098299436, -0.0013097154, 0.14454794, 0.0108173415, -0.0671987, 0.012570386, -0.02554265, 0.069785155, -0.011455426, 0.058788903, 0.02041638, -0.059928153, -0.16198048, -0.05532882, 0.054410778, 0.013052701, 0.040802605, 0.00482555, 0.06080886, 0.019105604, 0.12867406, -0.009832493, -0.03728203, 0.040210757, 
0.041179355, -0.041350894, 0.015404446, -0.04104166, -0.06263489, -0.010370178, -0.035870288, -0.004245199, 0.015671467, -0.10455296, 0.034320813, 0.037645232, 0.016582146, 0.005453909, 0.06433356, 0.021903925, 0.059860483, -0.032533605, -0.04924855, -0.0032162827, 0.12031688, -0.01628358, 0.0033605956, -0.024405029, -0.035103343, -0.004280421, 0.043936186, -0.029409243, -0.057588555, -0.05146024, -0.040288005, 0.007139886, 0.02564985, 0.017752003, -0.031965822, 0.13167402, 0.06845927, 0.020005267, 0.008228795, -0.063035384, 0.020824814, 0.042559437, -0.007315301, -0.054534182, -0.041081764, -0.031523876, -0.018796956, -0.018402804, 0.011738635, -0.045068875, 0.011872364, -0.055326827, 0.03462909, -0.024699626, 0.07151482, -0.021301629, 0.028988935, 0.03766272, 0.057998113, -0.05067544, 0.091983065, 0.036934797, 0.00084885955, 0.008317827, 0.031420927, -0.08069318, 0.1017884, 0.0013032897, 0.056665916, 0.034172688, -0.04694873, 0.019728772, -0.024167685, -0.012011493, 0.053518556, -0.092346966, 0.08430573, -0.07195072, 0.03218724, -0.08942784, 0.0051342943, 0.02707994, -0.048965167, -0.009606259, -0.060828783, 0.042530134, -0.05416876, 0.011649486, -0.11871809, 0.03194491, -0.060213603, -0.078449674, 0.035024, -0.039280936, 0.05916157, -0.01034436, -0.046268467, -0.0373018, 0.006011359, -0.07544088, 0.03127143, -0.004759758, -0.005884532, 0.0034392755, -0.08166817, -0.02376286, -0.032469537, -0.070167266, -0.027715893, 0.013053486, -0.0010710846, -0.04462494, 0.029345993, -0.0811242, 0.0014812981, 0.02640128, -0.03403988, 0.010286243, -0.07726746, -0.059592042, -0.043900605, 0.019237736, -0.05647182, 0.012375716, -0.019085156, 0.057849728, 0.008538725, -0.031636443, -0.020497132, 0.068351984, 0.042509705, -0.019954558, -0.10951709, -0.060860682, -0.0138538405, 0.010292368, -0.025065705, -0.012173789, -0.10461745, -0.042994004, 0.031564716, 0.021058995, -0.0034713948, 0.005291961, -0.010759215, 0.016482351, 0.009246774, 0.05033589, 0.036353055, -0.06527257, 
-0.019742554, -0.10852433, 0.012276394, -0.064130194, -0.053854097, 0.015483599, 0.04576672, -0.022158612, 0.023992984, 0.06567918, 0.008426455, -0.027813364, 0.01571122, 0.0032262753, -0.03820341, -0.007184484, -0.0036252302]]
# feats is normalized
topk = 2
gallery = feats  # collection to index
query = feats    # query with the same vectors
index = ExhaustiveSearch(CosineDistance(), gallery, KnnResult(topk))
out = [search(index, q, KnnResult(topk)) for q in query]
println(out)

KnnResult[KnnResult(2, 2, Item[Item(1, -1.1920929f-7), Item(2, 0.20484513f0)]), KnnResult(2, 2, Item[Item(2, 0.0f0), Item(1, 0.20484513f0)])]

Why is the value 0.20484513 rather than 1.0?

sadit commented 3 years ago

The indexes in the package work with dissimilarities (distances) rather than similarities, so CosineDistance computes 1 - cos(.). Identical vectors therefore have distance 0, not 1.

I think the results are correct and expected (the small negative distance is due to floating-point arithmetic):

julia> evaluate(CosineDistance(), feats[1], feats[1])
-1.1920929f-7

julia> evaluate(CosineDistance(), feats[2], feats[2])
0.0f0

julia> evaluate(CosineDistance(), feats[1], feats[2])
0.20484513f0
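
To make the relation between distance and similarity concrete, here is a minimal sketch in plain Julia that uses only the LinearAlgebra standard library. The helper names `cosine_similarity`, `cosine_distance`, and `similarity_from_distance` are hypothetical and are not part of SimilaritySearch.jl; this is not the package's internal implementation, only an illustration of the 1 - cos(.) convention described above.

```julia
using LinearAlgebra

# Hypothetical helpers for illustration; not the SimilaritySearch.jl API.
cosine_similarity(u, v) = dot(u, v) / (norm(u) * norm(v))
cosine_distance(u, v) = 1 - cosine_similarity(u, v)

# Convert a reported cosine distance back to a cosine similarity.
similarity_from_distance(d) = 1 - d

u = Float32[1, 0]
v = Float32[0, 1]
w = Float32[3, 4]

println(cosine_distance(u, u))  # identical vectors: distance close to 0
println(cosine_distance(u, v))  # orthogonal vectors: distance close to 1
println(cosine_distance(u, w))  # cos = 3/5, so distance is about 0.4

# The distance reported between feats[1] and feats[2] in this thread:
println(similarity_from_distance(0.20484513f0))  # about 0.795
```

Under this convention, the 0.20484513 reported above corresponds to a cosine similarity of roughly 0.795 between the two vectors, and a vector compared with itself gives a distance of about 0 (a similarity of about 1).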

zsz00 commented 3 years ago

OK, I see. Thank you very much.