Open handsomeZhuang opened 1 month ago
Thanks for the report, could you give a reproduction example?
Yes, for example:

    import faiss
    import numpy as np

    # Database vectors: two rows of dimension 10
    a = np.array([[70883900, 42568368, 16938844, 55760336, 21177010, 83098300, 46080616, 13810740, 63454444, 20485222],
                  [20347602, 27256056, 23762382, 61982300, 37474148, 5487983, 7732985, 15258728, 68216584, 16599308],
                  ]).astype(np.float32)
    # Query vector: shares six of its ten values with a[0,:]
    b = np.array([[20635302, 42568368, 16938844, 55760336, 65016728, 830983, 46080616, 13810740, 63454444, 2048522.]]).astype(np.float32)

    index = faiss.index_factory(a.shape[1], 'Flat', faiss.METRIC_Jaccard)
    index.train(a)
    index.add(a)
    dist, ids = index.search(b, 1)

The expected answer is a[0,:], not a[1,:], but the search returns a[1,:]. We look forward to your reply~
Hi, has the issue been confirmed?
Hi, I have posted an example in the GitHub issue, please check it~
------------------ Original ------------------ From: Matthijs Douze Date: Tue, Jun 18, 2024 0:26 AM To: facebookresearch/faiss Cc: handsomeZhuang, Author Subject: Re: [facebookresearch/faiss] jaccard metric prob (Issue #3521)
Hi, has this problem been confirmed? We look forward to your reply, and we are waiting on this library to finish our upcoming work. We would appreciate it if you could confirm the problem. Thank you~
Do you have a reference implementation of the Jaccard metric to compare with?
Yes, we use the scipy.spatial.distance library to test it:

    max_id = -1
    max_score = -1
    for i in range(a.shape[0]):
        # Positions where the values differ and at least one of them is nonzero
        diff = np.bitwise_and((a[i,:] != b), np.bitwise_or(a[i,:] != 0, b != 0)).sum()
        # Remaining positions count as matches
        temp = b.shape[1] - diff
        # Positions where at least one of the two values is nonzero
        union = np.double(np.bitwise_or(a[i,:] != 0, b != 0).sum())
        score = float(temp / union)
        # Keep the row with the highest similarity score
        if max_score < score:
            max_score = score
            max_id = i
    print(max_id, max_score)
Hi, could you let us know whether this has been debugged?
Hi, Dear Development Team, We recently used "faiss.index_factory(dim, 'Flat', faiss.METRIC_Jaccard)" and index.search() to build an index and query it, and found that the results are not what we expected. We also found that the Faiss implementation differs from the scipy implementation, while the scipy implementation matches the original Jaccard method. We look forward to your reply! Best wishes~
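To make the difference between the two implementations concrete, here is a minimal NumPy-only sketch that scores the vectors from the reproduction example under two definitions. The first function follows the match-count score posted in this thread. The second assumes that METRIC_Jaccard in Faiss is the weighted form sum(min(a_i, b_i)) / sum(max(a_i, b_i)), with the largest value treated as the best match; that is my reading of the Faiss metric documentation and is not confirmed by anyone in this thread. The function names are made up for illustration, and Faiss itself is not called.

```python
import numpy as np

# Vectors from the reproduction example (float64 here to avoid rounding noise).
a = np.array([
    [70883900, 42568368, 16938844, 55760336, 21177010,
     83098300, 46080616, 13810740, 63454444, 20485222],
    [20347602, 27256056, 23762382, 61982300, 37474148,
     5487983, 7732985, 15258728, 68216584, 16599308],
], dtype=np.float64)
b = np.array([20635302, 42568368, 16938844, 55760336, 65016728,
              830983, 46080616, 13810740, 63454444, 2048522],
             dtype=np.float64)


def match_count_similarity(u, v):
    """Score from the thread: matching positions over positions where
    at least one value is nonzero (hypothetical helper name)."""
    nonzero = (u != 0) | (v != 0)
    mismatched = ((u != v) & nonzero).sum()
    return (u.size - mismatched) / nonzero.sum()


def weighted_similarity(u, v):
    """ASSUMED Faiss-style weighted Jaccard: sum of element-wise minima
    over sum of element-wise maxima (hypothetical helper name)."""
    return np.minimum(u, v).sum() / np.maximum(u, v).sum()


match_scores = np.array([match_count_similarity(row, b) for row in a])
weighted_scores = np.array([weighted_similarity(row, b) for row in a])

print("match-count scores:", match_scores, "best row:", match_scores.argmax())
print("weighted scores:   ", weighted_scores, "best row:", weighted_scores.argmax())
```

On this data the match-count score prefers a[0,:] (six of ten values are identical to the query), while the weighted score is roughly 0.59 for a[0,:] and 0.67 for a[1,:], so it prefers a[1,:]. If the assumption about Faiss holds, returning a[1,:] would be consistent with the weighted definition rather than a search bug, and the two libraries are simply answering different questions.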