ekzhu / SetSimilaritySearch

All-pair set similarity search on millions of sets in Python and on a laptop
Apache License 2.0

all_pairs: get *all* pairwise indices, including 0? #19

Open Antonia-Schmidt opened 11 months ago

Antonia-Schmidt commented 11 months ago

I noticed that the all_pairs function does not return all possible pairs: even with similarity_threshold = 0.0, the lowest similarity returned is still above 0. Is there a way to get all pairwise similarities, including pairs with similarity 0?

ekzhu commented 10 months ago

That's odd. It may be a bug. Can you post code to reproduce this?

Antonia-Schmidt commented 10 months ago

Sure, the minimal example below already returns an empty list instead of a list with one element.

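from SetSimilaritySearch import all_pairs

# Two disjoint sets, so their Jaccard similarity is 0.
sets = [[1, 2, 3], [4, 5, 6]]
pairs = all_pairs(sets, similarity_func_name="jaccard",
        similarity_threshold=0.0)
list(pairs)  # returns [] here; one pair was expected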
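
For comparison, here is a minimal brute-force sketch of the output the example above expects. It does not use SetSimilaritySearch at all: the helper names `jaccard` and `brute_force_all_pairs` are hypothetical, the `(i, j, similarity)` tuple layout is an assumption made for illustration, and treating two empty sets as similarity 0.0 is a convention chosen here. It compares every pair directly, so it is only practical for small inputs.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two iterables of tokens."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets have similarity 0.0.
        return 0.0
    return len(a & b) / len(union)


def brute_force_all_pairs(sets, similarity_threshold=0.0):
    """Yield (i, j, similarity) for every index pair i < j whose
    similarity is at least the threshold (hypothetical helper)."""
    for i, j in combinations(range(len(sets)), 2):
        sim = jaccard(sets[i], sets[j])
        if sim >= similarity_threshold:
            yield (i, j, sim)


sets = [[1, 2, 3], [4, 5, 6]]
pairs = list(brute_force_all_pairs(sets, similarity_threshold=0.0))
print(pairs)  # prints [(0, 1, 0.0)]
```

With a threshold of 0.0 every pair passes the filter, so the disjoint pair shows up with similarity 0.0 instead of being dropped.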