Rares' FuzzyJoin runs the following query with the default similarity
threshold of 0.8f, even though the query itself specifies 0.5f:
use dataverse fuzzyjoin;
for $i in dataset('DBLP')
for $j in dataset('CSX')
where similarity-jaccard(word-tokens($i.title), word-tokens($j.title)) >= 0.5f
order by $i.id, $j.id
return {'dblp': $i, 'csx': $j}
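
To make the impact concrete, here is a minimal Python sketch of a pair that matches at 0.5 but is dropped at 0.8. It is not the AsterixDB implementation. The `word_tokens` helper only approximates the built-in `word-tokens` (lowercase, split on non-alphanumeric characters), `jaccard` treats the token lists as sets, and the two titles are invented for illustration.

```python
import re


def word_tokens(text):
    """Assumed stand-in for word-tokens: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def jaccard(a, b):
    """Set-based Jaccard similarity: |A intersect B| / |A union B|."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0  # convention chosen for this sketch only
    return len(sa & sb) / len(sa | sb)


# Hypothetical titles, not taken from the DBLP or CSX datasets.
dblp_title = "Efficient Parallel Set Similarity Joins"
csx_title = "Parallel Set Similarity Joins Using MapReduce"

sim = jaccard(word_tokens(dblp_title), word_tokens(csx_title))

matches_requested = sim >= 0.5  # threshold written in the query
matches_default = sim >= 0.8    # default threshold reported in this issue

print(f"similarity={sim:.3f} "
      f"requested(0.5)={matches_requested} default(0.8)={matches_default}")
```

The two titles share 4 of 7 distinct tokens, so the pair satisfies the threshold in the query but would be missing from the result if 0.8 were applied instead.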
Original issue reported on code.google.com by icetin...@gmail.com on 18 Sep 2013 at 11:07