Open GoogleCodeExporter opened 9 years ago
I managed to run this query using FuzzyJoinRule instead of Nested Loop Join.
However, it returns the following extra record, which is not in the
expected result set:
{ "dblp": { "id": 21, "dblpid": "books/acm/kim95/MengY95", "title": "Query
Processing in Multidatabase Systems.", "authors": "Weiyi Meng Clement T. Yu",
"misc": "2002-01-03 551-572 1995 Modern Database Systems
db/books/collections/kim95.html#MengY95" }, "dblp2": { "id": 24, "dblpid":
"books/acm/kim95/OzsuB95", "title": "Query Processing in Object-Oriented
Database Systems.", "authors": "M. Tamer Özsu José A. Blakeley", "misc":
"2002-01-03 146-174 1995 Modern Database Systems
db/books/collections/kim95.html#OzsuB95" } }
I think this record should be in the result set, since the Jaccard similarity
of the two titles is intersection size / union size
= |[Query, Processing, in, Systems]| / |[Query, Processing, in, Multidatabase,
Systems, Object, Oriented, Database]|
= 4 / 8 = 0.5.
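For reference, the arithmetic above can be reproduced with a minimal Python sketch. This is not the code the query engine runs. The `word_tokens` helper is a hypothetical tokenizer that lowercases the title and splits on runs of non-alphanumeric characters, which is an assumption about how the titles are tokenized; the real fuzzy functions may tokenize differently.

```python
import re


def word_tokens(text):
    """Hypothetical tokenizer (an assumption, not the engine's own):
    lowercase the text and split on runs of non-alphanumeric characters."""
    return {tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok}


def jaccard(a, b):
    """Jaccard similarity of two strings: |intersection| / |union| of token sets."""
    tokens_a, tokens_b = word_tokens(a), word_tokens(b)
    union = tokens_a | tokens_b
    # Two empty token sets have no defined ratio; treat them as dissimilar.
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


title_1 = "Query Processing in Multidatabase Systems."
title_2 = "Query Processing in Object-Oriented Database Systems."

print(sorted(word_tokens(title_1) & word_tokens(title_2)))
# ['in', 'processing', 'query', 'systems']
print(jaccard(title_1, title_2))
# 0.5
```

Under this tokenization the pair has 4 shared tokens out of 8 distinct ones, which matches the hand calculation. A different tokenizer (for example, one that keeps "Object-Oriented" as a single token) would give a different value.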
The test did not complain when the query ran as an NL join. Does that mean we
have a problem in both the expected results and the NL join, or am I missing
something?
Original comment by icetin...@gmail.com
on 13 Sep 2013 at 9:42
Sounds like a problem, indeed! The NL join uses functions that are passed
to it by the compiler to do the joining, so there may be a related problem
there as well. Apparently the fuzzy functions used there are
inconsistent with those used in this case? BUG!
Original comment by dtab...@gmail.com
on 13 Sep 2013 at 5:12
Original comment by icetin...@gmail.com
on 26 Sep 2013 at 12:30
Original issue reported on code.google.com by
icetin...@gmail.com
on 23 Aug 2013 at 7:57