Knowledge-Graphs-Book / HTML-Book

Other
36 stars 7 forks source link

Bug in Definition 2.11 - Path evaluation (directed edge-labelled graph) - r*[G] #49

Closed matteopalmonari closed 1 year ago

matteopalmonari commented 1 year ago

In the definition of the evaluation for r*[G], the definition reports:

I discussed this with Aidan. He provides an answer that I copy here because it would be good to add some explanation about why using the * Kleene operator (zero or more path, instead of one or more paths), in addition to fixing the definition.

Correct definition:

Explanation (to revise and add to the book):

The idea is that for the zero-length path, we should also return all nodes in the graph connected to themselves (the "epsilon" case). This is the difference between Kleene star (path of zero or more) and Kleene plus (one or more). To explain, it does seem a bit silly to return these (u,u) pairs, but in a query like:

?s rdf:type/rdfs:subClassOf* ex:Building .

it's quite nice as it returns those things that are directly instances of buildings, because we will have (ex:Building,ex:Building) in the evaluation of rdfs:subClassOf*. So often the zero case is already "bound" to something that is copied to the output, rather than returning (u,u) for all u in V. But in the base case we define there, it's not bound to anything yet.

Cheers Matteo

aidhog commented 1 year ago

Thank you once again Matteo for spotting this! Seems it took me almost a month to get to it, but I've fixed it in the book (at kgbook.org). Fixed in the LaTeX, PHP and HTML. @Antoine-Zimmermann , you may want to incorporate the commit linked above in the French version?