cayleygraph / cayley

An open-source graph database
https://cayley.io
Apache License 2.0
14.86k stars 1.25k forks source link

ForAll / Universal Quantifiers #209

Open barakmich opened 9 years ago

barakmich commented 9 years ago

We have a Not iterator now. It'd be possible to add a shorthand macro in Gremlin to emulate the Universal Quantifier, ForAll.

All queries currently work on existence. That is, each stage depends on "there exists some path X to node Y". Using negation you can achieve "for all paths from X, (rest of query holds)".

The key insight being that (X holds for (ForAll A)) == (X holds for A) AND (X holds for (Not (Not A))) == (X holds for (A AND (Not (Not A))))

In today's parlance, this is: var a = g.M().... g.V().And(a).And(g.V().Except(g.V().Except(a)))...

It's a hack, but one that holds mathematically. Currently can be implemented as a macro on the query level, and is actually really hard to optimize further. (Suggestions, of course, are welcome)

barakmich commented 9 years ago

A real-world example being unanimous voting: "For all voters on this vote, the vote is yes". (Put another way, "There is at least one yes voter, and there is not a no voter). Or perhaps, for, say "All-Californian co-ops" (People who live in a co-op, who have at least one Californian member, and do not contain any non-Californian members).

The API may be cleaner in other ways. But the macro is a simple way to get started.

allonhadaya commented 9 years ago

@barakmich Out of curiosity, is the test, "There is at least one satisfying path" needed given how ForAll behaves on the empty set?

barakmich commented 9 years ago

@allonhadaya Precisely. Vacuous truth is a weird thing; it often means returning a whole bunch of unrelated nodes. That said, if implemented by the book, it would also be correct -- so I could see a case for either.

allonhadaya commented 9 years ago

@barakmich I see, we would be obliged to include all those paths for which the condition is satisfied, but not very meaningful (vacuously true). It's hard for me to reason about which interpretation makes more sense for all applications.