biscuit-auth / biscuit

delegated, decentralized, capabilities based authorization token
Apache License 2.0
958 stars 26 forks source link

Laziness of boolean operators #129

Open divarvel opened 1 year ago

divarvel commented 1 year ago

Evaluating some expressions trigger an evaluation failure (for instance ill-typed expressions, or arithmetic operations that trigger over/underflows).

The specification does not explicitly say what is the expected behaviour:

The || operator is usually lazy on its second argument, even in strict languages. I think we should do the same, following the principle of least surprise.

The same applies for or (|| is a binary operator allowing to take the disjunction of 2 booleans, whereas or works with rule bodies).

Once the matter is settled, we need to add test cases in the conformance suite to make sure all implementations are consistent.

Currently, in the rust implementation, or is lazy on its second argument, while || is strict on both its arguments.

Making the stack evaluator ~lazy~ non-strict

Datalog expressions are evaluated with a stack machine whose semantics are inherently eager: the expression tree is computed from the bottom up, so the boolean operators are evaluated after both sides have been evaluated.

It is possible, however, to make the evaluation non-strict: instead of not evaluating the rhs of true || or false &&, we can evaluate the right hand side, and discard it (even if evaluating it resulted in an error). So the key part is to not abort computation in the case of an error, because a boolean evaluator higher up could discard the error branch altogether. Since datalog expression evaluation cannot have side effects, the outcome is similar to a true lazy evaluation.

One way to do that is to have the stack carry not only sucessfully evaluated values, but also errors. In rust that would mean a Stack<Result<Term,Error>> instead of a Stack<Term>. Evaluating an eager operation on an Err value results in popping this value (or those two values in the case of a binary operation), and pushing it back to the stack. For non-strict operations however (&& and ||), it means popping both left and right values from the stack, then:

The expression "test".type() == "int" && "test" > 0, (with $test will translate to the following opcodes: "test", TypeOf, "int", ==, "test", 0, >, && Here is how it would be executed:

Op     | stack
       | [ ]
"test" | [ Ok("test") ]
TypeOf | [ Ok("string") ]
"int"  | [ Ok("int"), Ok("string") ]
==     | [ Ok(false) ]
"test" | [ Ok("test"), Ok(false) ]
0      | [ Ok(0), Ok("test"), Ok(false) ]
>      | [ Err(TypeMismatch), Ok(false) ]
&&     | [ Ok(false) ]
Tipnos commented 1 year ago

From my point of view this PR is bound with PR https://github.com/biscuit-auth/biscuit/issues/130 (Heterogeneous == ). I think if you decide to not enforce that "strict evaluation, where having a non-computable expression aborts the whole evaluation". I think it's important also to accept "equality on heterogeneous terms" to stay consistent. The specification will be more intuitive.

Moreover the absence of type checking and compilation is an argument in favor of that. It would be weird to have unexpected "type checking" errors at runtime and no tools to prevent them upstream.

Geal commented 1 year ago

I agree, it should be lazy, that's the most commonly expected behaviour

divarvel commented 1 year ago

i have a proof of concept here https://github.com/biscuit-auth/biscuit-rust/pull/188

the proof-of-concept makes them non-strict rather than lazy. The difference is subtle (and not observable from the outside in the absence of side-effects), but still important to keep in mind.