biscuit-auth / biscuit

delegated, decentralized, capabilities based authorization token
Apache License 2.0

Datalog breaking changes #147

Open divarvel opened 1 year ago

divarvel commented 1 year ago

Now that biscuit has been deployed more widely, we have a clearer vision of future needs and missing features.

In particular, the datalog expressions have a few limitations that we can't overcome in a purely additive fashion:

Both of these changes can be mapped to new datalog primops, so existing blocks would still be evaluated the same way. The only thing that would change is the parser. All datalog code that uses these features (||, && and []) would then generate datalog blocks with version number 5. Everything else would still generate v3 or v4 blocks.

Since this requires breaking changes in the parser, it is a good opportunity to evaluate other ambitious additions that might also require syntactical breaking changes:

This is not to say that all these features should ship with the next version, but we should at least consider their impact on existing syntax and semantics to make sure shipping them won't require extra breaking changes.

On the matter of block versions

So far, version bumps have been fairly easy, as they were purely additive: evaluation semantics did not depend on the block version. Rather, the block version was derived from the block contents, and allowed implementations to reject blocks with unknown operations when deserializing protobuf.
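As a rough illustration of "the block version is derived from the block contents", here is a minimal sketch in Python. The operation names, the version each one maps to, and the helper names are all assumptions made for the example; they are not taken from the biscuit specification or from any biscuit library.

```python
# Hypothetical table: the first block version in which each operation exists.
# Names and numbers are illustrative assumptions, not the real biscuit schema.
MIN_VERSION_FOR_OP = {
    "Add": 3,
    "And": 3,
    "Or": 3,
    "SomeV4Op": 4,   # placeholder for any operation added in v4
    "LazyAnd": 5,    # proposed in this issue
    "LazyOr": 5,     # proposed in this issue
}
BASE_VERSION = 3


def derive_block_version(ops):
    """Return the lowest version that can represent every operation in the block."""
    return max((MIN_VERSION_FOR_OP[op] for op in ops), default=BASE_VERSION)


def accepts_block(declared_version, max_supported_version):
    """An implementation rejects blocks newer than what it knows how to evaluate."""
    return declared_version <= max_supported_version


old_block = ["Add", "And"]
new_block = ["Add", "LazyAnd"]
print(derive_block_version(old_block))  # 3
print(derive_block_version(new_block))  # 5
print(accepts_block(derive_block_version(new_block), max_supported_version=4))  # False
```

The point of the sketch is that the version is an output of serialization, not an input to evaluation: a block only gets a higher number when it actually contains a newer operation.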

The changes we are considering carry a semantic change that cannot be detected from the contents themselves. They also carry parsing changes, although in the past we have considered the textual representation out of scope for block versions, since it is more a concern of the code that interprets tokens than of the tokens themselves.

So we need a way to handle the semantics change properly (in descending order of importance):

A possible solution

Instead of changing the meaning of the And and Or operators, introduce new LazyAnd and LazyOr operators with non-strict semantics. This allows us to keep the existing version system (i.e. choose the lowest version number possible). There would then be no breaking change in the datalog evaluation engine, only in the parser, which is covered by the library version number, not by the token block version number. In addition to having && and || parse as LazyAnd and LazyOr, we could introduce a new textual representation for the strict And / Or (e.g. &&!). This representation would not necessarily be parseable, but would allow new library versions to disambiguate.
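To make the difference between the strict and non-strict operators concrete, here is a small sketch of an evaluator in Python. It uses a tree of expressions for readability, which is an assumption of the example and not how the biscuit engine represents expressions; the `Lit`, `Fail` and `Binary` types and the `evaluate` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


class EvalError(Exception):
    """Raised when a sub-expression cannot be evaluated."""


@dataclass
class Lit:
    value: bool


@dataclass
class Fail:
    """Stands in for any operand whose evaluation errors out."""
    message: str


@dataclass
class Binary:
    op: str  # "And", "Or", "LazyAnd" or "LazyOr"
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Fail, Binary]


def evaluate(expr):
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Fail):
        raise EvalError(expr.message)
    left = evaluate(expr.left)
    # Non-strict operators: the right operand is only evaluated when needed.
    if expr.op == "LazyAnd":
        return left and evaluate(expr.right)
    if expr.op == "LazyOr":
        return left or evaluate(expr.right)
    # Strict operators: both operands are always evaluated first.
    right = evaluate(expr.right)
    if expr.op == "And":
        return left and right
    if expr.op == "Or":
        return left or right
    raise EvalError(f"unknown operator {expr.op}")


print(evaluate(Binary("LazyAnd", Lit(False), Fail("boom"))))  # False
try:
    evaluate(Binary("And", Lit(False), Fail("boom")))
except EvalError as err:
    print("strict And failed:", err)  # strict And failed: boom
```

Because And and Or keep their existing behaviour, blocks that already contain them evaluate exactly as before; only blocks that contain the new operators need a newer version number.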

This would satisfy the 3 conditions listed above. A variant would be to not change the meaning of && and ||, but to introduce lazy versions (eg &&~), but I think that's too conservative and would make code harder to read.
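The textual side of the proposal could look roughly like the following sketch: the bare symbols parse to the lazy operators, while the strict operators only get a printable form. The `||!` spelling is extrapolated from the `&&!` example and is an assumption, as are the function names.

```python
# How a new library version might print each operator.
# "&&!" comes from the proposal above; "||!" is an assumed counterpart.
PRINTED_AS = {
    "LazyAnd": "&&",
    "LazyOr": "||",
    "And": "&&!",
    "Or": "||!",
}

# What the new parser accepts. The strict spellings are deliberately absent,
# since the proposal says they would not necessarily be parseable.
PARSED_AS = {
    "&&": "LazyAnd",
    "||": "LazyOr",
}


def render(op, left, right):
    """Print a binary expression so that strict and lazy operators stay distinguishable."""
    return f"{left} {PRINTED_AS[op]} {right}"


def parse_operator(symbol):
    """Map a textual operator to its operation, rejecting print-only spellings."""
    if symbol not in PARSED_AS:
        raise ValueError(f"operator {symbol!r} is not accepted by the parser")
    return PARSED_AS[symbol]


print(render("And", "$a", "$b"))      # $a &&! $b   (block written by an older library)
print(render("LazyAnd", "$a", "$b"))  # $a && $b    (block written by a newer library)
print(parse_operator("&&"))           # LazyAnd
```

With this split, a block produced by an older library still prints unambiguously, while newly written code can only produce the lazy operators.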

Note: ! for strictness and ~ for laziness is a convention lifted from Haskell.