Now that biscuit has been deployed more widely, we have a clearer vision of future needs and missing features.
In particular, the datalog expressions have a few limitations that we can't overcome in a purely additive fashion:
boolean logic operators should be lazy #129
the choice of [] for sets is rather unfortunate #135
Both of these changes can be mapped to new datalog primops, so existing blocks would still be evaluated the same way. Only the parser would change. All datalog code that uses these features (||, && and []) would then generate datalog blocks with 5 as version number. Everything else would still generate v3 or v4 blocks.
Since this requires breaking changes in the parser, it is a good opportunity to evaluate other ambitious additions that might also require syntactical breaking changes:
support for arrays and maps with query semantics #135
support for nullability #148
support for type inspection #134
support for higher-order functions and lambdas (eg forAll / forSome on arrays)
support for heterogeneous sets #140
support for heterogeneous equality #130
This is not to say that all these features should ship with the next version, but we should at least consider their impact on existing syntax and semantics to make sure shipping them won't require extra breaking changes.
On the matter of block versions
So far, version bumps have been fairly easy, as they were purely additive: evaluation semantics did not depend on the block version. Rather, the block version was derived from the block contents, and allowed implementations to reject blocks with unknown operations when deserializing protobuf.
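To make the "version derived from contents" idea concrete, here is a minimal sketch. The operation names, the version numbers attached to them, and the helpers `block_version` and `accept_block` are all hypothetical and only illustrate the mechanism; they are not taken from the spec or from any implementation.

```python
# Illustrative values only: the real per-operation minimum versions are
# defined by the biscuit specification and are not reproduced here.
MIN_VERSION = {
    "Equal": 3,
    "And": 3,
    "Or": 3,
    "NotEqual": 4,
}
BASE_VERSION = 3


def block_version(ops):
    """Derive the block version from its contents.

    The result is the lowest version that covers every operation used.
    """
    return max([BASE_VERSION] + [MIN_VERSION[op] for op in ops])


def accept_block(declared_version, max_supported):
    """An implementation rejects blocks newer than the versions it knows."""
    return declared_version <= max_supported
```

With this scheme the version says nothing about how a known operation is evaluated. It only tells an older library whether the block contains something it cannot understand.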
The changes we are considering carry a semantic change that cannot be detected from the block contents alone. They also carry parsing changes, even though in the past we have considered the textual representation out of scope of block versions, since it's more of a concern of interpreting code rather than tokens.
So we need a way to handle the semantics change properly (in descending order of importance):
make sure that once minted, a given block will always be evaluated the same way
maximize compatibility with older library versions by striving to generate blocks with the lowest version possible
avoid codebase complexity by avoiding too much code duplication
A possible solution
Instead of changing the meaning of And and Or operators, introduce new LazyAnd and LazyOr operators with non-strict semantics. This allows us to keep the existing version system (ie choose the lowest version number possible).
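A rough sketch of what keeping strict And / Or next to new LazyAnd / LazyOr could look like. This uses a small tree-walking evaluator written for illustration; the `Lit`, `Fail`, `BinOp` and `evaluate` names are hypothetical and do not reflect how any biscuit implementation encodes or evaluates expressions.

```python
from dataclasses import dataclass
from typing import Union


class EvalError(Exception):
    """Raised when an operand cannot be evaluated."""


@dataclass
class Lit:
    value: bool


@dataclass
class Fail:
    """Stand-in for an operand whose evaluation errors out."""


@dataclass
class BinOp:
    op: str  # "And", "Or", "LazyAnd" or "LazyOr"
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Fail, BinOp]


def evaluate(expr):
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Fail):
        raise EvalError("operand failed")
    left = evaluate(expr.left)
    # New operators: the right operand is only evaluated when needed.
    if expr.op == "LazyAnd":
        return left and evaluate(expr.right)
    if expr.op == "LazyOr":
        return left or evaluate(expr.right)
    # Existing operators keep their strict meaning: both sides are
    # always evaluated, so an error on the right always surfaces.
    right = evaluate(expr.right)
    if expr.op == "And":
        return left and right
    if expr.op == "Or":
        return left or right
    raise EvalError(f"unknown operation: {expr.op}")
```

Here `BinOp("LazyAnd", Lit(False), Fail())` evaluates to `False`, while the same expression with `"And"` raises `EvalError`. A block minted with And keeps behaving exactly as it did before, which is the first requirement listed above.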
So there would not be a breaking change in the datalog evaluation engine, but only in the parser, which is covered by the library version number, not by the token block version number. In addition to having && and || parse as LazyAnd and LazyOr, we could introduce a new textual representation for the strict And / Or (eg &&!). This representation would not necessarily be parseable, but would allow new library versions to disambiguate when printing blocks.
This would satisfy the 3 conditions listed above. A variant would be to not change the meaning of && and ||, but to introduce lazy versions (eg &&~), but I think that's too conservative and would make code harder to read.
Note: ! for strictness and ~ for laziness is a convention lifted from Haskell.
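The parser / printer split described above could be sketched like this. The table contents and the `parse_operator` / `print_operator` helpers are hypothetical; in particular `||!` is only an assumed counterpart of the `&&!` example, and rejecting the strict spellings when parsing is one possible choice, not a decision.

```python
# What a new library version would produce when parsing datalog source.
PARSE_TABLE = {"&&": "LazyAnd", "||": "LazyOr"}

# What a new library version would show when printing a block.
# "||!" is assumed by analogy with "&&!".
PRINT_TABLE = {
    "LazyAnd": "&&",
    "LazyOr": "||",
    "And": "&&!",
    "Or": "||!",
}


def parse_operator(token):
    """Map a source token to an operation; strict spellings are print-only here."""
    if token not in PARSE_TABLE:
        raise ValueError(f"unsupported operator: {token}")
    return PARSE_TABLE[token]


def print_operator(op):
    """Render an operation so that strict and lazy variants stay distinguishable."""
    return PRINT_TABLE[op]
```

An old block containing And would then print as `&&!`, so a reader can tell it apart from freshly written code where `&&` means LazyAnd.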