Closed ahelwer closed 1 month ago
Here's an additional ambiguity, involving the `--` infix operator:
---- MODULE Test ----
EXTENDS Integers
a -- b == a - b
op == 1 ----(1, 2)
=====================
The `--` infix operator is used in infix form, followed without spaces by the `--` operator in nonfix form. Both SANY and the tree-sitter grammar parse this as a hyphen line. I propose this behavior be declared correct and remain unchanged.
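As a rough illustration of why a longest-match lexer reads `----` as a hyphen line, here is a minimal Python sketch. The token names and rules are hypothetical and are not taken from SANY or the tree-sitter grammar; the sketch only assumes that a run of four or more hyphens is tried before the two-character `--` operator.

```python
import re

# Hypothetical token rules for illustration only; not SANY's or tree-sitter's.
# Python's regex alternation takes the first alternative that matches, so the
# hyphen-line rule is listed before the two-character `--` operator.
TOKEN_RULES = [
    ("DASH_LINE", r"-{4,}"),      # separator line of four or more hyphens
    ("DEFINED_AS", r"=="),
    ("DASH_DASH", r"--"),         # the user-definable `--` infix operator
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("PUNCT", r"[(),]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_RULES))


def tokenize(text):
    """Return (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# The line from the module above: the four hyphens become one DASH_LINE token.
print(tokenize("op == 1 ----(1, 2)"))
# With a space between the hyphen pairs, the sketch emits two DASH_DASH tokens.
print(tokenize("op == 1 -- --(1, 2)"))
```

In this sketch the lexer never needs to look past the run of hyphens to decide what it is, which is the kind of simplicity the proposal favors.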
This is now complete, since the proposed behavior is simply how everything currently works, and this RFC itself can serve as documentation of these decisions (although I think there should be a more official way to record RFCs, especially completed ones, than GitHub issues).
At this point there are three known ambiguities in the TLA+ grammar, where ambiguity is defined as "syntax requiring unreasonable amounts of lookahead to disambiguate". This proposal hopes to resolve these ambiguities in favor of keeping parsing simple, which has the added benefit of not requiring any changes to the current working of the tree-sitter grammar or SANY beyond possibly improved error messages. Philosophically, we might ask whether we want to add complexity to the language specification by defining these as special cases of general rules, or add complexity to TLA+ parsing by requiring these cases be handled according to a straightforward reading of the language spec.
The ambiguities are as follows:
1. `/\` or `\/` in nonfix form, as in `/\(a, b)` (see https://github.com/tlaplus/tlaplus/issues/637 and https://github.com/tlaplus-community/tree-sitter-tlaplus/issues/4)
2. `(+)`, `(-)`, and `(/)`, which conflict with calling an operator with higher-order parameters as in `f(+)`, `f(-)`, and `f(/)` (see https://github.com/tlaplus/tlaplus/issues/625 and https://github.com/tlaplus-community/tree-sitter-tlaplus/issues/5)
3. `(*` as a valid sequence of characters in the language, as in `f(*)` or `(*(a, b))` (see https://github.com/tlaplus/tlaplus/issues/626 and https://github.com/tlaplus-community/tree-sitter-tlaplus/issues/6)

The proposal to disambiguate these is as follows:
1. Disallow `/\` and `\/` in nonfix form; users can use `\land(a, b)` or `\lor(a, b)` in ASCII implementations of TLA+ if they really wish to use these operators in nonfix form.
2. Always interpret `(+)`, `(-)`, or `(/)` as an infix operator symbol; users can use the `+`, `-`, and `/` operators as higher-order parameters by surrounding them with spaces, as in `f( + )`, `f( - )`, and `f( / )`.
3. Always interpret `(*` as the start of a comment; users can add a space and write `f( * )` and `( *(a, b))`.

The justification for the proposals is as follows:
1. Since `/\` and `\/` are defined in the TLA+ builtins and cannot be redefined in other modules, there is no real use for this feature. Allowing their use in nonfix form would considerably complicate the already-complicated logic for parsing conjunction and disjunction lists.

Accepting this proposal will close all issues linked above.
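
To make the second and third proposals concrete, here is a small Python sketch of a lexer in which `(*` and the parenthesized operators are matched as single tokens before plain parentheses are considered. The token names (`COMMENT_START`, `CIRCLED_OP`, and so on) and the rule set are hypothetical, chosen only for illustration; this is not how SANY or the tree-sitter grammar is actually implemented.

```python
import re

# Hypothetical rules for illustration only. Multi-character tokens come first,
# so `(*` and `(+)` are never split into a parenthesis plus an operator.
TOKEN_RULES = [
    ("COMMENT_START", r"\(\*"),    # `(*` always opens a comment
    ("CIRCLED_OP", r"\([+\-/]\)"), # `(+)`, `(-)`, `(/)` are always infix symbols
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/]"),
    ("PUNCT", r"[(),]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_RULES))


def lex(text):
    """Return (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for source in ["f(+)", "f( + )", "f(*)", "f( * )", "( *(a, b))"]:
    print(f"{source!r:14} -> {lex(source)}")
```

Under these assumed rules, `f(+)` yields an identifier followed by a single `CIRCLED_OP` token, while `f( + )` yields an ordinary parenthesized `+`. Likewise `f(*)` begins a comment, while `f( * )` and `( *(a, b))` do not. No lookahead beyond the token itself is needed in either case.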