Open runarorama opened 2 years ago
One thing that's super cool about Unison is that we could implement 1 for now and swap it out for 2 in the future. This is purely a syntactic construct only relevant to the parser and pretty-printer; the underlying AST is what will be stored so we don't have to worry about changing the meaning of programs.
Option 1 could potentially lead to people choosing bizarre names just to get the precedence that they want. But I think that it's actually a little less overwhelming to both library authors and consumers. Library authors aren't faced with an upfront "hmm what's the right precedence value?". And library consumers probably get a feel for the precedence rules that's consistent across libraries (even if they don't consciously realize).
My vote would probably be to go for 1 for now and reconsider 2 in the future if 1 feels like it's lacking.
Here, @atacratic brings up an interesting point about the potential downside of the pretty-printer having expert-level knowledge of the precedence rules. A simple option would be for the pretty-printer to always add parentheses for binary ops, but I don't know; this might look really bad in practice. One of these days structural viewing/editing might help :)
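To get a feel for how "always add parentheses" would read, here is a minimal sketch. It assumes a toy AST of nested `(op, left, right)` tuples with string leaves; the `show` function is hypothetical and is not how Unison's pretty-printer works.

```python
def show(expr, top=True):
    """Render a toy AST, parenthesizing every nested binary application.

    Assumption: leaves are strings, and a binary application is a tuple
    (op, left, right). Only the outermost expression is left bare.
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    text = f"{show(left, False)} {op} {show(right, False)}"
    return text if top else f"({text})"


mixed = show(("+", "x", ("*", "y", "z")))
chain = show(("+", ("+", ("+", "a", "b"), "c"), "d"))
```

Here `mixed` renders unambiguously, while `chain` shows the cost: even a plain sum of four terms picks up two layers of parentheses.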
Alternative proposal: just create special rules for `+`, `-`, `*`, and `/` to avoid issues with basic arithmetic (which newcomers often hit early on and makes Unison look silly). And come up with a fancier solution later if we feel the need.
Another option: `x + y * z` requires parens. `x + y + z` or `foo |> blah |> blah2` are fine and parse left associative. Basically, you don't need there to be a total order on all operators.
This could be combined with option 1, so the standard arithmetic and boolean operators could have expected precedence and could be mixed, but mixing other operations requires parens.
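A minimal sketch of that combined rule, assuming the parser has already collected the operators of one unparenthesized chain. The `MIXABLE` set and the `needs_parens` helper are hypothetical names for illustration, and which operators count as "standard" is an assumption here.

```python
# Assumption: the operators that keep conventional precedence and may be mixed.
MIXABLE = frozenset({"+", "-", "*", "/", "&&", "||", "==", "!="})


def needs_parens(operators, mixable=MIXABLE):
    """Return True if a chain of operators must be disambiguated with parens.

    A chain is fine when it repeats a single operator (parsed left
    associative) or when every operator is in the mixable set.
    """
    distinct = set(operators)
    return len(distinct) > 1 and not distinct <= mixable


arithmetic_ok = needs_parens(["+", "*"])          # x + y * z
pipeline_ok = needs_parens(["|>", "|>"])          # foo |> blah |> blah2
mixed_custom = needs_parens(["|>", "+"])          # foo |> bar + baz
strict_variant = needs_parens(["+", "*"], mixable=frozenset())
```

Passing an empty `mixable` set gives the stricter variant in which even `x + y * z` requires parens.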
I definitely think Unison shouldn’t do the same as Haskell does – a fixed set of levels, with operators hardcoded at various points. I’ve personally felt a lot of pain with it.
Another way to approach it – relative precedence (this is not my idea, but I can’t recall where I’ve come across it). You can define the relationship between any two operators:
(^) `hasHigherPrecedenceThan` (*)
(*) `hasHigherPrecedenceThan` (+)
(-) `hasSamePrecedenceAs` (+)
(.) `hasHigherPrecedenceThan` (<=<)
(||) `hasSamePrecedenceAs` (&&)
(==) `hasHigherPrecedenceThan` (&&)
(!=) `hasHigherPrecedenceThan` (&&)
(+) `hasHigherPrecedenceThan` (==)
(+) `hasHigherPrecedenceThan` (!=)
which forms a partial order
graph RL;
A[+ / -]--> B[*];
B --> C[^];
D[<=<] --> E[.];
F["|| / &&"] --> G[==];
F --> H[!=];
G --> A;
H --> A;
This implies that `x + y - z` is fine, but `x == y != z` needs parens. A library that defines a new operator can do
(><) `hasHigherPrecedenceThan` (+)
(*) `hasHigherPrecedenceThan` (><)
resulting in it being injected at the right place without invalidating any other precedences
graph RL;
A[+ / -]--> B[><];
B--> C[*];
Any precedence rule that introduces a cycle given the rules that already exist in its dependencies is an error (this could cause an issue when you pull in two dependencies that have conflicting rules).
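The relative-precedence idea can be sketched as a small graph structure. This is only an illustration under stated assumptions: the `PrecedenceTable` class and its method names are hypothetical, rules are checked one at a time as they are declared, and a rule that would create a cycle is rejected with an error.

```python
from collections import defaultdict


class PrecedenceTable:
    """Hypothetical table of relative operator precedences (a partial order)."""

    def __init__(self):
        self._tighter = defaultdict(set)  # op -> ops it binds tighter than
        self._same = defaultdict(set)     # op -> ops declared at the same level

    def _reach(self, start):
        """Return (same, lower): ops level with `start`, and ops strictly below."""
        seen = {(start, False)}
        stack = [(start, False)]
        while stack:
            op, strict = stack.pop()
            steps = [(n, strict) for n in self._same[op]]
            steps += [(n, True) for n in self._tighter[op]]
            for state in steps:
                if state not in seen:
                    seen.add(state)
                    stack.append(state)
        same = {op for op, strict in seen if not strict}
        lower = {op for op, strict in seen if strict}
        return same, lower

    def higher(self, a, b):
        """Declare that `a` has higher precedence than `b`."""
        same, lower = self._reach(b)
        if a in same or a in lower:
            raise ValueError(f"({a}) higher than ({b}) would introduce a cycle")
        self._tighter[a].add(b)

    def same(self, a, b):
        """Declare that `a` and `b` share a precedence level."""
        if b in self._reach(a)[1] or a in self._reach(b)[1]:
            raise ValueError(f"({a}) same as ({b}) would introduce a cycle")
        self._same[a].add(b)
        self._same[b].add(a)

    def compare(self, a, b):
        """Return 'higher', 'lower', 'same', or None when parens are required."""
        same_a, lower_a = self._reach(a)
        if b in lower_a:
            return "higher"
        if a in self._reach(b)[1]:
            return "lower"
        if b in same_a:
            return "same"
        return None


# The rules from the example above, followed by the library's (><) rules.
table = PrecedenceTable()
table.higher("^", "*")
table.higher("*", "+")
table.same("-", "+")
table.higher(".", "<=<")
table.same("||", "&&")
table.higher("==", "&&")
table.higher("!=", "&&")
table.higher("+", "==")
table.higher("+", "!=")
table.higher("><", "+")
table.higher("*", "><")
```

With these rules, `table.compare("+", "-")` reports the same level, `table.compare("==", "!=")` returns `None` (parens required), and declaring `table.higher("+", "*")` raises because it would close a cycle.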
Another option:
3. It's a parse error to mix operators in the same expression without parens. So `x + y * z` requires parens. `x + y + z` or `foo |> blah |> blah2` are fine and parse left associative.
I think there are some interesting options in this vein.

(the `+` and `|>` examples above) – just allowing any `a -> a -> a` operation to be treated associatively (and parse left-associative) can be misleading. With `(↑) : a -> Vect a -> Vect a`, from the type you know that `x ↑ y ↑ z` will only typecheck if it’s grouped `x ↑ (y ↑ z)`[^1][^2]. `data.List.+` is the same. But type variables can make this harder – Haskell’s `(^^) :: (Fractional a, Integral b) => a -> b -> a` seems like it should group `(x ^^ y) ^^ z`, but it actually groups `x ^^ (y ^^ z)` and it only chains if the latter `^^`’s `a` is `Integral` and is thus the former `^^`’s `b`.

[^1]: I’ve done something like this before by using parse forests – if the typechecker failed, we tried other possible parses before giving up. It was neither performant nor comprehensible to users (but maybe some `edit.debug` command that made all grouping explicit – i.e., moar parens)

[^2]: Although a linter might tell you to use `(x * y) ↑ z` instead 😁
Any precedence rule that introduces a cycle given the rules that already exist in its dependencies is an error (this could cause an issue when you pull in two dependencies that have conflicting rules).
Alternatively, cycles could just invalidate the precedences in the cycle, resulting in a warning and disjoint precedence (requiring parens) between those operators.
Disjoint precedence could also be made first class, so that
(==) `hasHigherPrecedenceThan` (&&)
(!=) `hasHigherPrecedenceThan` (&&)
(+) `hasHigherPrecedenceThan` (==)
(+) `hasHigherPrecedenceThan` (!=)
from the earlier example could instead be
(==) `hasDisjointPrecedenceFrom` (!=)
(==) `hasHigherPrecedenceThan` (&&)
(+) `hasHigherPrecedenceThan` (==)
so that `==` and `!=` are at the same level, but can’t be mixed without parens.
I'm assuming (almost) all the confusion arises out of simple arithmetic operators: `1.0 + 2.0 / 3.0` being evaluated from L to R rather than according to PEMDAS.
So if anything, it seems like there's a lot more value to implementing #1 where the arithmetic operators (plus `^` if it's aliased to `Math.pow` and mayyyyybe) than doing #2. How often would someone be confused that a `^-!&&->~` on the left is not evaluated before `++--!+` on the right because L to R evaluation seemed wrong?
My preference: hard-coded PEMDAS (not quite what Scala does regarding prefix/suffix... just a fixed list of symbols), everything else is left to right.
This allows correct math rules, which everyone knows and is used to, and no surprises when reading unfamiliar code.
To mitigate my previous comments a bit – I’m in favor of the “hard-code PEMDAS, leave the remaining behavior as-is” approach. Would definitely be an immediate improvement.
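A rough sketch of what "hard-code PEMDAS, everything else left to right" could look like as a precedence-climbing parser. The `FIXED` table and `parse` function are hypothetical, parentheses are not handled, and how the non-arithmetic operators rank relative to arithmetic is an assumption: here they all share one lowest level, which the thread does not specify.

```python
# Assumption: a fixed list of arithmetic symbols; every other operator gets level 0.
FIXED = {"*": 2, "/": 2, "+": 1, "-": 1}


def prec(op):
    """Precedence level of an operator; unknown operators share the lowest level."""
    return FIXED.get(op, 0)


def parse(tokens):
    """Parse a flat token list (operand, operator, operand, ...) into nested tuples."""
    pos = 0

    def climb(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and prec(tokens[pos]) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly higher level on the right keeps every
            # level left-associative.
            right = climb(prec(op) + 1)
            left = (op, left, right)
        return left

    return climb(0)


arith = parse("1.0 + 2.0 / 3.0".split())
minus = parse("a - b - c".split())
pipes = parse("xs |> f |> g".split())
```

Under these assumptions `1.0 + 2.0 / 3.0` groups the division first, while chains such as `a - b - c` and `xs |> f |> g` still group to the left as they do today.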
The fact that all binary operators associate to the left is a source of surprise for programmers (and probably bugs).
We should do something about this. Options include:
The latter is of course much more robust, but the former is by far the easier to implement.