JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.63k stars 5.48k forks source link

Non-transitive operator precedence #18714

Open martinholters opened 8 years ago

martinholters commented 8 years ago

Ref: https://github.com/JuliaLang/julia/issues/5187#issuecomment-237129706 and the following discussion

I couldn't find a specific issue for this, but it seems worthwhile to have one.

Conceptually, I imagine this to work by having a (non-total) mapping from (ordered) pairs of operators to the set {Left, Right}, which indicates the operator of higher precedence. E.g.

(-, -) -> Left
(^, ^) -> Right

ensures that - is left-associative while ^ is right-associative and

(=, &&) -> Right
(&&, =) -> Right

would allow that a = b && c parses as a = (b && c) while a && b = c parses as a && (b = c). If no mapping is defined for a combination, the result is a parse error, requesting explicit parentheses.

stevengj commented 8 years ago

I think that would just make the precedence rules even more confusing; is there any extant language that has ever defined precedence rules in this way?

martinholters commented 8 years ago

I think that would just make the precedence rules even more confusing

It certainly has the potential to do so when abused. But on the other hand, it would also allow cases that are confusing at the moment to be undefined and require explicit parentheses. Then, instead of unexpected behavior, one gets an error right away. Working out a set of rules that are generally perceived as least confusing would certainly be no easy feat, though.

stevengj commented 8 years ago

The problem with implementing complicated DWIM parser rules is that no one can remember exactly what they are, so you end up having to add explicit parentheses anyway if you want to be sure of what your code is doing.

jiahao commented 8 years ago

Relevant LtU thread on alternatives to operator precedence rankings.

Two language design choices of note:

  1. As mentioned in the issue linked, Fortress does not have absolute operator precedence, only relative ones. There is no total order on operator precedence. This reference (SpringerLink): doi:10.1007/978-0-387-09766-4_190 provides the following examples (p. 732, d.p. 88):

    a + b > c + d ⊛ Correct: + has higher precedence than >
    p > q ∧ r > s ⊛ Correct: > has higher precedence than ∧
    w + x ∧ y + z ⊛ Incorrect: no defined precedence between + and ∧
    (w + x)∧(y + z) ⊛ Correct: parentheses specify desired grouping
    w + (x ∧ y) + z ⊛ Correct: parentheses specify desired grouping

    Personally, I find the combination of relative operator precedence and whitespace sensitivity strange, albeit well-intentioned. To reproduce the example given in this LtU comment:

    • OK: 2+3*5, 2 + 3 * 5, 2 + 3*5
    • Compile error: 2+3 * 5
  2. APL and its descendent languages (J, K, etc.) do not have any operator precedence whatsoever. Evaluation is always from far right to left for right hand side arguments (the ⍵ argument), and to near left for left hand side arguments (the ⍺ argument). The disadvantage of this rule is that it can be hard to read all the way to the rightmost expression, like in this implementation of Quicksort:

    Q←{1≥≢⍵:⍵ ⋄ S←{⍺⌿⍨⍺ ⍺⍺ ⍵} ⋄ ⍵((∇<S)⍪=S⍪(∇>S))⍵⌷⍨?≢⍵}
StefanKarpinski commented 8 years ago

There are two separate issues to be discussed here.

  1. Only allowing a subset of the precedence differences to be used without explicit parentheses. The precedence between + and * are well-established and can be trusted to be used implicitly. The precedence between < and & are notoriously problematic – people write a < b & c < d and are surprised when it parses as a < (b & c) < d instead of (a < b) & (c < d). We could make the version without any parens a warning or an error.
  2. Having non-linearizable precedence: allow cases where a op1 b op2 c parses as (a op1 b) op2 c and a op2 b op3 c parses as (a op2 b) op3 c but a op1 b op3 c parses as a op1 (b op3 c).

I think the former is a good idea since it would immediately prevent common bugs and errors. The latter seems far more questionable and a really convincing case would need to be made for it.

ararslan commented 8 years ago

A weak +1 to Stefan's first point (weak because it's a good idea and could encourage better practices but it may be annoying for people who know exactly what they're doing), and a strong -1 to the second. I think that has the potential to cause far more confusion than it would solve.

musm commented 8 years ago

There may be important theoretical arguments against non-transitive operator precedence, but IMO I find it unwieldy to have to write

issome(x) && (return x)
iscond(x) && (y = x)
anothercond(x) && (return throw())

I feel like that should just work, since it's so unambiguous, regardless or raised theoretical arguments against.

ararslan commented 8 years ago

You don't need parentheses around return like that (also you shouldn't return throw, just throw, which also doesn't need parentheses). I don't see an issue with parentheses around assignment in that location though, especially since it's clearer to write assignments in other ways.

martinholters commented 8 years ago

I see that the proposed mechanism might be badly abused. I'm with @StefanKarpinski in https://github.com/JuliaLang/julia/issues/18714#issuecomment-250201041 here, making ambiguous cases an error is IMHO beneficial, adding fancier rules should be done with great care, if at all. Allowing

has_no_proper_value(x) && x = default_value

looks relatively innocent, but even here, I wouldn't rush things. I just thought that if precedence undergoes a revision, we might just as well do something which would be very flexible, even if we don't exploit all that flexibility immediately.

BTW, why does // have higher precedence than / and *? I guess there is a good reason, but 1//1//2 != 1/1//2 seems a little weird. Maybe this could be better solved with an advanced precedence scheme?

stevengj commented 8 years ago

I think the intention with // is that p//q would be treated much like a numeric literal.

JeffBezanson commented 8 years ago

I think the && idiom might be better handled as

has_no_proper_value(x) ? x = default_value

which already has the right precedence for that. There is an issue for this somewhere.

ararslan commented 8 years ago

@JeffBezanson The syntax you mentioned would require a change to how ternaries are parsed, wouldn't it? Since it's a ternary with the else portion dropped? Re issue, I think that's #6823.

andyferris commented 8 years ago

Yes the if ... then ... one liner would remove some confusion with the common uses of &&. Turning the ternery into a binary sounds cool.