nim-lang / RFCs

A repository for your Nim proposals.

Make assignment/arrow operators right associative, change precedence of -> #467

Open metagn opened 1 year ago

metagn commented 1 year ago

Abstract

Make the assignment and arrow-like operators, as described in the manual, right associative. To make this work, also increase the precedence of ->, so that it is no longer an "arrow operator".

Motivation

Right associativity is pretty self explanatory. Left hand sides of assignment operators are meant to be addresses, not other assignment operator expressions. This is why in almost every language, assignment operators are right associative. But it's not that noticeable in Nim since most assignment operators are statements. For arrow operators, a => b => c evaluates to (a => b) => c which is also meaningless, hence lambda operators are also right associative in most languages.

Arrow operators, which the manual lists as having a lower precedence than assignment operators, have actually had equal precedence with assignment operators since their implementation, maybe as a mistake. This means expressions like a => b += c parse as (a => b) += c, which is also meaningless. However, lowering arrow operators' precedence is not needed to alleviate this; making these operators right associative would do the same thing.

Arrow operators were, however, originally right associative. This was changed in order to make a -> b => c parse as (a -> b) => c. There is an alternative fix here: increase the precedence of ->. It doesn't really make much sense either for ->, which has simple type operands, to have the same precedence as =>, which has a general statement operand on the right hand side.
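To see how the parser groups a chain of two infix operators, a small compile-time check can help. The following is only a sketch: grouping is a hypothetical helper written for this illustration, not part of the standard library or of this proposal. It takes an untyped expression, so the identifiers a, b and c do not need to exist.

```nim
import std/macros

# Hypothetical helper: reports whether a two-operator infix expression
# was parsed with the nested infix node on the left or on the right.
macro grouping(e: untyped): untyped =
  e.expectKind nnkInfix
  if e[1].kind == nnkInfix:
    result = newLit("left")    # parsed as (x op y) op z
  elif e[2].kind == nnkInfix:
    result = newLit("right")   # parsed as x op (y op z)
  else:
    result = newLit("flat")    # only one operator present

echo grouping(a - b - c)    # ordinary operators group to the left
echo grouping(a ^ b ^ c)    # operators starting with ^ group to the right
echo grouping(a -> b => c)  # the nested node is on the left: (a -> b) => c
echo grouping(a => b => c)  # result depends on the compiler's arrow associativity
```

The last line is the case this RFC is about: its result tells you which associativity the compiler in use applies to arrow operators.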

-> also being right associative is intuitive, since a -> b -> c meaning a -> (b -> c) is a common pattern in functional languages and the left hand side usually has parentheses anyway due to multiple parameters.

Description

-> in some other languages, aside from the function type meaning, occasionally can also mean "dictionary entry", "pairing", or "range" (sometimes => is this way and -> is like Nim's =>). This generally has a higher precedence.

The precedence for -> should probably be above 1 (assignment). Beyond that, it's a matter of whether or not stuff like int or float -> string should mean (int or float) -> string or int or (float -> string). I think precedence level 2 is best; to me, -> seems semantically closer to the operators at that level than to the others.
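The precedence question can be checked the same way, by asking which operator ends up at the root of the parse tree. This is a minimal sketch; topOp is a hypothetical helper name introduced here, and the operands are never type-checked because the parameter is untyped.

```nim
import std/macros

# Hypothetical helper: returns the operator at the root of an infix
# expression, i.e. the one that binds loosest in that expression.
macro topOp(e: untyped): untyped =
  e.expectKind nnkInfix
  result = newLit(e[0].strVal)

# `or` binds tighter than `->`, so `->` is at the root:
# (int or float) -> string
echo topOp(int or float -> string)

# `*` binds tighter than `+`, so `+` is at the root.
echo topOp(a + b * c)

# Depends on the compiler: "+=" if arrows and assignments share a
# precedence and associate to the left, "=>" under this proposal.
echo topOp(a => b += c)
```

As long as -> stays below the precedence of or, the first expression keeps meaning (int or float) -> string, which is also what it means with -> at level 2.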

So far specifically -> has been mentioned but it would be simplest to make all arrow operators starting with - have the higher precedence. I haven't seen such operators used too often but I'm guessing they generally wouldn't lose from the precedence change. There is also the ~> variant but again, I haven't seen its uses.

There is also the idea of arrows and assignments going left like <= and =+ which would have inverted associativity but this just adds another way to do the same thing.

Not sure if there is any particular reason for assignments not being right associative. I can see it just being due to simplicity of the associativity calculation, since expression assignments are not too common. The biggest downside of the RFC is the performance implications of the extra associativity computations plus the separate checks for -> arrows vs other arrows.

Code Examples

import sugar
template `:=`(a, b): untyped =
  a = b
  a

var foo: int -> int -> string
discard foo := (a: int) => (b: int) -> string => $(a + b)
assert foo(2)(3) == "5"

var count = 1
var lastIncr = 0
let counter = () => count += lastIncr := 1
counter()
assert count == 2
assert lastIncr == 1
counter()
assert count == 3
assert lastIncr == 1
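For comparison, here is a sketch of how a curried lambda can be written with explicit parentheses, which does not rely on the proposed associativity. The name addThenShow is made up for this illustration, and the return types are left to inference rather than spelled out with ->.

```nim
import std/sugar

# Explicit parentheses spell out the right-nested grouping by hand:
# the outer lambda returns the inner lambda, which captures `a`.
let addThenShow = (a: int) => ((b: int) => $(a + b))

doAssert addThenShow(2)(3) == "5"
```

Under the proposal, the inner pair of parentheses around the second lambda would become optional.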

Backwards Compatibility

Since there is basically no genuine use of the involved operators without this behavior, most code uses parentheses here and isn't affected. I could be wrong. New code relying on it, though, will not work with older versions.

Araq commented 1 year ago

IMO right associativity is always unintuitive since we read text from left to right. And it's true for all the examples you bring up. I understand that a -> b -> c parsing as a -> (b -> c) is what FP languages do and why they do it, but it's subtle for newcomers and ultimately might indicate poor syntax to begin with.

We should look into removing more right associativity, not embrace it.

mratsim commented 1 year ago

If you had asked me when I discovered Nim 5 years ago, I would have said yes. It was one of my main issues, and the often-listed solution, to use a ^ prefix, was really quite cumbersome, especially on European keyboards, fiddling with dead keys and whatnot. (I was trying to chain FP transformations: https://github.com/mratsim/nim-project-euler/blob/master/src/pe002_even_fibonacci_numbers.nim#L15-L16)

Now I'm not too sure due to potential breakage.

> IMO right associativity is always unintuitive since we read text from left to right.

Devs read intuitively foo(bar(baz(x)))

Araq commented 1 year ago

> Devs read intuitively foo(bar(baz(x)))

Good point but actually even that syntax is suboptimal. First x is evaluated then baz(x) etc. So a non-backwards syntax would be x -> baz -> bar -> foo. (Which is (((x -> baz) -> bar) -> foo) so this arrow in particular is not right associative.)

This might also be the reason why people prefer to split up "deeply nested" expressions. -- Split up into what exactly? Into "statements". What is the difference between these two? Statements are actually run "in order".

metagn commented 1 year ago

> it's subtle for newcomers

I've been looking up operator precedences from the manual since day 1 of using Nim (really any language). By this logic, precedence is bad too, because past PEMDAS, which kids memorize in primary school, it's arbitrary.

The difference is that the parser validates people writing a and b or c and punishes people who want to write chained => by forcing them to use parentheses; the case where parentheses aren't used is basically always invalid.

There are actually many "implicit" operators that the parser does not treat as such, but that the human eye can be tricked into reading as operators. For example:

let foo = proc (a: int): auto = proc (b: int): int = a + b

Despite there being only 1 instance of an operator call in this code (+), 2 other "fake" operators are used: : and =, both having multiple meanings (respectively argument type and return type (->), variable assignment and proc body). To a clueless person, in this example, : (meaning ->) has a higher precedence than =, and = is right associative. If you are using these statements/expressions, you are agreeing to the intuition of a right associative = operator.

If that's not a good argument for associativity, then it should at least be a good argument for ->'s precedence. I made this a joint proposal because I didn't want to make the associativity proposal depend on another proposal which is the precedence proposal. But I don't think the precedence proposal depends on anything, and can be implemented by itself. The issue is the same argument applies to it: The parser and thus the grammar gains "complexity" because now there are 2 arrow-like operators.

Sidenote:

> x -> baz -> bar -> foo

This is also an operator taken from functional languages, |> (not that this isn't obvious knowledge, just pointing it out). In Nim this is implemented in the core language as the . operator. The . operator having properties and function names on the right hand side is no different from assignments having the addresses on the left hand side: it justifies their associativity. If you don't like that it's an operator rather than a list of statements, you can add a sugar.chain which is sugar for a.chain(b, c, d) => a.with(b).with(c).with(d).
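As a small illustration of the point about the . operator, the nested call and the left-to-right chain below are the same computation. The procs reuse the placeholder names foo, bar and baz from the discussion above and are defined here only for the example.

```nim
# Placeholder procs, named after the ones used in the discussion.
proc baz(x: int): int = x + 1
proc bar(x: int): int = x * 2
proc foo(x: int): string = $x

let x = 3

# Nested calls read inside-out; method call syntax reads left to right.
let nested = foo(bar(baz(x)))
let chained = x.baz.bar.foo   # groups as ((x.baz).bar).foo

doAssert nested == chained
echo chained
```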

Araq commented 1 year ago

I don't understand your way of arguing; the examples you bring up are indeed all questionable style that would be better off using () for grouping.

And yes, exactly. The current rules are already a bit too complex for my taste and it feels like you're making them worse. But even if not, and we agreed that these changes improve the situation objectively, there is also the argument that it breaks backwards compatibility. And while you can reasonably argue that no code in reality will be affected, in the past that is exactly what happened for other very comparable changes.

However, you can implement your preferred solution in a PR and then we can see how many important packages would break. As a starting point.