crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.22k stars 1.61k forks source link

Parser failure on argument list with trailing comma #14616

Open straight-shoota opened 1 month ago

straight-shoota commented 1 month ago

This code parses correctly:

bar(1,
)

The following do not, with different errors (the comments are just for illustration, they are not substantial to reproduce the error).

bar(1
, # Error: expecting token ')', not ','
)
bar(,
)# Error: unterminated call

Extracted from the formatter bug issue #14609.

straight-shoota commented 1 month ago

The first error (foo(1\n,\n)) was actually introduced explicitly in #6514. I don't think I agree on the reasoning. The syntax is unambiguous and easy to parse. I don't see a good reason to reject it. Of course it's not well formed, but that's what we have the formatter to take care of.

straight-shoota commented 1 month ago

The second error can be simplified to (foo(,)), so the newline is insignificant. It needs some clarification on if and how to fix it. I've extracted that to #14629.

zw963 commented 1 month ago

@straight-shoota , this issue not planned in the next release?

Anyway, this formatter issue will cause syntax error, it should be fix ASAP.

straight-shoota commented 1 month ago

6514 made this a parser error. Before that, bar(1\n,) was considered valid code. The discussion was initiated by a bug in the formatter which couldn't format this.

That prompted an investigation into the syntax of newlines before a comma (aka leading comma). The result was that some languages (including Ruby) don't allow that, while others do. So it was seen as a matter of preference, and rather felt ugly.

Another argument was that this wouldn't work at all for call notation without parenthesis and it would be preferred if both styles are more similar. And another point was that it simplifies the syntax of the language a little bit.

Finally, other syntax features with commas such as array literals didn't allow this style either.

The PR was very well received, although there were also two downvotes. Yet no critical comment was posted.

straight-shoota commented 1 month ago

I wouldn't mind going the other way an enable this feature. The syntax is completely unambiquous and the difference is only about the for of white space, which usually shouldn't have much influence on the meaning of a program. As long as it's consistent between array literals and call arguments, it could as well be allowed. Stylistic preference would be enacted by the formatter, which can easily turn leading commas into trailing commas. IMO the argument about non-parenthesized args doesn't have much weight because it's a different style and limited in other ways as well.

However, I'm also fine with keeping it as is. There hasn't been any complaint about this until now, and this only came from a misbehaviour of the formatter which introduced this syntax. We need to fix that formatter bug, of course.

zw963 commented 1 month ago

In fact, i am okay with both case, the minimum change is preferred? consistent usage is always better.