nim-lang / RFCs

A repository for your Nim proposals.

Commas for variable sections #320

Closed metagn closed 3 years ago

metagn commented 3 years ago

I know this is a pretty obvious feature to ask for but I don't know of any discussion on it. It's also very late to add this, and it would make sense that it doesn't click with a lot of people who are used to Nim. I'd still like to know what people think.

Problem: Indented let/var/const sections require extra vertical whitespace for the initial keyword, which is really ugly when you only need 2 or so variables. Using the keyword on each line for separate statements is also ugly because we are already conditioned to look for indented versions, and because it becomes noise in a language where for can be the first keyword of a statement.

Proposal: Allow a comma as a separator for variable declarations.

let
  a = x
  b = y

# turns into

let a = x, b = y

let
  c, d = z
  e = t

# turns into

let c, d = z, e = t # possibly slightly ugly

# maybe support the following too?

let
  a = x, b = y
  c = z

# or at least this:

let a = x,
  b = y

I'll admit it looks a little stupid here but when you compare the existing alternatives they're not great. Real example from sugar module:

var
  params = @[ident"auto"]
  name = newEmptyNode()
  kind = nnkLambda
  pragma = newEmptyNode()
  p = p

could turn into:

var params = @[ident"auto"], name = newEmptyNode(),
  kind = nnkLambda, pragma = newEmptyNode(), p = p

# or

var
  params = @[ident"auto"], name = newEmptyNode(),
  kind = nnkLambda, pragma = newEmptyNode(), p = p

# or

var
  params = @[ident"auto"], name = newEmptyNode()
  kind = nnkLambda, pragma = newEmptyNode()
  p = p

The point is you have the option to be clearer or not. import and enum similarly support allowing both commas and whitespace as separators, and routine arguments similarly support identdefs separated with commas.

Backwards compatibility: This feature would break existing code like the following, if such code were valid:

let a = foo 1, 2 # meaning foo(1, 2)

but this already fails to parse.

disruptek commented 3 years ago

I don't care how hard it is for the compiler to parse code.

What I care about is how easy it is for me to parse code.

I find var and let sections easier to parse when the LHS of the assignment shares the same column.

I've taken to following the single-assignment-per-section style in many cases, simply because that's the style we use in the compiler, but I don't prefer it.

To make a change this significant, I think you have to solve a real problem. What is that problem?

Araq commented 3 years ago

I prefer to repeat the let, var keywords and think the "section" aspect is a silly unnecessary "consistency". I do like const sections though.

bluenote10 commented 3 years ago

Personally, I often have problems reading this style in JavaScript. My brain often blanks out the first variable due to the different indentation. It also drives me mad in JavaScript when people butcher perfectly symmetric syntax into:

let neighborNext = getNeighbor(some, common, args, "next"),
  neighborPrev = getNeighbor(some, common, args, "prev")

Human perception can spot commonalities and differences faster when syntactically similar lines are aligned.

Also having the trailing comma only on one line makes it more awkward (= more editing needed) to refactor the order of these two lines.

For these reasons I personally prefer that Nim is actually enforcing aligned variable definitions.

mratsim commented 3 years ago

I find that very error prone especially this one:

let
  c, d = z
  e = t

Is c initialized to z? Is c default initialized? That's extra confusion and thinking time that we can do without. In fact, a maintainable codebase would likely require a nimpretty rule to remove those commas. And I'm pretty sure most C/C++ styleguides disallow this.
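For reference, a minimal sketch of what current Nim does with this form, assuming a recent Nim compiler (the names `z`, `c`, `d`, `e` and their values are only illustrative): the single initializer applies to every name listed before the `=`.

```nim
# Current Nim: one initializer shared by several names in a let section.
let z = 42
let
  c, d = z   # both c and d are initialized from z
  e = 7

echo c, " ", d, " ", e
```

So neither `c` nor `d` is default initialized here, which is exactly the detail a reader has to know to read the line correctly.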

Concrete examples in C++ in https://github.com/mratsim/weave/blob/master/demos/raytracing/smallpt.cpp

double t, eps=1e-4, b=op.dot(r.d), det=b*b-op.dot(op)+rad*rad;

double n=sizeof(spheres)/sizeof(Sphere), d, inf=t=1e20;

Vec x=r.o+r.d*t, n=(x-obj.p).norm(), nl=n.dot(r.d)<0?n:n*-1, f=obj.c;

double r1=2*M_PI*erand48(Xi), r2=erand48(Xi), r2s=sqrt(r2);

double nc=1, nt=1.5, nnt=into?nc/nt:nt/nc, ddn=r.d.dot(nl), cos2t;

Vec cx=Vec(w*.5135/h), cy=(cx%cam.d).norm()*.5135, r, *c=new Vec[w*h];

On the 2nd line, do you notice that d has no equals sign? Is it initialized? To what value? On the 5th line, do you notice that cos2t has no equals sign? Is it initialized? To what value? On the 6th line, do you notice that r has no equals sign? Is it initialized? To what value?

Trying to port this to Nim made me lose lots of time due to trying to parse those packed declarations and shadow. on Discord suffered the same fate.

So I need a stronger reason because I actually don't see the ugliness argument.

metagn commented 3 years ago

Thanks to everyone for the input.

> I find that very error prone especially this one:

You could propose to deprecate that feature from current Nim and create a warning for it. One problem is I think people use this syntax:

var
  a, b, c: int

over

var
  a: int
  b: int
  c: int

Hopefully no one uses this but it is allowed:

proc foo(a, b: int = 1): int = a + b

echo foo() # 2

So maybe only warn/lint for the = part of it.

To others; to clarify, I don't think this has to be the standard way of every variable declaration, just in small cases where it would be preferred. I can't name every single one of these cases right now but I don't think they're very hard to imagine, though I recognize leaving it up to imagination isn't enough.

The truth is most JS style guides ban this. The airbnb style guide says the main reasons are , and ; being too similar (not really a problem for Nim) and that it's harder to add new variables (we still have full sections). But style guides for languages don't always reflect what's easiest to write at given moments, and Nim isn't like JS, where you have to aggressively police your intuition; in Nim, your intuition usually ends up being readable.

I also do recognize that this is a parser change and would be pretty significant for the amount of effort it requires. I don't have a response to "you're not solving any problems" because to me this is just an idea to be explored, if you think that constitutes a bad RFC then fine but I would appreciate knowing that before I make these.

timotheecour commented 3 years ago

this works and IMO is best:

let x = 1; let y = 2; var z = 3; var w: string
echo (x, y, z, w)

benefits:

note that type section have a real use case (recursion amongst types) but this isn't the case for var/let/const sections.

The only thing is making this an accepted idiom, in particular in code reviews.

c-blake commented 3 years ago

Any syntax feature has potential for abuse and while I agree with the preferences expressed by @Araq and @timotheecour, "style" is often personal, and that's ok. Style is only ever a partial constraint on syntax.

The strong argument here is consistency between let/var/const syntax and formal parameter syntax for procs, iterators, etc. People hate "different exceptions/gotchas in distinct but very similar contexts". Can you imagine if let/var/const had different syntax not just semantics? Just one sub-syntax for (type-inferred|explicitly typed, implicit|explicit initialized) identifier lists strikes me as maybe having compensating value in language simplicity/learning/remembering.

This problem framing "has legs". So, even though @hlaaftana's a, b: int = 1 being like a=1, b=1 e.g. is maybe a bit weird, it's best to retain that trait in every similar sub-syntax.

This also has some relation to this accepted RFC which would be a 3rd such list (or 5th if you count let/var/const separately, and maybe 6 with tuple[]s, too, if those get swept up in RFC252, and maybe some other context I'm neglecting at the moment.)

All this said, I'm not sure how back.compat/possible all this embedding the same/similar sub-syntax is. Often this sort of thing needs attention at the beginning, not 15 years in, but it's still worth considering if it can be made more consistent today.

c-blake commented 3 years ago

Oh wait, let/var/const i, j: int = 1 already works the same way as formal parameters. Still, consistency may matter for RFC252. Maybe my post was not utterly pointless. ;-) The relationship between this syntax & formal parameters must remain simple to explain.
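A small sketch of that correspondence in current Nim; the proc name `sum` and the values are made up for illustration:

```nim
# A shared type and initializer in a let statement...
let i, j: int = 1

# ...and the same identifier-list form in a parameter list,
# where the initializer acts as a default value for both a and b.
proc sum(a, b: int = 1): int = a + b

echo i + j
echo sum()
echo sum(5)
```

In both places the trailing `: int = 1` applies to every identifier in the comma-separated group.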

c-blake commented 3 years ago

For example, proc f(i=1, j=2; a,b: int = 1) works now. An as yet unmentioned decl possibility is:

let: i=1, j=2; a,b: int = 1  # ':' creates proc-like syntax

where

let: # maybe ':' allowed but not required?
  i = 1, j = 2
  a, b: int = 1

was also made to work. Previously invalid naked a,b: int = 1 "statements" would become valid only in a declaration context.

The correspondence rule is just "if you want ident decls to be like proc param decls use a ':'." The ':' ... ';' fits with other one-liner styles. { Another possibility is introducing an inline multi-let block with an empty let statement as in let; i=1, j=2, but that seems probably more confusing. }

No idea how hard it would be to add this to the parser, but folks probably (do not|should not|cannot) have templates/macros named let/var/const. So, it seems almost backward compatible - without the : everything is as it was. With :, you get boosted capability.

Personally, I think this is all kinda potentially confusing. I would be unlikely to use it. Please do not mistake idea discussion for direct advocacy. As mentioned, the time for this may be past, but we are also expanding syntax for where = initializers can go.

c-blake commented 3 years ago

Note: if destructuring tuple let, i.e. let (a,b)=.., did not exist, let (just-like-proc) might be an even better candidate. Even parens would match proc param list decls. Maybe that could also still work?

metagn commented 3 years ago

I don't think it's that important that we have symmetry between proc arguments and variables. Proc arguments are really the outlier compared to other "list" syntaxes because they require parentheses and need to be at a specific place specifically in a proc declaration etc. If any "section" was going to benefit from symmetry to proc arguments it would be using.

Type sections in general are the most distant and asymmetric part of Nim's syntax, I don't think they're much good for comparison either (specifically object lists wouldn't benefit from multiple fields in 1 line, though a, b, c: int syntax is still common).

Another thing about proc arguments is that you are forced to use commas. The only other list syntaxes that force commas are import etc and mixin/bind. enum is unique in that it does not distinguish between commas and newlines as separators (I thought import was the same but it's not) and allows both. You can pick ugly combinations of commas and newlines to separate enum members, but the distinction can occasionally help group enum members together.
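As a rough illustration of the enum behavior described above (the type `Fruit` and its members are hypothetical), commas and newlines can be mixed as separators in an enum body:

```nim
type
  Fruit = enum
    apple, banana        # comma-separated on one line
    cherry               # newline as separator
    durian, elderberry   # mixed again

echo ord(cherry)
echo high(Fruit)
```

The members are numbered in order of appearance regardless of which separator was used, so the grouping is purely visual.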

I've never heard anyone complain about enum separators being ugly personally (= shenanigans not included). Maybe because most people don't know newlines work, but I'm sure some people do know. I also don't think anyone would have complained if commas were allowed in variable sections from the start. It comes down to how people use it and how much people tend to not use the worst option. Hell, you can even do this:

type Foo = enum a, b, c,
  d, e, f

While enum does not use identdefs syntax, it is pretty close and I am proposing the exact same kind of "commas and newlines are both separators" aspect. Though, while the = syntax can make commas look worse in enums (the main similarity to identdefs):

type Foo = enum
  a = 1, b = 2
  c = 3

it does not necessarily make them worse in variable sections:

var
  shadowed1 = shadowed1, shadowed2 = shadowed2
  actualVariable = 0

(To be fair var x = x isn't amazing syntax, you could easily replace it with mutableCopy x for clarity or something, so it could become mutableCopy shadowed1, shadowed2)

Here I would also like to mention tuple unpacking, it is commonly used in Python to declare multiple variables in 1 line like a, b = 1, 2 but it is not encouraged. The code above in Nim would become:

var
  (shadowed1, shadowed2) = (shadowed1, shadowed2)
  actualVariable = 0

Because Nim can specifically optimize tuple-unpacked tuple literals like this by breaking them up into multiple assignments, I don't think it's worth not just breaking them up yourself. Tuple unpacking is supposed to unpack actual tuple values. Tuple unpacking a tuple literal also doesn't allow variables to depend on each other, which can help if you are performing a swap or something, but we have a swap proc that is more efficient, and there are probably better substitutes for other desired behaviors. This:

let (cond, val) = if someCond(): (true, val1) else: (false, val2)

becomes:

let cond = someCond(), val = if cond: val1 else: val2
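For comparison, here is a sketch of how the same two forms look in Nim as it stands today, where the comma version above is not accepted. `someCond`, `val1` and `val2` are placeholders defined only so the snippet compiles:

```nim
# Placeholder definitions, purely for illustration.
proc someCond(): bool = true
const
  val1 = 10
  val2 = 20

# Tuple unpacking of a tuple-valued if expression (valid today).
let (cond, val) = if someCond(): (true, val1) else: (false, val2)

# The broken-up version needs two statements (or an indented section),
# but lets the second variable depend on the first.
let cond2 = someCond()
let val3 = if cond2: val1 else: val2

echo cond, " ", val
echo cond2, " ", val3
```

The proposed `let cond = someCond(), val = ...` would just be a one-line spelling of the second form.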

I personally hate let a = 1; let b = 2 because you never see ; in open code otherwise. If you're willing to write the extra let keyword (I know this is a petty complaint, I don't really care that much, just for the sake of the argument) then you might as well just break it up into multiple lines.

disruptek commented 3 years ago

Again, I have to point out that the major concerns here are:

* clarity for the reader

* support for the metaprogrammer

This is essentially the same feature and the same rationale for supporting #321.

If you don't agree, you aren't reading enough code or writing enough macros. 😁

metagn commented 3 years ago

> Again, I have to point out that the major concerns here are:
>
> * clarity for the reader
>
> * support for the metaprogrammer
>
> This is essentially the same feature and the same rationale for supporting #321.
>
> If you don't agree, you aren't reading enough code or writing enough macros. 😁

This is one of the most cryptic responses I've read, but I'll try my best.

No idea what "clarity for the reader" means. Do you mean like the font you are using can be a problem? The point of this feature is that you would get rid of excessive obviousness, it comes down to the user what they want to contract.

No one mentioned metaprogramming, I don't see how you're repeating a major issue. I suppose like enums, there would be no difference in the AST representation whether you use commas or newlines. I don't see how this breaks anyone's macros though? Macro pragmas on variables wouldn't be affected.

Last 2 sentences are just nonsense to me, really sorry.

I understand you think being condescending helps other people think or something, but in cases like these it really just seems like a snide, indirect, thoughtless way of expressing your disapproval. I don't really care to deal with that kind of approach; unfortunately, keeping up RFCs like this means doing exactly that. Even if you think this should still be an RFC that should be thoroughly argued, I don't think I could bear that responsibility, so I'm closing this.