Allow defining multiple variables inside `let ... in`

tweag / nickel

Better configuration for less

https://nickel-lang.org/

MIT License

Allow defining multiple variables inside `let ... in` #1695

Open giorgiga opened 11 months ago

giorgiga commented 11 months ago

Is your feature request related to a problem? Please describe.

I think most people will agree that this:

let
  foo = 1
in let
  bar = 2
in let
  baz = 3
in
  foo + bar + baz

is far less readable than:

let
  foo = 1,
  bar = 2,
  baz = 3
in
  foo + bar + baz

(plus it gets worse as expressions become more complicated)

Describe the solution you'd like

Allow declaring more than one variable inside let ... in, for example separating them with commas.

Describe alternatives you've considered

If commas pose some issue with the grammar (or are disliked), another option could be let { foo = 1, bar = 2 } in ... (but in this case one might want to add metadata to the variables, which IIUC let doesn't allow)

Additional context

I can work on a PR if you think the idea is viable.

(well... I think I can: I'm not really a rustacean, but this seems relatively simple to tackle as a first issue)

yannham commented 11 months ago

Related: ~~#504~~ (edit: #494). As you can see in the original issue, I'm rather in favor of this change :stuck_out_tongue: but let's discuss that in the next weekly meeting, and in this issue, and see what comes out of it.

Another interesting aspect is that let-blocks can have a performance impact. In a sequence of lets, the environment is duplicated, then modified (insertion of the new binding), then duplicated again, and so on, between each and every binding, because the interpreter has to assume that in let x = 1 in let y = _exp_ in body, _exp_ might depend on x (it could do additional analysis to rule that out, but that's not free).

If _exp_ doesn't in fact depend on x, this introduces a spurious dependency, which makes the environment less efficient (introducing sharing points by cloning after inserting turns the environment into something like a linked list instead of a hashmap, at least locally).

This might also inhibit reuse optimizations (currently, if you map on an array, and this array is not referenced by anyone else, you can map in place - but with spurious dependencies you might capture references through those let-bindings and unduly prevent reuse).

I honestly don't know if the impact is noticeable, especially now that we've got rid of the share normal form transformation (#1647), which was a huge source of sequences of independent bindings. It's still an interesting semantic distinction, though: a let block expresses a form of independence and parallelism, a guarantee that the interpreter might exploit, while normal let-bindings must be assumed to be sequential and dependent.
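
To make the environment argument a bit more concrete, here is a toy model in Python. It is not Nickel's actual (Rust) environment implementation: `collections.ChainMap` merely stands in for a layered environment, and the helper names `bind_chain` and `bind_block` are hypothetical.

```python
from collections import ChainMap


def bind_chain(env, bindings):
    """Model `let a = .. in let b = .. in ...`: one new layer per binding."""
    for name, value in bindings:
        env = env.new_child({name: value})
    return env


def bind_block(env, bindings):
    """Model a let block: all bindings land in a single new layer."""
    return env.new_child(dict(bindings))


# Hypothetical starting environment and bindings, for illustration only.
base = ChainMap({"std": "<stdlib>"})
bindings = [("foo", 1), ("bar", 2), ("baz", 3)]

chained = bind_chain(base, bindings)
block = bind_block(base, bindings)

# Both environments resolve the same names to the same values ...
assert all(chained[name] == block[name] for name, _ in bindings)

# ... but the chain stacks one layer per binding on top of the base layer,
# so a lookup may have to walk through them like a linked list.
print(len(chained.maps))  # 4 layers: baz, bar, foo, base
print(len(block.maps))    # 2 layers: {foo, bar, baz}, base
```

In this model the difference is only the number of layers a lookup may traverse. Whether that translates into a measurable cost in the real interpreter is exactly the open question above.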

giorgiga commented 11 months ago

That's great @yannham

Another option just came to mind: one could simply drop the "intermediate" ins or replace them with commas (let a = 1 let b = 2 in a + b or let a = 1, let b = 2 in a + b)

For reference, let me name the ideas so far (and format them in a more concise way) - I'll follow up shortly with some additional notes.

1. strict let..in (the current syntax):

   let foo = 1
in let bar = 2
in let baz = 3
in     foo + bar + baz

2. let block:

let  foo = 1,
     bar = 2,
     baz = 3
 in foo + bar + baz

3. let list with commas:

let foo = 1,
let bar = 2,
let baz = 3
 in foo + bar + baz

4. let list with no commas:

let foo = 1
let bar = 2
let baz = 3
 in foo + bar + baz

giorgiga commented 10 months ago

The TLDR is "I think the let block option is the winner"; below is the long version, which is probably not interesting enough to justify its length (but since I've written it, I'll post it anyway in case anyone is curious).

Long version

For the sake of exploring how things may end up looking, below I'll pretend nickel has added haskell-like `where` and nix-like `with`. Please don't take this as me assuming the language will evolve in any specific way (or "at all"): it's just for brainstorming.

The examples below will not include expressions where there is no `let` (eg. nix's classic `with pkgs; [ pkg1 pkg2 ]`)... this is off-topic, but let me just note here that, in case one wants some sort of separator between `with` and the value expression (because `with pkgs [ pkg1 pkg2 ]` is difficult to parse, visually or programmatically), an option would be to exchange `with` for some imperative verbal form (say, `use`) and recycle `let`'s `in` as a separator (ie: `use pkgs in [ pkg1 pkg2 ]`). Other words may also be exchanged for `where`, of course (for example `with` itself, or an -ing verbal form like `using` or `letting`).

Formatting will probably look unusual (it's in the style I use for SQL queries)... since we are talking about language features that don't exist, it's not like there's an established "idiomatic" style, so everyone will have to re-format according to their own style to get an idea of how things could look.

Also, I will not go into what is easy to write a parser for, in part because I don't know the capabilities of the lalrpop parser generator you are using, and in part because I feel technical compromises are best left for later.

### 1. *strict let ... in* (the current syntax):

```
with std.number
with std.string as str
let foo = 1
in let bar = 2
in (min foo bar) + (max baz qux)
where baz = 3
where qux = 4
```

I find this verbose and distracting... most of all, the "main" point (the actual expression `(min foo bar) + (max baz qux)`) does not stand out from the boilerplate.

One-liners (eg. `let a = 1 in let b = 2 in a + b`) are not particularly easy to visually parse: when you see the first `let` your eye looks for the matching `in`, and only when you realize it's an `in let` do you iterate until you get to the "main" expression (it may be that that's just how *I* read things: IDK if people normally make sense of expressions top-down like I do or bottom-up... at least someone else does it like me, otherwise haskell would not have its `where`).

### 2. **let block**:

```
with std.number,
     std.string as str
let foo = 1,
    bar = 2
in (min foo bar) + (max baz qux)
where baz = 3,
      qux = 4
```

This neatly divides the expression into "blocks" (similar to how SQL statements are structured). The "main" expression still doesn't stand out very much (IMHO the `bar = 2` stands out more, and you'll have to put each `let` definition on its own line with non-trivial expressions)... however, it's easy enough to find (you just have to look for the `in`).

Commas should be allowed after all lines (eg. after `std.string as str`, `bar = 2` and possibly also `in (min foo bar) + (max baz qux)`).

### 3. **let list**:

```
with std.number
with std.string as str
let foo = 1
let bar = 2
let baz = 3
in (min foo bar) + (max baz qux)
where baz = 3
where qux = 4
```

The one above is not the best example for this, but this form (like the next one) works well if there are other "statements" that can be intermixed with `let` variable definitions (none of them comes to mind, so I didn't change the example).

This style is reminiscent of imperative languages, so it may prove more familiar/intuitive to those programmers (the majority?) who have only been exposed to imperative languages.

### 4. let list with commas:

```
with std.number,
with std.string as str,
let foo = 1,
let bar = 2,
let baz = 3
in (min foo bar) + (max baz qux)
where baz = 3
where qux = 4
```

This is a variant of the previous one, so it shares most of its pros/cons.

Personally, I find that the comma helps in visual parsing, but it also looks a bit strange here, as it kind of wants to be seen as an imperative-style statement terminator in an expression language.

jneem commented 10 months ago

:+1: for "let blocks". I like to keep the fact that there is one "let" for every "in"

giorgiga commented 10 months ago

@yannham about my offer to submit a PR... it's gonna take a while (and you'll probably be better off having someone who is already familiar with the language/tooling/project do this) :(

From what I've seen in the code, implementing this is gonna require quite extensive changes, and doing that while also learning rust+lalrpop, maintaining the current (half?) implementation of rec, and the (undocumented?) let patterns is bound to require much more effort than I anticipated... which also means several other tasks are bound to end up before this one in my TODO list.

thufschmitt commented 10 months ago

:+1: also for “let blocks”. It also brings a nice syntactic symmetry between let blocks and records (Nix has that, and I find it quite practical when refactoring code, as changing a record to a set of let bindings (or the other way around) is somewhat frequent).

For symmetry’s sake, I'd also weakly suggest making these blocks recursive by default (like records are)

yannham commented 10 months ago

> For symmetry’s sake, I'd also weakly suggest making these blocks recursive by default (like records are)

Ah, there has been some discussion about this in the original issue, see https://github.com/tweag/nickel/issues/494#issuecomment-992715769 for example. Although it's a bit less consistent syntax-wise, I think recursion by default is the right choice (i.e. the most common / natural thing to do) for records, but isn't for lets in general (and single lets aren't recursive by default, as mentioned in the comment, which would be a surprising difference).

IMHO the least evil/surprising inconsistency is to keep lets, including let-blocks, non-recursive by default, and records recursive by default. They have different semantics and express a different intention after all, even if the syntax is similar. I rarely write recursive lets (including in other functional languages), but I reckon we often use recursion inside Nickel records. As always I'm happy to change my mind if there are strong arguments in the other direction!
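
For readers who want to see what "recursive by default" would change, here is a small Python sketch of the two scoping rules. It is only a model under stated assumptions: environments are plain dicts mapping names to zero-argument thunks (a crude stand-in for laziness), and `let_block` / `let_rec_block` are hypothetical names, not Nickel constructs or APIs.

```python
def let_block(bindings, body, env):
    """Non-recursive block: definitions close over the *outer* environment."""
    new = {name: (lambda e=expr: e(env)) for name, expr in bindings}
    return body({**env, **new})


def let_rec_block(bindings, body, env):
    """Recursive block (record-like): definitions close over the *new* environment."""
    inner = dict(env)
    for name, expr in bindings:
        inner[name] = (lambda e=expr: e(inner))
    return body(inner)


# Two mutually recursive definitions; a variable lookup is `env[name]()`.
bindings = [
    ("is_even", lambda env: lambda n: True if n == 0 else env["is_odd"]()(n - 1)),
    ("is_odd", lambda env: lambda n: False if n == 0 else env["is_even"]()(n - 1)),
]
body = lambda env: env["is_even"]()(10)

print(let_rec_block(bindings, body, {}))  # True

try:
    let_block(bindings, body, {})
except KeyError as err:
    # In the non-recursive block, `is_even` cannot see `is_odd`.
    print("unbound variable:", err)
```

The only difference between the two functions is which environment the definitions capture, which is why the choice of default is a question of intent rather than of implementation effort.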

toastal commented 9 months ago

What’s wrong with

let foo = 1 in
let bar = 2 in
let baz = 3 in
foo + bar + baz

Even in a more verbose case nothing wrong IMO with

let foo = 1 in
let bar = {
    qux = 1,
} 
in # could cuddle this up a line?
let baz = [
    0,
    1,
    2,
] 
in
foo + bar + baz

When you put `in` in the front and try to align everything all weird-like, of course it looks strange & verbose. There seems to be a lot of focus on alignment & symmetry that aren’t really useful if not aiming for that specific style as a goal. let … in for each new variable seems to be just fine for the Dhall & OCaml projects…

yannham commented 9 months ago

> What’s wrong with

Well, first, nothing is wrong with that :slightly_smiling_face: note that it only works for one-line definitions, though. If your definition is too long to fit on one line, then nickel format will rather lay it out as in the original example of this issue.

Besides, the discussion revolves around the following points:

aesthetics: some people find it more pleasant/more readable to declare variables as:
```
let
foo = 1,
bar = 2,
baz = 3,
in
```
than
```
let foo = 1 in
let bar = 2 in
let baz = 3 in
```
semantics: in a let-block, bindings would be guaranteed to be independent (which is, most of the time, what you want), while they are not in chains of single let-bindings. That is, the behavior of
```
let x = 1 in
let y = x + 1 in
...
```
is not the same as
```
let
x = 1,
y = x + 1,
in
...
```
Although it's hard to say if it's noticeable, this might in fact have performance implications. Take let foo = <foo_def> in let bar = <bar_def> in <continuation>, where bar doesn't actually depend on foo. Currently, no analysis is performed to tell if <bar_def> uses foo or not, and because it might very well do so, <foo_def> and <bar_def> don't share the same environment (<bar_def> has the definition of foo added to it, and the <continuation> has both). Environment insertions have an impact on the structure of the environment, and thus on its performance (currently a long sequence of let-bindings produces an environment that looks more or less like a linked list, locally).

On the other hand, in the let-block version let foo = <foo_def>, bar = <bar_def>, in <continuation>, (1) all definitions share the same original environment and (2) the <continuation> sees all the definitions at once in a single environment layer, which is locally like a hashmap rather than a linked list.

This is to be taken with a grain of salt, because some simple free variable analysis phase could, in the future, detect when there's no syntactic dependency between two consecutive lets, but it's still additional machinery. The data structure implementing the environment might change as well. But, in general, insertions/modifications on shared data will always have a potential cost for a persistent hashmap.

Lastly, there's currently no direct way to declare a bunch of mutually recursive functions outside of a record. You can still do let {f, g} = {f = ..., g = ...}, which is an OK workaround (and, honestly, I haven't had the need for local mutually recursive definitions like this until now in Nickel). I still mention this for the sake of exhaustiveness.
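
The scoping difference from the semantics point can be sketched with a tiny evaluator. This is a Python model, not Nickel: expressions are represented as functions from an environment dict to a value, evaluation is eager (Nickel is lazy), and `eval_let_chain` / `eval_let_block` are hypothetical names chosen for illustration.

```python
def eval_let_chain(bindings, body, env):
    """`let x = e1 in let y = e2 in body`: each definition sees the previous ones."""
    for name, expr in bindings:
        env = {**env, name: expr(env)}
    return body(env)


def eval_let_block(bindings, body, env):
    """Non-recursive let block: every definition is evaluated in the outer env."""
    new = {name: expr(env) for name, expr in bindings}
    return body({**env, **new})


# Models `x = 1` and `y = x + 1`, with `x + y` as the body.
bindings = [
    ("x", lambda env: 1),
    ("y", lambda env: env["x"] + 1),
]
body = lambda env: env["x"] + env["y"]

print(eval_let_chain(bindings, body, {}))         # 3: y saw x = 1
print(eval_let_block(bindings, body, {"x": 10}))  # 12: y saw the outer x = 10

try:
    eval_let_block(bindings, body, {})
except KeyError as err:
    # With no outer x, `y = x + 1` refers to an unbound variable.
    print("unbound variable:", err)
```

In the block version every definition is evaluated against the same untouched outer environment, which is the independence guarantee discussed above.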

yannham commented 9 months ago

> let … in for each new variable seem to be just fine for the Dhall & OCaml projects…

By the way, Dhall actually does have a simple form of let block (where the let is repeated but not the in: https://github.com/dhall-lang/dhall-lang/pull/266 ). So do Nix, Elm, and Haskell. In fact, OCaml has a let .. and .. construct that has the exact same semantics as the let-blocks described in this issue, but for some reason it's not used much outside of mutually recursive definitions.