Closed TrevorBurnham closed 13 years ago
Nice writeup, Trevor. I was having trouble teasing apart the issue before this.
I'm in favor of the alternative approach you outlined. I'm comfortable with my control structures not closing over my variables, as I'm sure many other JS devs are, and I'd be more confused by the existing behavior. There being a syntactic way to semi-simulate block level scope seems like the best of both worlds.
I'm curious, though: why overload the do
keyword? It's a pretty, two-letter word, but maybe doesn't communicate its intent clearly.
@geraldalewis I don't want to get too deep into the issue 788 discussion, but in my opinion, do
(which is reserved but unused by CoffeeScript) is a pretty readable way of saying "run the following function, passing in these arguments." Maybe run
or call
would better communicate the intent than do
, but that would mean reserving new keywords.
The full form is do (x = value) ->
, but do (x) ->
is shorthand for do (x = x) ->
; that's how do (i) ->
above, both declares a function argument named i
and passes in i
to the function.
@TrevorBurnham - ahh -- I'd missed that do
was unused in CoffeeScript; I don't use do
regularly. Makes total sense now. Thanks for the explanation :)
passing in these arguments
This isn't a part of the proposal (nor the current implementation in Coco).
@satyr Any particular reason? I'd suggested the shorthand at issue 788.
Any particular reason?
The purpose of do
in that proposal is to write less parentheses--nothing more.
@satyr Thanks, I've clarified the use of do
in the original post and raised a Coco issue. (Update: And now that Jeremy has added your implementation of do
to master, I've copied that issue over to CoffeeScript as issue 960.)
I don't have any philosophical problems whatsoever with introducing block scope. In my eyes, it's less of a deviation from Javascript then the compulsive list comprehensions. But since obviously it's really hard to get this right, I agree that it would be better to drop the feature in favour of simplicity.
I think I like the idea of the do
statement, which reminds me of the let
statement in Lisp, ML and probably a bunch of other languages. It clearly communicates the idea that a new scope is created with some bindings and then a block of code is executed within that scope.
I'm wondering about the best syntax, though. Should the arrow be mandatory? What about the argument list. For example, one could imagine
for i in [1..3] do
setTimeout (-> console.log i), 0
to be shorthand for
for i in [1..3]
do (i = i) ->
setTimeout (-> console.log i), 0
which in turn would expand to
for i in [1..3](%28i%29 -> setTimeout %28-> console.log i%29, 0)(i)
Is this too confusing, going too far? Maybe the arrow should be mandatory in order to indicate that a function is created. But the special form for i in [1..3] do ->
, in which do
would expand to do(i = i)
as a more explicit way of getting the current behaviour back might be useful.
Another question I'm asking myself is whether to use let
as the keyword instead of do
, which would immediately be familiar to Lispers. I don't see why do
has to be tagged indefinitely as a keyword if it's not used, so the argument about not wanting to introduce another keyword does not quite convince me.
@odf Good points. I think that the proposed for...do
, while appealingly succinct, suffers because it would make break
, continue
and return
fail to act as you'd expect them to in a loop setting (not to mention arguments
). By making the variable capture construct an explicit function, you can continue to interact with the loop in ordinary ways outside of it.
As to let
, I'm not a Lisper myself, but do f
as a synonym for f()
makes more sense to me than let f
... besides, do
has to be tagged as a keyword because it's not a valid variable name in JavaScript.
@TrevorBurnham Yes, I think I agree that the for...do
form would be too confusing. But what about for...do ->
? I'm not sure myself, honestly, but it does have some appeal.
I didn't realise that about do
in JavaScript, but I still don't feel the fact that it's already tagged makes for a strong argument. I think it's a good word for expressing what it would do in this proposal, so I'm all for it. On the other hand, it might confuse people coming from languages where do
is just syntactic glue introducing a loop body.
Yes, I suppose for ... do ->
would be fine, with do ->
being shorthand for do (i)
or do (key, val)
or whatever.
It's true that do
is almost always paired with until
, but I don't see that as a big problem. I guess I wouldn't really care if let
were used instead.
Thanks for detailing this issue, Trevor. On master, the current situation is this:
There's a new syntax for "scoped functions", that act like the closure-wrapped functions, or a forEach
loop. You write it like for variables in object ->
For example, one of the test cases:
obj = {}
for method in ['one', 'two', 'three'] ->
obj[method] = ->
"I'm " + method
ok obj.one() is "I'm one"
ok obj.two() is "I'm two"
ok obj.three() is "I'm three"
I'm fine with the scoped for
syntax, as an extra sugar for a common use case (and I'm glad it uses an explicit ->
), but I really, really want the enhanced do
syntax. Soooo badly. Because sometimes, you want to explicitly enclose some variables but not other.
The do
syntax I suggested at issue 960 is so much more readable than the mess of parentheses currently required to define and call a closure in-place. It's simple, versatile, and free of pitfalls.
I like the for...->
very much. Explicit is better than implicit, as they say in Python land.
Like TrevorBurnham, I'm still very much in favour of the proposed do
statement, tough. It's a clean syntactical encapsulation of a common use-case that is otherwise tedious to express. Having the do
would make code that needs to close over certain values but not others DRYer and more readable.
I've talked about this a bit with Trevor out-of-band, but it's currently my opinion that loops "that need to close over certain values but not others" are an anti-pattern. Those loops can always be written in a clearer, more structured way, and it's pretty nasty to have to check a declaration to determine why some variables are being shared across iterations, and some variables aren't. If we can keep you from having to bother dealing with that whole can of worms, in CoffeeScript, I think we'd be better off in the end.
The do
proposal is not just about loops. The new for...->
syntax solves an important special case, but it's not the whole story. Ad-hoc functions (defined once and called immediately) that return inner functions closing over local variables are an invaluable data-encapsulation tool in JavaScript. But it's a tedious and not very readable idiom, which is why I think special syntax for it would be a great idea.
Anyway, here's what we can do right now, without implicit loop scoping or the new for...->
syntax:
scope = (args, f) -> f(args...)
for i in ['one', 'two', 'three']
scope [i], (i) ->
setTimeout (->
stateWinner =
if i == final_i then "the winner" else "but #{final_i} won"
console.log "I am #{i}, #{stateWinner}"
), 0
final_i = i
Note that I introduce a new scope only where I want it. I can still do anything I want in the rest of the loop, including calling pure statements and accessing arguments
. This is already pretty sweet. But I'd like it even more if instead of scope [a,b,c] (a, b, d) ->
I could simply write do (a, b, d = c) ->
. It doesn't save a lot of typing, but it is much clearer, avoids repetition (which I consider an anti-pattern in itself), and turns a somewhat obscure-looking idiom into a first-class language feature.
(Edit: Moved the final_i
assignment out of the loop body.)
So, running with the wrapped return value proposal, I present a more formal proposal, one that I believe is actually the "perfect" solution Trevor mentions above. The solution currently on master makes the user think about when they need to closure-wrap their loops, and we shouldn't make them have to do that.
Take the following contrived example showing all proposed features: a function definition, a return
, a continue
, and a break
all within the loop body.
fns = []
for k, v of obj
continue if k is 0
break if arguments.length is 1
fns.push (a) ->
console.log k, v, arguments
return k
return k if k is 2
This example should compile to
var fns, k, v, __fn, __result,
__return = {}, __break = {};
fns = [];
__fn = function(__args, k, v) {
if (k === 0) return;
if (__args.length === 1) return __break;
fns.push(function(a) {
console.log(k, v, arguments);
return k;
});
if (k === 2) return [__return, k];
};
for (k in obj) {
__result = __fn.call(this, arguments, k, obj[k]);
if(!__result) continue;
if(__result === __break) break;
if(__result[0] === __return) return __result[1];
}
Remember that this is the most complex example. Simpler function definitions will produce simpler JS output. See the next examples below.
fns = []
for k, v of obj
continue if k is 0
fns.push (a) ->
console.log k, v
This example should compile to
var fns, k, v, __fn;
fns = [];
__fn = function(k, v) {
if (k === 0) return;
fns.push(function(a) {
console.log(k, v);
});
}
for (k in obj) {
__fn.call(this, k, obj[k]);
}
Features/optimizations that you may have already noticed:
continue
s don't even have to be handled at all, ever__return
and __break
only have to be defined if they are being used in a closure-wrapped loop bodyarguments
doesn't have to be passed (as __args
) if arguments
is not mentionedThis proposed transformation process is completely transparent to the user, and will provide them with consistent, expected values in the closure-wrapped variables. Some reasoning as to why this is the best choice for semantics, taken from #coffeescript:
PS: For the inconsistency with mutability of the loop variables, see my still-valid solution on #950.
Render me impressed.
Just one question, though: if this is implemented, would not users expect the bodies of if..then...else
, switch...when...then
statements etc. to implicitly create their own scopes in a similar way?
I think the consistent way to go about this would be to either make the language fully block-scoped or else leave it function-scoped the way JavaScript is, and introduce a convenient, readable way of introducing a new scope. The latter approach is obviously much easier to implement, and stays closer to the target language. If the generic do
doesn't happen, I think the new for...->
loops are a good compromise.
@odf: Can you give me an example of how a user would see different scope-related results when using an if...then...else
than they would expect with knowledge of how the for
loops work in my proposal? Not attacking your argument, just want to know exactly what you mean.
That's a fair question. Here is some silly example code: f1 = (n) -> g = if n > 0 m = n / 2 -> m else -> 0 m = 2 g
scope = (f) -> f()
f2 = (n) ->
g = if n > 0 then scope ->
m = n / 2
-> m
else scope ->
-> 0
m = 2
g
console.log f1(3)()
console.log f2(3)()
Because of the default function-scoping, the first console.log
statement prints 2
, which is the value of m
when the function returned by f1
is executed. But with the implicitly wrapped loops, if a user wrote
m = n / 2
-> m
within a loop value, they would see the function in the second line close over the local value of m, and probably expect the same behaviour here. In other words, they would expect f1
to behave like f2
, which returns a function that returns 1.5
.
Michael, I'd be OK with an implementation of for
that always closes over loop variables without introducing inconsistencies, and with making loop indices consistently immutable. But neither of those changes is going to be implemented in the next 2 days. I think you should maka a separate issue for post-1.0 discussion of your excellent proposals.
Going back to do
, I still say it isn't just about for
loop variables, that there needs to be a succinct syntax for explicitly capturing variables anywhere.
Suppose that every loop iteration I get a new x
; the loop ends when the sum of the x
's goes over a certain limit. I need to asynchronously get an encryption key to save each x
along with the sum to that point. With current structures, this would have to be written as
while sum < limit
sum += x = nextNum()
saveXAndSum = (x, sum) ->
getEncryptionKey (key) ->
saveEncrypted key, x, sum
saveXAndSum x, sum
There are three problems here: Non-linearity (defining a function right before the only time it's called), repetition, and having to name a pretty self-explanatory 2-line function. With do
, the code can be made much nicer:
while sum < limit
sum += x = nextNum()
do (x, sum) ->
getEncryptionKey (key) ->
saveEncrypted key, x, sum
Surely this is both more readable and more writeable?
@TrevorBurnham: A few things:
In your example above, you can just do the following when you're only using the function in one place.
while sum < limit sum += x = nextNum() ((x, sum) -> getEncryptionKey (key) -> saveEncrypted key, x, sum )(x, sum)
I agree it is a little ugly, but do
really is unnecessary, as jashkenas keeps saying.
Michael, if you agree that people might have reason to write
((x, sum) ->
getEncryptionKey (key) ->
saveEncrypted key, x, sum
)(x, sum)
then surely you can see why allowing
do (x, sum) ->
getEncryptionKey (key) ->
saveEncrypted key, x, sum
would be an improvement? Imagine if you had two or three such nested structures!
Trevor, it's an alright feature, and I'm not vehemently opposed to it, but I do think it's a little unnecessary. I'd lean slightly toward not including it if given the option. But only slightly. I'd like to see some cases where it really cleans up some real-world code. I think it'll be hard to find such an instance.
The do
proposal is about beautifying common code pieces, in the same way that we allow braceless objects. If we can pay great effort to strip 2 bytes from {k:v}
, why not paying a little more to do the same for (->)()
?
I think that Trevor's example is a great place that demonstrates how using more structured code helps avoid having to worry about variable scoping bugs:
# Save a sum to an encrypted account.
saveSum = (x, sum) ->
getEncryptionKey (key) ->
saveEncrypted key, x, sum
# Save sums in increments, up to the limit.
while sum < limit
sum += x = nextNum()
saveSum x, sum
@jashkenas I think this is a great demonstration of how more 'structured' code can actually be harder to read. The name saveSum
is not self-explanatory, whereas the actual code, which is not that much longer than the function call, is. I don't think programmers should be forced to either use an unreadable idiom or else tear their code apart into tiny pieces with arbitrary names stuck onto them. It should be up to them to decide when a function body is getting too long or complicated, so that it's better to decompose it into more elementary parts.
Jeremy, while I appreciate your efforts to promote a clear, consistent coding style, I think that the form you propose—even if better in this case (and me and odf would disagree)—becomes untenable in heavily asynchronous code where there are several layers of nested callbacks.
Let me extend my example slightly, by saying that we need to save x
and sum
to a set of servers that come from an opaque iteration object, dependent on x
. Then with do
, the code could be written as
while sum < limit
sum += x = nextNum()
do (x, sum) ->
while s = nextServer(x)
do (s) ->
getEncryptionKey s, (key) ->
saveToRemote s, key, x, sum
By organizing the code in this way, it's not only nicely succinct (which is a great benefit to writeability—I always feel slowed down by having to name functions whose code is self-explanatory), but also makes the scope of everything quite obvious. From the saveToRemote
line, you need only look up to the nearest (x) ->
to see the scope of x
.
Now here's the "proper" way to write this without do
:
# Save a sum to an encrypted account
saveToServer (s, x, sum) ->
getEncryptionKey s, (key) ->
saveToRemote s, key, x, sum
# Invoke saveToServer for all relevant accounts
saveToAllServers = (x, sum) ->
while s = nextServer(x)
saveToServer s, x, sum
# Save sums in increments, up to the limit
while sum < limit
sum += x = nextNum()
saveToAllServers x, sum
We've doubled the line count, and in my view, we've made the code much more daunting. We've defined things in the opposite order of when they happen, which makes sense when a function is called from several places, but not when it's particular to a single loop. Also, with each level of nesting, we get more and more arguments to each function. I've had functions with 5-6 levels of callback nesting, each adding more variables, which makes for some very messy function declarations.
Granted, all the structure added by defining functions outside of the loop makes scope easy to keep track of, but at a very heavy price in all the other virtues that make CoffeeScript such a grand language.
In closing, Jeremy, all I want for Christmas is do
.
Nice example, Trevor! I was talking more in terms of generalities, by the way. How I'd write a piece of particular code depends a lot on the context, so I wouldn't dare to decide what's more readable just based on a tiny snipped. The point was simply that with a small piece of code like that, it is often not the best option to tear it apart artificially. But I think your new example demonstrates that much better.
By the way, I've noticed that there appears to be a let
statement quite similar to what you've proposed for do
in more recent versions of Javascript. But the various implementations and proposed standards confuse me, so I have no idea if this has chances of being in the mainstream five years from now or not.
Yes, the proposed let
is very similar, and I think its popularity among the ECMAScript Harmony crowd is further evidence for the usefulness of do
. Sure, we could wait for let
to become a part of every target environment—but it's going to be a long, long wait before everyone's on IE10+...
No, I wasn't proposing to wait for that. More to suggest it as evidence that this construct could be useful, after all.
Trevor, looks like you get your wish. SHA: 094b876a384bed5504f513c472e4c9cfebe56aea
I'm not entirely happy with the verbosity do
requires ... but it really is the more general solution. It makes properly scoped comprehensions an orthogonal issue that we can work out at our leisure. So, now on master, we have:
for fileName in list
do (fileName) ->
fs.readFile fileName, (err, contents) ->
compile fileName, contents.toString()
And:
do walk
... the reason for removing scoped loop literals is that they didn't really cover all of the possible cases, and do
does.
Thanks, Jeremy, I'm glad to hear it. One small tweak suggestion: How about making a special case in the parser to allow
for fileName in list do (fileName) ->
rather than requiring then
or extra indentation there? That'll slightly mitigate the verbosity for a common use case.
Also, for the record, what cases did the auto-scoped for
proposed by Michael fail to cover? Or were you referring to the while
/until
examples I suggested? The only inconsistency in the proposed super-for
described above that I could spot was that
lastNonNull = (list) ->
result ?= x for x in list
return result
would continue to work, but it would fail if the loop contained a closure, since result
would have loop scope.
The cases that the auto-scoped for failed to cover were while
, until
and loop
, naturally, and it didn't look like it would be easy to add those to the grammar. The bigger issue is that do
is more general, and more orthogonal to if we decide to properly block scope loops later on ... whereas scoped loops would have been made obsolete immediately.
We may make a special case in the parser later, but for now, let's stick with then
.
At last I'm starting to use coffeescript (and really like it). As a Rubyist, I was naively surprised that coffeescript does not create a closure on for
(anymore).
Just add my "+1" for the simpler notation "for... ->". I read the whole thread and understand why the "do" was introduced (good idea), but I would really like an even shorter notation, in particular one that doesn't require me to repeat my parameters... Personally, I would close all my for
s.
Thanks!
Thanks
When I noticed that the current version of coffeescript doesn't make a new scope for the loop variable, I wanted to make an issue about it, but it looks like this has been discussed to death already. It didn't occur to me that this would alter the behavior of return/break/continue, and that alone is a good reason my intuition is wrong here.
I think the real solution, if you want a closure for your loop variables, is to use map or each, which are provided by underscore.js in case your environment doesn't already have them. If you need to exit early (break) you may be better off using detect, or you could use an exception.
Also, I was shocked when I found that Python has the same problem here. I thought Python was usually a step above Ruby in terms of technical correctness, but Ruby 1.9 actually does allow loop variables to shadow outer scopes. Then again, Ruby loop bodies really are functions, so perhaps it's not a fair comparison.
This is an attempt to tie together a few recent issues and come to a consensus on the current feature of closing-over variables in a loop.
The purpose of the feature
Originally proposed in issue 423, the feature is nicely illustrated by this test case:
If converted straightforwardly to JavaScript, this will generate the output
Why? Because the closure doesn't get invoked until after the loop has finished executing, by which point
i
is 2. The closures always references the samei
.But because of this feature, you get the output
Now the closure references the loop's indices (and other variables with "loop scope"—see issue 948) as they are when the closure is defined. This is implemented using a function called from each loop step:
Inconsistencies resulting from this feature
Unfortunately, there are several problems with this implementation, most notably issue 954: The transformation doesn't work if there's a
break
,continue
orreturn
anywhere in the loop. So the codegives you
This is highly unintuitive: Why would the addition of a control statement affect the code's output?
Another inconsistency, issue 950, is that loop indices are immutable when this transformation occurs (that is, when there's a closure and no control statements in the loop), but are mutable otherwise.
Yet another inconsistency, pointed out by satyr at issue 728, is that
arguments
won't work within thefor
loop if the transformation occurs: It will reference thearguments
of_fn
rather than the arguments of whatever function thefor
loop exists in.A more sophisticated implementation could correct these problems, but it would not be simple or elegant, and it's certainly not going to happen in the next few days.
Philosophical problems with this feature
Even if it weren't for the inconsistencies pointed out above, would we want this feature? It's a pretty dramatic transformation on the CoffeeScript compiler's part, and makes an unintuitive leap from JavaScript's semantics.
While it may be the case that you usually want a closure in a
for
loop to reference surrounding variables as they were at the time the closure was declined, this isn't the way that closures behave anywhere else in the language.One of my favorite things about JavaScript is that only functions have scope. It's a nice, easy, consistent rule, one that lets you easily determine scope in CoffeeScript by looking for
->
and=>
(andclass
). There have been several attempts at giving loops special scope in CoffeeScript, but all have proven problematic. I think loops should be kept simple.Alternatives
Let's look at that original example again. How do we say "let's preserve the value of
i
when the closure is invoked"? The simplest approach is to write our own explicit closure,Of course, the syntax here is terrible. There was a better syntax that was proposed at issue 788 and had fairly wide support, not to mention a working implementation courtesy of satyr. If a proposed variation on this syntax were added to master, then you could write
where
do (i) -> ...
is shorthand fordo (i = i) -> ...
, which in turn is shorthand for the current((i) -> ...)(i)
.This, I believe, is the perfect solution: A simple, succinct way of explicitly stating which variables you want to capture. This is a much better fit with CoffeeScript's philosophy of letting you write syntactically elegant code with JavaScript's semantics.
Conclusion
The implicit loop closure that's been in CoffeeScript since issue 423 should be removed due to the unpleasant surprises listed above. Even a perfect implementation would not, in my view, be worth the increased distance between the CoffeeScript and the underlying JavaScript. And the
do
keyword, rejected as insufficiently practical, should be reconsidered as an elegant way of capturing variables.