Closed timbertson closed 14 years ago
Here's a direct link to the test_deferred.coffee suite, for folks who'd like to take a peek:
http://github.com/gfxmonk/coffee-script/blob/deferred/test/test_deferred.coffee
Exciting, I hope to try this soon, great work!
I couldn't get this branch to compile, http://gist.github.com/387224. Any ideas? I am sick of nesting code and using 3rd party libs like step and flow for this so I am anxious to try this.
@mrjjwright: might have been a bad merge, I'll check it out tonight and let you know.
done! (just pushed). Not entirely sure what the problem was, but it seems to be sorted now. Apologies for that.
It compiles now, thanks.
I tried it on a place today where I was planning to use flow.js. It worked fine. I liked it because I didn't need a library to pull this off. I am always worried that there is a hidden bug in flow.js or step.js. There could of course be bugs in the defer code as well but I am assured that all the code is right there to inspect if it is wrong, and it is clear what it is doing. Both flow.js and step.js use underlying state machines that are harder to debug.
The defer keyword definitely illustrates at least theoretically the beauty of CoffeeScript to provide a language solution to an awkward JS usage problem. A developer would otherwise use a library but I think directly generated code is a lot easier to inspect and trust than libraries. I don't understand why developers are reluctant to try CoffeeScript but will gladly pop in any third party library that is barely documented.
I don't want to get too addicted to this yet but I can see how I could easily do so.
+1
mrjjwright: for the sake of discussion, it would be great if you could paste in a bit of one of your real-world conversions to the "defer" syntax. I think a lot of the discussion that we'll have will be better informed by "before" and "after" examples with real code...
Ok I am going to try this on what is surely a great test case, a SQLite table migration where you dump all the results of one table to another and then re-import them. This involves about 15 levels of nesting as well as a nested function definition in a very deep defer. I have this working in flow.js and tried migrating it to defer but running into a compilation issue that I have pasted in the comments below the gist (I could be doing something really naive but couldn't find it).
I will probably need your help gfxmonk to get this compiling.
If this were in coffeescript, I'd be using it now. Even with all the virtues of asynchronous programming there are still many things that need to be done serially. This makes clear, readable code out of it.
+1
gfxmonk helped me get it to compile and it worked like a charm.
A few things to summarize from my experience.
jashkenas I am wanting this, it's going to be hard to not lobby you. But this is my first day with this and I haven't heard or read all the naysayers yet.
I made a suggestion about syntax in an old thread - I'll just repeat it here ... .... instead of
[err, data] = defer get "/", {}
my_async(x, y)
[err, data] = defer more a, b
setTimeout(defer, 100)
final()
use a loong arrow
[err, data] = get "/", {}, -->
my_async(x, y)
[err, data] = more a, b, -->
setTimeout -->, 100
final()
I think it neatly encapsulates the idea of 'look to the next line' and also that it's a bit like a function.
mrjjwright - you might be interested reading the old threads for discussions we had around naming. async has been suggested, but I don't really feel it explains what the keyword does. "call-with-current-continuation" is the best name for it - that comes from smalltalk, but even there it was contracted to call_cc in actual code (which is almost meaningless)
I've been discussing this with a friend, and a couple things should also be pointed out as shortcomings:
weepy: Being a python guy, I typically prefer meaningful words over punctuation. Plus, I think your suggestion makes it too easy to accidentally start a function (by forgetting a "-").
But if we move to having the "defer" keyword replacing positional arguments (rather than prefixing function calls) then defer wouldn't necessarily be a great choice, so suggestions are welcomed.
oh, and:
Will future readers of my code easily understand defer?
Most people dealing with javascript callbacks would already understand continuation-passing-style, and hopefully most have longed for or thought about a way to make that automatic. So hopefully this feature should seem like a fairly natural transformation to most javascript programmers that would encounter it.
by way of further example, here's an actual javascript file I wrote using lawnchair (a heavily async datastore). I've ported it to both coffee and coffee+defer (reasonably idiomatically, aside from some unimportant string concatenation):
http://gist.github.com/389375 it's easier to see the difference if you open them side by side in a text editor - the bottom "run" method in particular is far more straightforward with defer.
Hey gfxmonk,
Ok, defer is fine as a keyword. I defer.
Looping issues, no big deal either, I am always really careful about async calls in a loop. (Btw, ever seen this: http://glathoud.easypagez.com/publications/tailopt-js/tailopt-js.xhtml. Any value here?)
Will you be merging all major bug fixes into deferred from the main line for a while so I can use this branch? I am kind of hooked. If not, I will go back to my old crack async dealer, flow.js.
How about wrapping setTimeout for numerical values? E.g.
defer 50 # do something in 50ms
You might counter that it wouldn't work for variables that are numeric, but 99% of the time setTimeout's are used with a hard coded numeric value.
weepy: I think that's more of a library issue - special casing in the parser to deal with deferred numbers instead of deferred calls seems a bit of a strange way to handle it. You could do this with a simple library function:
sleep: (delay, cb) -> setTimeout(cb, delay)
then just use it as:
defer sleep 50
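For readers who have not tried the branch, here is a rough hand-written JavaScript sketch of what a `defer sleep 50` line followed by more code is meant to turn into. It is not the compiler's actual output; `run` and `log` are hypothetical names introduced only for this illustration.

```javascript
// Callback-last wrapper around setTimeout, mirroring the `sleep` helper above.
function sleep(delay, cb) {
  setTimeout(cb, delay);
}

// Hand-written equivalent of:
//   log.push 'before'
//   defer sleep 50
//   log.push 'after'
// Everything after the defer line moves into the callback.
const log = [];
function run(done) {
  log.push('before');
  sleep(50, function () {
    log.push('after');
    done();
  });
}

run(function () {
  console.log(log.join(', '));
});
```

The parser needs no special case for numbers: the delay is an ordinary argument, and only the callback position is filled in by defer.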
mrjjwright: I plan to keep the deferred branch up to date (maybe at a lag of a week or so, because merging is not exactly my top priority). However I can't promise that the syntax won't change (that is, after all, part of the point of this issue / discussion). Having said that, I plan to port my currently in-progress javascript-heavy webapp to using defer when I get the time.
I'm so giddy with excitement over the possibilities for readable nodejs code (as an example)
Okay, I've just pushed a fairly significant change, which cleans up and clarifies the tricky bits of the original attempt. I now believe my branch is in a good state to consider merging into the master. Pity that jeremy has just disappeared for a couple weeks, but at least I shouldn't have to do too many more big merges while he's gone (they are not much fun).
If you want to see the differences, you can take a peek here: http://github.com/gfxmonk/coffee-script/compare/master...deferred#diff-8 (nodes.coffee is where most of the changes have been made)
It's a big diff, but it's also the first change of its kind. Most of the CS compiler translates one thing into a more primitive version of it (i.e javascript). The deferred machinery, on the other hand, does a lot of rewriting nodes into a manner that will work when calls are "resumed" after an asynchronous call.
The first part of the machinery is basically to pull out deferred calls to the beginning of an expression. That is, if a complex expression contains a deferred call then that call is executed first, and the rest of the expression gets evaluated inside the callback provided to the function. This makes all deferred results available at the time when they are needed - because you can't just pause execution in the middle of an addition operation, for example.
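As a rough illustration of this first step (a hand-written sketch, not the branch's actual output), an expression such as `total: 1 + defer fetch_count()` has to run the deferred call first and evaluate the addition inside the callback. `fetchCount` and `compute` are hypothetical names, and `fetchCount` calls back synchronously here only so the example is easy to run.

```javascript
// Hypothetical callback-style function; a real one would call back later.
function fetchCount(cb) {
  cb(41);
}

// Sketch of:
//   total: 1 + defer fetch_count()
//   callback(total)
function compute(callback) {
  // The deferred call is pulled out in front of the expression...
  fetchCount(function (_a) {
    // ...and the rest of the expression is evaluated once its result exists.
    const total = 1 + _a;
    callback(total);
  });
}

let result;
compute(function (value) { result = value; });
console.log(result);
```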
The second (and quite unsightly) part of the machinery is the rewriting. Just as you can't pause execution in the middle of an addition operation, you also can't fire off a call from within an if block and expect your control-flow to still work when your callback is called. In fact, the callback would have to include the rest of the if branch, as well as any operations that sit after the end of the if block in the original code.
So that's where make_control_flow_async comes in. This method is implemented on all nodes. By default it just propagates the call to its children. But in the case of nodes that affect control flow (IfNode, WhileNode, ForNode), it does something pretty invasive. Basically, there's a way to transform each of these nodes into a "flattened" version which manages control-flow by explicit continuations. This method (for each of those nodes) generates the CoffeeScript node objects after that transformation is applied. Because they build up a reasonably different source tree, I've introduced a couple of builder objects (down the bottom of nodes.coffee). I can move them into another file if people feel they clutter the already-huge nodes.coffee file. These builders have named methods for common rewriting operations that make it more obvious what transformation is happening, and why.
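To make the "flattened" idea concrete, here is a hand-written JavaScript sketch of an if block containing a deferred call. It only illustrates the general shape of the transformation under my own assumptions; the names `loadDefault`, `greet` and `afterIf` are hypothetical and the branch's generated code will look different.

```javascript
// Hypothetical callback-style function used for illustration.
function loadDefault(cb) { cb('default'); }

// Sketch of:
//   if name is undefined
//     name: defer load_default()
//   callback('hello ' + name)
function greet(name, callback) {
  // Code that followed the if block becomes an explicit continuation,
  // so that both branches can reach it.
  function afterIf() {
    callback('hello ' + name);
  }
  if (name === undefined) {
    loadDefault(function (_a) {
      name = _a;
      afterIf();   // resume after the asynchronous branch
    });
  } else {
    afterIf();     // resume directly when nothing was deferred
  }
}

const greetings = [];
greet(undefined, function (g) { greetings.push(g); });
greet('weepy', function (g) { greetings.push(g); });
console.log(greetings);
```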
I'm happy to explain specific parts of the implementation, if people are interested.
I just came across this; it's very exciting. I'd like to share my vote for positional argument syntax:
defer setTimeout(continue, 100)
The defer introduces the call that is being deferred to, and (conceptually) binds continue to the continuation that begins on the next line. If you are using this syntax, it is because the function being called focusses on the callback, and I think continue is the clearest description of what the continuation-as-callback does.
Obviously, this could be a problem since continue is already a statement, but I don't think there will be overlap, and it is kind of similar, both being control constructs. Other possibilities are resume, continuation, or cc.
mckeed: continue could be confusing because of the existing use (even if it's not ambiguous to the compiler).
jashkenas: any thoughts on the possibility of merging this in sometime? You haven't said a word on this since you came back, not sure if you're busy elsewhere...
I was wondering about an alternative syntax using the <- symbol. Something like
err, data: mongo.find {user: 'nicky'}, <-
throw err if err
process data
So the idea is that the arrow points the other way to indicate that the tree branch is being unwound, but keeps parity with the normal CoffeeScript. It also allows for arbitrary positioning of the callback, e.g.
setTimeout <-, 100
run_delayed_code()
I think it works nicely because it still mostly looks like normal coffeescript (I found defer to be a bit ugly)
Hey gfxmonk. I've been trying to knock out most of the smaller tickets before tackling the big ones, like defer.
There are a couple things you could do to facilitate this...
test_deferred.coffee is supposed to test all of the possible scenarios, it would be good to have a reference Gist that shows the ways you are supposed to use deferred.
I use => quite a lot, so it would be important (for me) to be able to specify that the function should be bound.
@weepy:
I think it works nicely because it still mostly looks like normal coffeescript (I found defer to be a bit ugly)
It's probably unwise to make it look too much like coffee-script, as it still has different semantics (return, for example). I personally get too confused by a language having every imaginable type of ascii arrow, so I'd prefer to steer away from "<-".
As for "=>", I'm not sure what it would mean for a deferred function to be bound - that's about function definition, not a function call...
@jashkenas: geez, that's a lot of changes. I get distracted writing android apps and all the underscores die ;)
ah - after seeing your examples, i can see that my comment about => isn't really appropriate. so am i right that:
this
?
it would be great to see some more of those examples you've got there - perhaps along with the exact output JS ? -- so we can get a good feel for it.
gfxmonk: Can you explain how defer changes return semantics in your branch? (Presumably other semantics like break and continue) after a return as well... I can see how they'd be problematic, but I'm not precisely sure how you're handling it.
weepy: yep, that is correct.
As for this, it's what you'd expect - it's restored after a defer - that is, whatever this is at the start of the function is maintained after making a deferred call. this inside a deferred function call will be whatever it normally is (i.e you can use the fat arrow to bind this at definition time, as you already do).
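One common way to get that behaviour in hand-written JavaScript is to capture `this` before the asynchronous call and use the captured value afterwards. The sketch below shows that technique only; whether the branch restores `this` in exactly this way is an assumption on my part, and `fetchName`, `greeter` and `_this` are hypothetical names.

```javascript
// Hypothetical callback-style function.
function fetchName(cb) { cb('gfxmonk'); }

const greeter = {
  prefix: 'hi ',
  // Sketch of a method containing:
  //   name: defer fetch_name()
  //   callback(this.prefix + name)
  greet: function (callback) {
    const _this = this;            // captured before the deferred call
    fetchName(function (name) {
      // `this` inside this callback is not `greeter`,
      // so the captured value is used instead.
      callback(_this.prefix + name);
    });
  }
};

let greeting;
greeter.greet(function (g) { greeting = g; });
console.log(greeting);
```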
The exact JS output is not exactly beautiful, but I've now added it to the gist for illustration's sake ( http://gist.github.com/445525 )
jashkenas; regarding return. With defer, you're still writing your functions in a callback style. That is, you take a callback, and you call it with one or more arguments instead of returning a value. The defer machinery makes this look nicer on the call side, so that you don't have to pass an actual function as the callback, it is constructed for you.
Keeping that in mind, there's no difference in return semantics to asynchronous coffee-script. If you return instead of calling your callback, execution will silently cease. So it's the same as normal async code, but that's obviously different to plain-old-procedural coffee-script, which is why I was discouraging making defer resemble procedural code too closely.
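A small hand-written JavaScript sketch of the difference, using hypothetical names (`readValue`, `doubled`, `doubledBroken`). It shows ordinary callback-style behaviour, which is what the defer branch keeps, rather than anything specific to the compiler.

```javascript
// Hypothetical callback-style function.
function readValue(cb) { cb(21); }

// Calls the callback: the caller's continuation runs.
function doubled(callback) {
  readValue(function (v) {
    return callback(v * 2);
  });
}

// Returns instead: the value is handed back to readValue, which ignores it,
// so the caller's continuation never runs.
function doubledBroken(callback) {
  readValue(function (v) {
    return v * 2;
  });
}

const seen = [];
doubled(function (v) { seen.push(v); });
doubledBroken(function (v) { seen.push(v); });
console.log(seen);
```

Only the first function ever reaches its caller; the second one stops without any error.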
I'm happy to report that continue and break are indeed problematic, but as far as I know, they are handled properly in all cases :)
gfxmonk: looks like a couple of things need to be cleaned up in the generated code. To quote:
return _h(undefined); //continue;
return undefined;
});
};
return _h(undefined);
return _g(undefined);
No need to double-return in either of those places, is there?
Also, for the common-case defer, it would be great to write this:
myFunc = function(callback) {
  var _a;
  someFunction(1, 2, 3, function(_a) {
    var result;
    result = _a;
    doSomeThingWith(result);
    return callback(true);
  });
};
As this, without the variable juggling:
myFunc = function(callback) {
  someFunction(1, 2, 3, function(result) {
    doSomeThingWith(result);
    return callback(true);
  });
};
whew, that was a tricky one. all merged and pushed now :)
hmm. isn't it confusing that the return doesn't actually return - even though it looks a bit like vanilla coffeescript ?
I think the comprehension parallel processing is cool (something that's rather ugly to do by hand - and I have to do it quite alot) - but I'm not entirely sure what else defer brings to the table, other than "I don't need to indent my callbacks" ? Perhaps I'm missing the point .....
Weepy: defer's usefulness doesn't really show up until you've got lots of calls using it from the same function.
do_stuff: (input, output_style, callback) ->
  [data, coding]: defer parse_input(input)
  processed_data: defer process_data(data, coding, 'random_argument')
  output: defer generate_output(processed_data, output_style)
  callback(output)
is it really anymore readable than
do_stuff: (input, output_style, callback) ->
  parse_input input, (data, coding) ->
    process_data data, coding, 'random_argument', (processed_data) ->
      generate_output processed_data, output_style, (output) ->
        callback output
weepy:
hmm. isn't it confusing that the return doesn't actually return - even though it looks a bit like vanilla coffeescript ?
i'm not sure what you mean - you still have a callback parameter, so you still need to return into it. I have some ideas about making that look more normal, but don't want to pollute the core feature with optional enhancements before it's merged. This was also a large source of contention between matehat and I during the initial implementation, and we both came to the agreement that this was a side matter that should be dealt with independently (if at all).
weepy (#2): you're right, that example is mostly cosmetic. I still think it's useful, but doesn't necessarily justify itself in terms of complexity. The real advantage of the defer feature comes when you use a defer inside a loop, or a comprehension, or an if branch. I've done that manually myself in plain javascript, and it's utterly ridiculous (and very easy to do wrong).
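To show why doing this by hand is awkward, here is a JavaScript sketch of a sequential loop with one asynchronous call per iteration, written with explicit continuations. It is my own illustration of the manual technique, not output from the branch; `fetchItem`, `fetchAll` and `step` are hypothetical names.

```javascript
// Hypothetical callback-style function.
function fetchItem(id, cb) { cb('item' + id); }

// Sketch of:
//   for id in ids
//     results.push(defer fetch_item(id))
//   callback(results)
function fetchAll(ids, callback) {
  const results = [];
  function step(i) {
    if (i >= ids.length) {
      return callback(results);   // loop finished: run the code after it
    }
    fetchItem(ids[i], function (item) {
      results.push(item);
      step(i + 1);                // move on to the next iteration
    });
  }
  step(0);
}

let items;
fetchAll([1, 2, 3], function (r) { items = r; });
console.log(items);
```

The loop counter, the exit condition and the code after the loop all have to be threaded through `step` by hand, which is where mistakes tend to creep in.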
jashkenas: yes, there's certainly some unoptimised code in there. For the first cut, my approach has been to favour simpler compiler code over concise output. I'm reluctant to do much optimisation just yet, because:
Both of these points would become much less of an issue if the code gets merged into the mainline (I won't have to merge so hard, and changes that break deferred functionality should be found at commit time rather than at merge time).
I'll have a look at the specific examples there, but it may not be worth fixing at this point if it complicates the code further.
gfxmonk. yes - I agree it's certainly very useful inside a comprehension - it is horrid without. Would you mind providing some more examples inside a loop and some if branches? I think the real use cases are extremely useful in showing off your work. Personally I'm still not sure about the defer in normal javascript outside of the loop: I think it greatly obfuscates the real meaning of the Javascript without really making it much prettier (indentation vs defer keyword).
weepy: I don't have many real examples, the test suite is currently probably the best place to look (the coffee-script really is quite readable, with descriptions of what each block demonstrates): http://gist.github.com/448286 If you search for the description of the test you're interested in, you should be able to locate it in the compiled javascript fairly easily (here's a gist: http://gist.github.com/448286 )
Also I forgot to mention that another nicety of defer is that "defer some_call()" is a valid expression, and can therefore be placed within other expressions, rather than having to structure your statements around it.
I personally feel the lack of indentation is a good thing - especially when you do, say, 5 deferred things in a row (as I have). Your code should read top to bottom, not top to squishy-bottom-right ;). But the main drive behind wanting to do this was for the more complex cases.
gfxmonk: I think we're at a bit of an impasse on this ticket. I can't really merge it back to master until things like the erroneous double-returns and the semantics-of-returning-from-a-defer are fixed and settled. Is this something that you want to tackle, or should we just put defers out of their misery at this point?
jashkenas: I've spent a lot of effort on this, and I really think it's a smashingly useful feature. I'm not directly using CS for anything any more, which is why my intensity has waned a little lately. But I think this could be a feature to really make CS shine for new uses (uses for which there currently exists no useful alternative). So I'd very much like to see it merged in so that everyone can actually start using it (as a handful here have already said they would like to).
with respect to those specific problems you mentioned:
To reiterate, there is nothing wrong or semantically awkward with leaving returns as they are - at least no more awkward than they already are in asynchronous-style code. There are opportunities to clean them up and make them nicer for some use cases, but that can be said about any language feature.
On the subject of optimisations, I've been looking specifically at the duplicate return case this evening. It's doable to eliminate this oddity, but it introduces more functionality that is only for the sake of defer (I have been trying to keep the footprint on the compiler LOC small). It also only eliminates one (admittedly common) use case. Eliminating cases where a variable is unnecessarily assigned only to be returned immediately For that gain, it adds nonessential logic which clouds the essential (i.e strictly required for the feature to function) logic. The logic required for the deferred functionality to even work is nontrivial, and I suspect that adding optimisation code into the mix will confuse matters greatly.
I am hardly a compiler writer, so I'm finding this challenging. I do know that old-school compilers (like gcc) perform optimisation in a separate step. That is, code generation is responsible for correct code, and the generated AST is passed to the optimiser which is then responsible for simplifying all the patterns that are unnecessary (for example, double returns, or assigning to a new variable when the original value was already just a value instead of a complex expression). I really don't know if this could work for coffee-script, because it's so dynamic and in most cases very closely aligned with its idiomatic javascript. But I suspect that trying to perform both code generation and optimisation for nontrivial transformations such as defer is going to confuse the logic terribly.
Glad to hear you're still onboard with it, thanks for the notes. Now that 0.7.0 is out, I'll take another stab at a merge and poke around sometime this week.
Let's start with the return issue. It seems to me that this gets to the heart of defer. Imagine you have a function that uses readFileSync to return a file:
read: (path) ->
  code: fs.readFileSync path
  return code.toString()
puts read __filename
It will print out the contents of the file. You want to change it to make it asynchronous, and so you use defer:
read: (path) ->
  [err, code]: defer fs.readFile path
  return code.toString()
puts read __filename
The return, of course, fails, because even though it's got access to the code, its caller has vanished because of the defer. gfxmonk: how would you deal with this?
jashkenas: currently, making a function async is not quite that simple. You would also have to add a callback parameter, and call it in your return. That is...
read: (path, cb) ->
  [err, code]: defer fs.readFileAsync path
  return cb code.toString()
this is no different from current async code, except that there is no explicit second argument given to readFileAsync (aside: you have to use readFileAsync with defer, you can't just use readFile). The defer mechanism constructs a continuation argument to pass into functions that expect a callback argument - it does not (currently) alter returns.
My original pass did change the returns in the calling function, by appending a new (hidden) callback parameter, and changing returns to always call it. I no longer think this is a good idea (because it's surprising and doesn't allow enough flexibility). We discussed this previously, here http://github.com/jashkenas/coffee-script/issues/closed#issue/287/comment/170719
we came to the conclusion (with support from others, not just matehat and I) that adding just the defer keyword would be a good first step, in order to let people get used to the feature and see how they want to use it or what else they need from it. As another practical aspect, there were differing opinions of how to alter returns when used with deferred calls (see that thread for some of them).
I still think it's a good idea to have these issues separated. Having said that, if you don't want to merge in the defer branch without also including a change to the way returns are handled, I still think my second proposal here ( http://github.com/jashkenas/coffee-script/issuesearch?state=closed&q=return#issue/351 ) is the best idea for denoting that returns should happen "into" a callback:
my_func: (arg1, arg2, return cb) ->
  value: do_some_stuff()
  return value
which will signify that "cb" is the designated return callback, and thus substitute "return expr" into "return cb(expr)"
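As a sketch of that substitution, the example above might compile to something like the following JavaScript. This is only my reading of the proposal, not implemented behaviour; `doSomeStuff` is a hypothetical stand-in so the snippet runs on its own.

```javascript
// Hypothetical stand-in for do_some_stuff().
function doSomeStuff() { return 7; }

// Sketch of what
//   my_func: (arg1, arg2, return cb) ->
//     value: do_some_stuff()
//     return value
// might compile to: `return expr` becomes `return cb(expr)`.
function myFunc(arg1, arg2, cb) {
  const value = doSomeStuff();
  return cb(value);
}

let received;
myFunc(1, 2, function (v) { received = v; });
console.log(received);
```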
gfxmonk: but this is the point I'm trying to make. The idea with defer is to make async code appear to be sync, syntactically. return, and other things like non-final-position callbacks, break this illusion. To go back to our example:
read: (path) ->
  [err, code]: defer fs.readFileAsync path
  manipulate code
  transform code
  return code
I understand that this doesn't work -- but look at what the function seems to be visually. To all appearances, it looks like you're still in the body of the read function. It looks like you should still be able to return a value. Needing to understand that you have to take an extra callback argument, and then call into it, and then use defer from the calling site, is an unacceptable level of complexity for this feature.
Defer is spooky action at a distance, having unobvious effects on the entire remaining body of the function -- and there's nothing wrong with that -- if we can encapsulate it cleanly, and hide the transformations. If it's too hard or impossible to hide the effects of the transformations at the language level, then I can't merge it in with a clean conscience.
For folks who want to play with this, the gfxmonk/deferred branch is in a really good place for mucking around with. I recommend checking it out and trying to run this test file:
fs: require 'fs'
[err, code]: defer fs.readFile __filename
puts code.toString()
Since functions cannot return a value (in the traditional sense) after a defer statement, perhaps it would be prudent, for the moment, to forbid using both return and defer within a single function. This would make it easier to visualize what happens inside a function which uses defer, since a return statement would never be allowed to appear somewhere unless it represents an actual, synchronous return to the caller. Asynchronous and synchronous functions would be visibly distinct from one another. This would also leave open the possibility of reintroducing a semantically-different return statement for asynchronous functions in the future.
Making sync and async functions look visibly distinct from one another would seem to suggest using indentation (it's CoffeeScript after all) to delimit the scope of the defer. Here's an example that wants to return the event emitter from readFile to the caller as well as do work in the async call.
read: (path) ->
  eventEmitter: defer fs.readFile(path) err, code
    manipulate code
    transform code
  return eventEmitter
Oops, looks familiar, doesn't it?
read: (path) ->
  eventEmitter: fs.readFile path, (err, code) ->
    manipulate code
    transform code
  return eventEmitter
That being, of course, the current way to write it with a callback. In my opinion, defers are valuable if they can make async code appear to work synchronously, by transforming the surrounding code into continuations. If they work so differently that we have to distinguish them visually, then the battle is lost.
Do not want. In our use case we have separate error and success callbacks, and the syntax is not obvious that you are using an async block - so I'm against.
Adding my two cents here. This adds a layer of complexity on top of CoffeeScript that I'm not sure is necessary. My brain's wired to think of indentation as a callback - not seeing the indentation makes me think it's all happening synchronously.
I agree with @jashkenas' last assessment here. The two blocks of code are almost equivalent with one having an entirely new syntax that doesn't carry over from JavaScript. With the second one, at least I can apply the "this is how to define a function" knowledge inside CS to the second parameter and realize it's a function, so it must be a callback.
Cool stuff, though, just not sure it belongs at the language level.
I plan to write up a longer post (hopefully over the weekend) on the defer functionality, its rationale and why it's tremendously useful. I'm hoping giving a thorough overview of it will help people understand why I think it's so important. For now, I have a few specific points to respond to:
sethaurus: yep, it's probably a good idea to prohibit returning after a defer. it's always been my opinion that using both is a terrible idea.
jashkenas: there are two issues with your trivial example suggesting indentation:
bnolan: are you arguing against defer, or against the suggestion that it should be indistinguishable from synchronous code? In the current defer branch, you would still be taking a success and callback function - is that (plus the use of the very specific "defer" keyword) not sufficient to imply async?
twicegood: disregarding the additional code in the compiler, the extra complexity exists only for people writing async code. You could still write it with callbacks if you wanted, but once you actually write a significant chunk of programming using an asynchronous API, you will not want to.
oh, and also this is unfortunately something that cannot really be done at a library level - I wouldn't be trying to change a compiler if it could be. See async.js for an attempt - it's one of the most clunky and error-prone libraries I've ever tried to use, by no fault of the designer.
previous issues: http://github.com/jashkenas/coffee-script/issuesearch?state=closed&q=narrative#issue/241 http://github.com/jashkenas/coffee-script/issuesearch?state=closed&q=defer#issue/287
So, it's back!
For those who didn't read or can't remember the previous discussions, I have been working on adding a "defer" semantic into coffee-script. This is aimed at making asynchronous coffee-script less painful, by providing a continuation-like callback object for the current function (at compile-time).
For example:
Would be transformed into the following javascript (or its equivalent):
Special care has been taken such that the following use-cases work as expected:
[err, result]: defer some_call()
These are non-trivial, so there may be bugs if you do something utterly weird - let me know! ;)
So, please do check out my deferred branch (http://github.com/gfxmonk/coffee-script/tree/deferred) and let me know what you think. Is it a good idea? Should it be (eventually, after some more cleanup) merged into master? Are there any glaring omissions or bugs? Be sure to look at test/test_deferred.coffee, it has over 30 different tests ensuring that as many language features as I could think of work when defers appear in various locations.
(note that the tests in this branch rely on my coffee testing project, coffee-spec (http://github.com/gfxmonk/coffee-spec). I couldn't have managed this many complex tests without it, and it's hopefully coming to coffee-script itself officially sometime soon)
Stuff not yet done
defer in place of a positional argument - it shouldn't be too hard though.
So... thoughts?