keystonejs / keystone-classic

Node.js CMS and web app framework
http://v4.keystonejs.com
MIT License
14.64k stars 2.21k forks source link

Keystone.Email should present a consistent async interface #329

Closed moll closed 10 years ago

moll commented 10 years ago

Hey,

I was skimming through Keystone.Email and saw that it presents an inconsistent interface when it comes to calling any given callback. They're sometimes called sync and sometimes async. Zalgo at play!

For the benefits of a consistent interface, please see:

Cheers

JedWatson commented 10 years ago

@moll I'm very sold on consistent interfaces!

Where are you seeing inconsistent interfaces in Keystone.Email?

The Email class could probably use a good review (not to mention some documentation on keystonejs.com) but on cursory glance it doesn't look like the methods are inconsistent:

new keystone.Email(options) is the constructor Email.prototype.render = function(locals, callback) is async Email.prototype.loadTemplate = function(callback) is async Email.prototype.compileTemplate = function(callback) is async Email.prototype.loadTemplate = function(callback) is async Email.prototype.compileTemplate = function(callback) is async Email.prototype.send = function(options, callback) is async

All you really need to call is the constructor and .send(options, callback) - everything else is handled internally, but available externally for edge cases (such as rendering an email without sending it, e.g. for a preview UI)

... am I missing something?

moll commented 10 years ago

As I last inspected, errors were returned synchronously. Also, when an email's template was already in the cache, compile (or something it was called) returned it synchronously. I just started inspecting the call stack from Email.prototype.send and followed it through.

Having a callback argument isn't enough to ensure asyncness — you've got to make sure every path out of the function invokes the callback at least in the next tick. ;-)

JedWatson commented 10 years ago

Just stepped through it in more detail - aside from errors in the constructor (which are thrown) all errors are passed back in the node style callback(err).

The only values that are returned are when the return statement is combined with calling the callback as a shortcut for an early bail-out based on some condition, e.g.

if (!options.fromName || !options.fromEmail) {
    return callback({
        from: 'Email.send',
        key: 'invalid options',
        message: 'options.fromName and options.fromEmail are required'
    });
}

compile doesn't return anything, just ensures the template is in the cache then calls the callback.

Having a callback argument isn't enough to ensure asyncness — you've got to make sure every path out of the function invokes the callback at least in the next tick. ;-)

100% agree. I'm usually very careful with my code.

If you can point me at a line where it's actually doing the wrong thing I'm happy to fix it, I'm just not seeing it.

moll commented 10 years ago

Sure, the first that I remember now that forced a lot of its callers to also act synchronously was https://github.com/JedWatson/keystone/blob/master/lib/email.js#L238.

moll commented 10 years ago

The algorithm to find this was to go step by step from the send function up and ensure every callback call runs in the next tick. Email.prototype.compileTemplate broke this contract and made every caller synchronous as well.

JedWatson commented 10 years ago

Right. After a conversation with @balupton and reading through those posts again, I see what you're getting at.

Unless I misunderstand this is in an awkward middle-zone @izs doesn't quite answer in his post.

We either already have the template compiled (and can therefore callback immediately), or need to compile the template (which will happen asynchronously, for good reasons).

Under "Avoid Synthetic Deferrals":

these synthetic deferrals should be treated as a code smell. They are a sign that your API might not be optimally designed.

... don't like the sound of that. Also:

Note that performance actually does matter much more often than you probably realize.

... given that an email would usually be sent many times per application instance, I think we fall into this category:

If the result is usually available right now, and performance matters a lot

But the first time it won't be available; we compile on demand for obvious reasons. That's not a good reason to return an error code or push the handling of the deferral to the user. So maybe we can't be in this category after all.

Maybe we're in this one:

If the result is usually available right now, but performance doesn’t matter all that much

... see above about performance mattering more than you probably realise. If you were to loop Email.send() to blast out thousands of emails, I'd like it to perform as well as possible.

I'm not sure the best course of action here.

For all the implementations I've used this in either the callback to Email.send just logs errors while the main thread gets on with returning something to the user (because who wants to wait for an email to compile and send), or the callback takes over the feedback code-path and there's nothing else for it to conflict with. I can see how this isn't guaranteed to be the case with other applications though, and you're right, potential Zalgo.

Littering the file with process.nextTick(callback) solves the problem - maybe that's worth the (probably trivial) performance penalty.

On another note, I'm surprised that mandrill.messages.send doesn't happen asynchronously. From what I can understand, it's not so much that

Email.prototype.compileTemplate broke this contract and made every caller synchronous as well.

... but that nothing else is actually forcing async, other than thefs.readFile in the loadTemplate function.

Maybe the best approach is to wrap the early callback in compileTemplate with process.nextTick so that one's consistent, then always defer the mandrill.messages.send call too so it always happens asynchronously.

Feedback very welcome.

moll commented 10 years ago

Absolutely, I prefer sync over async any day, too, but yeah, if it takes a callback, it better be async always. Otherwise there's no point for the callback, right? ;-)

Maybe the best approach is to wrap the early callback in compileTemplate with process.nextTick so that one's consistent, then always defer the mandrill.messages.send call too so it always happens asynchronously.

Thumbs up from me.

... see above about performance mattering more than you probably realise. If you were to loop Email.send() to blast out thousands of emails, I'd like it to perform as well as possible.

Sure. But write a benchmark first and ensure that contemplation over a single deferred callback is even worth a second of your time. One poppy already said that 40(!) years ago. ;-)

"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."»

balupton commented 10 years ago

+1 for eliminating zalgo (when a function that receives a callback may finish either synchronously, or asynchronously)

+1 for performance

I don't think the performance hit would be that bad, considering nextTick doesn't have any idle time associated with it. Somewhere we used setImmediate before (I can't remember the use case). Regardless, the potential issue here is that it could cause code to act in an unexpected way. For instance, the callback of your zalgo async call would execute before the rest of your callee.

E.g. this is the risk zalgo introduces:

// do something 1

somethingWithZalgo(function(err){
  // do something 2
})

// do something 3

The issue is that "do something 3" could execute before "do something 2" if the zalgo method executed asynchronously, or the opposite if the zalgo method executed asynchronously. This can cause an unaware user to spend a while debugging some unexpected results.

This is only an issue where the function could run synchronously or asynchronously. If a function that receives a callback always executes synchronously, that's fine (e.g. Array.prototype.filter), if a function always executes asynchronously, that's fine setTimeout. If a function can do either or, that's the issue.

isaacs commented 10 years ago

The worst possible thing you can ever do is release Zalgo.

Any other hazards, smells, downsides, etc. are nothing compared to the hazard caused by having a maybe-sync interface.

Either be always async (and process.nextTick when the output is already cached), or be always sync (and throw errors when the output is not cached, requiring the user to do some async load() method to prime the cache and try again).

But never ever have the same function take a callback, and sometimes call it right now (before returning), and sometimes call it later (after returning). That releases the Dark Pony of Madness.

JedWatson commented 10 years ago

Thanks for the feedback / education guys, much appreciated, and @moll for pushing it in the first place.

I've done some more reading since and found Caolan has a good write-up on it too in his nodejs style and structure post including a specific example with caching. Nowhere near as entertaining as @isaacs's though ;-)

The Email class should be safely always-async now, I'll step through the rest of Keystone too and see if this could happening anywhere else.

Cheers!