nodejs / node-v0.x-archive

Moved to https://github.com/nodejs/node
34.41k stars 7.31k forks source link

process.nextTick should be cleared after every v8 invocation #3335

Closed isaacs closed 12 years ago

isaacs commented 12 years ago

Change the nextTick semantics to allow IO starvation. Immediately after every v8 invocation, process all scheduled nextTick callbacks, so that no IO can happen in the intervening time between scheduling a nextTick and any other callbacks being fired.

creationix commented 12 years ago

The behavior I want (which I believe is what @isaacs is proposing) is that after every event source (either I/O or timer) we process the queue of nextTick callbacks. A simple implementation to show what I mean would be

var queue = []; // Obviously this would be replaced with a fast queue in the real code
function nextTick(fn) {
  queue.push(fn);
}

// later after the tick for I/O or timers returns back to the event loop root.
function processNextTicks() {
  var fn
  while(fn = queue.shift()) {
    fn();
  }
}

This is extremely easy to implement and easy to explain. If a nextTick fn registers another, it simply gets put at the end of the queue. All nextTicks will execute before any I/O or Timer events can happen.

The only downside is it's now possible to cause an infinite loop if you recursivly register nextTicks forever. We could easily add a flag in the queue object that records how many generations of queues it processes and throw a warning after some high number. (Or keep it simple and let people hang their servers, they will figure it out quick enough).

I think the name nextTick is fine because it is a new tick. It's just inserted in the event queue after any existing nextTick events, but before any real events. It's an event it it's own right with it's own stack. A recursive nextTick loop will never stack overflow because it would work like a trampoline.

shigeki commented 12 years ago

I'm +1 on @creationix to have a recursive nextTick loop. In the Japanese node.js mailing list, I have ever asked the reason why SIGINT is not working in the below code as

process.on('SIGINT', function (){
  console.log("interrupted.");
  process.exit(200);
});
process.on('exit', function (){
  console.log("exit.");
});
var c = 0;
while(true){
  console.log("foo" + c);
  c++;
}

And I answered him to use the recursive nextTick() not to block event loop as

process.on('SIGINT', function (){
  console.log("interrupted.");
  process.exit(200);
});
process.on('exit', function (){
  console.log("exit.");
});
var c = 0;
function output() {
  process.nextTick(function() {
    console.log("foo" + c);
    c++;
    output();
  });
}
output();

The concept not to stop the event loop is very important for users to make the most of node.js performance. I think that the recursive nextTick() is one of the methods to achieve it.

creationix commented 12 years ago

@shigeki While technically a recursive nextTick loop in current node won't block the event loop, it will still busy-loop the process and is a bad pattern. I'd rather nextTick be able to block the event loop if used wrongly than have the current impossible-to-understand semantics.

creationix commented 12 years ago

Something I've wanted for a long time is node was a way to register an event-source hook. A basic hook would be a callback that I could register to be called after every event source tick (I/O or timers). A more advanced hook would allow me to insert a frame into the stack of every event source.

For example:

process.sourceHook = function (fn, type, args) {
  console.log("About to call an event source handler for %s type event", type);
  var ret = fn.apply(this, args);
  console.log("Done calling event source");
  return ret;
};

This would make it possible to implement something like node domains in pure JS without any modification to node itself (Using a clever V8 stack trace API trick). I could implement my own nextTick right before returning. I could audit the system for async events, log how long my handlers took. There are a million possibilities for this hook. If process.sourceHook was undefined, then it would skip this step and call fn directly internally.

I bring this up in this thread because it would allow node to do what makes the most sense for nextTick by implementing Isaac's proposed change, but anyone wanting more power would have it via this hook.

xavi- commented 12 years ago

I like this change as well. Though, up until I read this ticket (and the corresponding google groups thread), I always thought of process.nextTick as a faster version of setTimeout(..., 0). After this change it's something different, which is okay. The only thing I ask is to please put a warning about possible event-loop starvation scenarios in the documentation. An example would be nice too.

DougReeder commented 12 years ago

Would it be accurate to characterize current nextTick semantics as "after any pending I/O" and the proposed semantics as "before any pending I/O"? If so, would there be a way to achieve the old semantics? My sense is that there still a use for a "setTimeout(..., 0) with less overhead"

shigeki commented 12 years ago

@creationix I agree it is an impossible-to-understand semantics but we should provide novice users with remedies for such a bad pattern. It is good for us to have alternatives Instead of a recursive nextTick loop.

Sannis commented 12 years ago

@DougReeder, is it really breaks in case when process.nextTick(f) is used for setTimeout(f, 0)? I think it should not break it.

@isaacs maybe it is useful to copy reasoning from https://groups.google.com/forum/#!topic/nodejs-dev/DvENgK8Nubw to this issue?


I think that process.nextTick method name is better for set functions that will ececute on after next v8 phase of event loop. It is helpful for things that need to be executed on every event loop tick. If you will change process.nextTick semantics, please add something like @creationix suggested or at least a possibility to attach event handlers on every event loop tick. process.on('tickEnd') should be good solution.

qxfusion commented 12 years ago

to @isaacs maybe create multiple event loop? 1) I/O event 2) Network event 3) Timers and etc. 4) Use defined events

What are you think about it? It's can be more effective for very event intensive applications

creationix commented 12 years ago

To everyone wanting to use nextTick so they can run long-running CPU intensive code in node's main thread: First this is generally a bad idea. Even with manual chunking of the work and letting other events seep in, it's rather complicated and still hurts performance. Node's non-blocking model works great for doing I/O wait in parallel, but stinks for doing CPU work in parallel. At best you can manually interleave work on the same CPU while all the other cores stay idle. For long-running CPU intensive work, it would be much better to use threads or another process.

What's needed to do CPU chunks while the event loop is idle is an idle event. Abusing the current semantics of nextTick works somewhat, but it's not well understood and complicates things for the primary use of nextTick.

A low priority idle tick callback would be called repeatedly while the event loop queue is empty. Once an event comes in, it would stop calling the idle callbacks till the queue was empty again. If you wanted to make the callback slightly higher priority you could configure how many ticks it gets between active events.

process.on("idle", function () {
  // Do a small chunk of CPU intensive work
  for (var i = 1e6; i > 0; i--) {}
});

function afterEvent() {

  // After each real event source, busy loop on the idle events till another event comes in
  while (loop.idle() && process.listeners("idle")) {
    process.emit("idle");
  }
}

This could easily be implemented in user-space if we had my proposed event-source hook or something like it.

isaacs commented 12 years ago

@creationix Really good point: https://github.com/joyent/node/issues/3335#issuecomment-5984731 Thanks. Yes, that's what I was sort of thinking as well. nextTick is not low enough priority to work as an idle event, and it's not high enough priority to work as an instant callback.

creationix commented 12 years ago

@isaacs how would you feel about adding something like my "idle" event example above as a new API so that people depending on the current behavior of nextTick have something official to migrate to? Or how would you feel about having an eventSource hook so that things like this could be programmed in user-space?

Mithgol commented 12 years ago

Abusing the current semantics of nextTick works somewhat, but it's not well understood and complicates things for the primary use of nextTick.

What is the primary use of nextTick then?

How is process.nextTick(f) different from f() after the proposed change? I mean, if both would block I/O and if both would block other events (timer events, for example), then is there actually a case where such process.nextTick(f) would work better than a simple f() call?

Right now the only difference I can see is that process.nextTick(f) requests would still form some queue and thus would be handled in turns (in parallel), where f() blocks even parallel requests as well. But, as @creationix has mentioned, it would be much better to use threads or another process for doing CPU work in parallel anyway. How is this difference necessary then? Maybe it's better to leave process.nextTick(f) as an idle event processor (which it is now) and to use f() whenever you need the proposed (event-blocking) process.nextTick(f)?

Or maybe there is some other difference that I missed. Then, please, explain it.

creationix commented 12 years ago

The primary purpose of nextTick is to defer running some code till your caller has had a chance to register event listeners.

For example, suppose I have an in-memory cache on my database calls:

var cache = {};
function getItem(id, callback) {
  if (cache.hasOwnProperty(id) {
    var value = cache[id];
    process.nextTick(function () {
      callback(null, value);
    });
    return
  }
  doRealQuery(id, callback);
}

Without the next tick then code calling this function will have it's callback called even before the getItem function returns. This matters even more for eventEmitters.

function getEmitter() {
  var emitter = new EventEmitter();
  process.nextTick(function () {
    // we can't emit anything before we even return the object to our caller, so defer using nextTick
    emitter.emit("start");
  });
  return emitter;
}
qxfusion commented 12 years ago

to @isaacs for more CPU utitlization as variant - adding module https://github.com/xk/node-threads-a-gogo for simple unlock main loop and use all CPU power in one node process. (adding in node core space)

creationix commented 12 years ago

A cleaned up version of my "idle" event above could be implemented in node core and exposed as:


var work = [1,2,3,4,5,6];

// registerIdleWorker(inBetweenTicks, fn). Call fn repeatedly in a loop when
// the event loop is idle. If inBetweenTicks is > 0 then call fn that many
// times between each event even if the loop isn't idle. Auto deregister the
// worker when it returns a falsy value.
process.registerIdleWorker(0, function () {
  var next = work.shift();
  doSomethingExpensiveWith(next);
  return work.length;
});

I'm a little worried though that this will encourage people to do CPU intensive work in the main thead in JS. While it's probably the best way to do this, it's generally a bad idea to begin with.

qxfusion commented 12 years ago

to @creationix maybe more usable?

process.on("idle", function(next, ...) {
  forkcall(next)(/* async call in next idle time */)
})

1) shared variable between idle tick emit (as function arguments) 2) manage executing next call in idle time

Mithgol commented 12 years ago

In-memory cache example and especially EventEmitter example are good, but both do not rely on nextTick being event-blocking. They just need a tool to defer some code and thus they would happily live with the current nextTick implementation.

In what case do you need a proposed (event-blocking) process.nextTick(f) and f() won't suffice?

creationix commented 12 years ago

@Mithgol Like @isaacs summarized, the current nextTick semantics are too low-priority for the defer-till-caller-is-done case and too high-priority for the manual-cpu-task-slicing case. It's works mostly for both cases but not well for either.

I want well-defined and easy to understand semantics for nextTick that are fast and efficient. That is currently impossible.

In the current system nextTick is quite expensive and I/O or Timer events might slip in before your function gets called. The exact rules for when this happens are quite complicated in an effort to meet in the middle.

I don't just want to defer it till some time later, I want precise "queue this callback at the end of the nextTick queue, but before any other events". This makes for a much more efficient and easier to reason about system.

Mithgol commented 12 years ago

I understand now: you need process.nextTick() to have higher priority than other events (I/O and timers and idle workers) to use it exclusively for not-so-far deferring; and you need process.registerIdleWorker() to register a worker with lower priority than all of the above mentioned, so that the idle workers won't slow down that simple deferred code.

Thank you for the explanation.

Mithgol commented 12 years ago

Then it's like @qxfusion says: it would be beneficial for an idle worker to have a variable passed to its next iteration automagically. It's easier than having to dedicate a global variable (like work) to store that.

Here's my suggestion of the registerIdleWorker() interface:

/*
process.registerIdleWorker(inBetweenTicks, initialWork, fn)
*  Calls `fn` when the event loop is idle.
*  If `inBetweenTicks` > 0 then call `fn` that many times
   between events, even if the event loop isn't idle.
*  If `fn` returns a falsy value, unregister the worker.
   (No further calls even if `inBetweenTicks` > 0.)
*  If `fn` does not return a falsy value,
   pass the returned value to `fn` on next call.
*  The initial worker call is `fn(initialWork)`.
*/
process.registerIdleWorker(
   0,
   [1, 2, 3, 4, 5, 6],
   function (workload) {
      doSomethingExpensiveWith( workload.shift() );
      return workload;
   }
);

Usually that workload would be an array, and [] is already == false, so we won't need another exit flag.

qxfusion commented 12 years ago

to @Mithgol I'm think more usable create like this

/*
@param object/array taskListObject - source task list
@param function callback - callback function
if callback return "false" then stop idle executing list else continue executing by list
*/
process.addIdleWorker(taskListObject, callback) {
...
}

Example of use:

process.addIdleWorker(
[
 "console.log(100)",
 "console.log(200)"
],
function(entry) { return eval(entry); }
);

Display 2 message when Node detect IDLE state

Mithgol commented 12 years ago

Now, @qxfusion, you are going to require that a worker has a pre-defined task list (such as an array or an object to iterate over).

My proposal is more flexible, it also allows things like BinkD, some workers that just sit and wait (return true) and use their idle time to check something (like the contents of outgoing mail directories) until something interesting appears there and can be processed.

It also allows workers like Turing machines that have states and action tables with instructions more complex than array processing, they may do a step and return nextState until the state is zero and the worker's gone. (Actually the above is an over-simplification: real Turing machines also have their tapes, so their returned value would be closer to something like {state: '…', tape: '…'}.)

bjouhier commented 12 years ago

@creationix I fully agree with your statement that nextTick is currently too low priority for some use cases and too high priority for some others. What we need is two calls:

My only request is that the low priority one be more efficient than setTimeout(cb, 0) so that people who use nextTick today don't pay a penalty if they switch to this call. Another way to handle it would be to optimize setTimeout for the case where delay is 0.

Not sure that nextTick is the best name for the high priority one, though. afterTick would be more precise.

But nextTick is not a great name either for the low priority one because next can give the impression that it will really be fired next, that is before other events. laterTick would be more precise. setImmediate sounds like a good idea too.

But we can live with names that are a bit off. What matters is that we have two efficient calls for the two use cases that have been identified.

creationix commented 12 years ago

I think setImmediate is actually a good name for what it does. It's basically a setTimeout sans the timer. It's a normal event that goes as the end of the event queue just like a timer would.

Yes afterTick might be more precise, but I don't feel strongly about it. I'd be happy either way as long as it works in the new proposed way. I slightly prefer afterTick to nextTick, but I'll defer to @isaacs to decide that one.

coltrane commented 12 years ago

The promise library, Q, uses nextTick internally to schedule promise handlers to run asynchronously. This seems to be a somewhat unique use-case. When a long promise chain "unwinds" in Q we see an example where nextTick can be called "recursively" (though not infinitely). The usage is valid, and not well-served by an "idle" event. But neither does it require specific placement with respect to I/O events. Typically these bursts of nextTick activity are short-lived, and would only become substantial when the system is heavily loaded.

The proposed change will not break Q, but it will likely change the performance characteristics of existing systems that use Q under heavy load.

It's hard to know whether this will be an improvement or not. I suspect that it will improve (reduce) processing time for a single request; while moderately increasing avg latency (wait time in i/o queue) when the system is under high load. All-in-all, it may be a wash. But that's just a guess.

coltrane commented 12 years ago

Presently, node "ticks" twice-per-event-loop. Once in ev_prepare, and again in ev_check. The double-tick is what causes nextTick() to behave inconsistently.

It is my opinion that nextTick() should be kept as a sort of "low-priority heartbeat"; but it should be "fixed" by eliminating the ev_check Tick. Ticking only once, in ev_prepare, means that nextTick(fn) will always behave consistently, and in a way that can be easily described. This behavior is what most folks (outside "core") already expect of nextTick, and it's consistent with what the docs currently say about nextTick.

To address the needs Isaac, Tim, and others have outlined, I think we should introduce a new function that will work as Isaac has proposed. The name is mostly unimportant to me; though "nextTick", "thisTick", and "afterTick" all seem inappropriate for a function that really has nothing to do with "ticks". Something like process.asap(fn) seems more accurately descriptive, to me.

Finally, I support @creationix suggestion(s) for a super-low priority "idle" event, or a more full-featured "event source hook" that would support many different use-cases. However, I see these as separate from the current discussion. In the immediate, only two actions are required: (1) fix process.nextTick(), and (2) add process.asap() (or whatever it will be called).

xk commented 12 years ago

The "nextTick(f) vs SetTimeout(f, 0)" thread:

http://groups.google.com/group/nodejs/browse_thread/thread/b564ac42ac53e424/7b85530b465f578d

:-P

bjouhier commented 12 years ago

I digged into the mailing list and found the thread about nextTick and starvation. The funny part is that you clarified it for me. I thought it was Isaac but it was you!

https://groups.google.com/d/msg/nodejs/NBSZRyBl11E/KbBS1r6VSBQJ

Bruno

2012/6/1 Jorge Chamorro Bieling / @jorgechamorro < reply@reply.github.com

The "nextTick(f) vs SetTimeout(f, 0)" thread:

< http://groups.google.com/group/nodejs/browse_thread/thread/b564ac42ac53e424/7b85530b465f578d

:-P


Reply to this email directly or view it on GitHub: https://github.com/joyent/node/issues/3335#issuecomment-6064573

xk commented 12 years ago

And according to the current state of affairs, I was wrong... :-P Crazy, isn't it?

Cheers, Jorge.

networkimprov commented 12 years ago

Reiterating my note to the mailing list...

You can't discover the range and severity of problems a proposed change may cause by posting a description of it! The only way to find out is by releasing code. (This is a law of software.)

The way most established software shops address problems where different behavior is required is to implement it under a new API.

If the current behavior causes problems which sites cannot readily detect, or is triggered by third-party software which they cannot fix, then you must consider implementing the new behavior under the same API. Arguably, if they cannot detect it, it's not an issue for them.

In any case, semantics matter, and it's clear that the current API is not ideal for either behavior. This means it should be deprecated.

Since this is a pressing problem affecting major sites, I suggest you implement the new behavior under a new API in 0.7, ASAP.

ricardobeat commented 12 years ago

Rephrasing: +1 for deferring subsequent calls or a setImmediate, that will prevent people going back to setTimeout(f, 0), and satisfy these cases:

  1. split processing of an incoming data stream - parse one unit, nextTick() the next one to improve response times/CPU load
  2. better timer precision than setTimeout for specialized apps
ricardobeat commented 12 years ago

For the record, here is the current setImmediate spec: https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/setImmediate/Overview.html

nextTick is even used as a an example for the non-blocking scheduling behaviour: http://web.archiveorange.com/archive/v/YQtl0DcosDLglHnIyBba

coltrane commented 12 years ago

@isaacs, on nodejs-dev, you suggested:

we can probably figure out a way to make it work such that it runs the first nextTick at the end of the current RTC, but subsequent ones at some point get deferred.

After thinking about it a bit, I am opposed to this point. Yes, handling subsequent invocations as a special case will prevent event loop starvation, but it means that nextTick() will behave differently depending on where it's called. This same kind of inconsistent behavior is what we're trying to fix. Any useful fix must give us consistent, reliable behavior, no matter where nextTick is called.

Rather than trying to create a single function with special cases to please everyone, I would propose:

  1. Let's fix nextTick so that it behaves consistently, without changing the existing semantics. This should be a trivial fix.
  2. Let's implement a new function (I proposed process.asap()), that provides the needed RTC semantics.
  3. If we don't like the nextTick semantics, or we deem it to be useless cruft in the API, then let's deprecate nextTick.

While I realize that names are not a top priority, they are still important -- especially as new developers move to node, and start to learn the API.

My proposed approach lets the names of the functions accurately reflect their actual behavior. "Next tick" clearly suggests that something (a "tick") will happen (or finish happening) between now and when the callback runs. "ASAP", as soon as possible, clearly indicates that it will preempt other things that may be pending, and run the callback as soon as the current stack has unravelled.

@isaacs, obviously, you (and others) are opposed to implementing RTC as a new api; you feel strongly that the functionality belongs in nextTick. My question is: why? What would be wrong with implementing RTC as a new api, as long as we also fix nextTick so it consistently works as it is documented today (i.e. as a 'low priority', "after queued events", scheduler)?

creationix commented 12 years ago

Let's define a tick as the execution stack that originates with a single event source (real or synthetic). Using this definition, nextTick (in the proposed semantics) will simply execute in the next tick after the current tick and after any already queued nextTick callbacks, but before any real event ticks (I/O or timers). It's still a tick and the name isn't misleading. It won't execute till the current tick has unwound the stack completely. ASAP by definition would be to call it synchronously, which isn't very useful.

bjouhier commented 12 years ago

@coltrane

I agree with you that subsequent nextTick (or whatever new callback replaces it) should not be deferred. The whole point about this API change is to turn a non-deterministic API into a deterministic one. Making it half-deterministic will probably solve some problems but break later.

isaacs commented 12 years ago

@coltrane The only case in which it'd be deferred is when you have a nextTick callback that calls process.nextTick, and it recurses 10,000 times, or some other suitably high number. 100,000 is probably enough that you'll never hit it in normal use. You can be sure that whenever you call nextTick, it'll run after the current RTC, if you're not doing pathological things.

It's not a matter of being half-deterministic.

The reason why I'm opposed to adding new api for this is that this is already what people are using nextTick for in the wild, and in node core, and it mostly works. Adding new API for this means deprecating nextTick loudly so that people are aware of the hazard, and I believe that'll dio more harm than good.

ricardobeat commented 12 years ago

@isaacs but what @coltrane proposed above is exactly keeping nextTick, while fixing it's behaviour and implementing a new method. That seems like an overall better solution.

The alternative will cause nested nextTick calls to spin up CPU until they hit an arbitrary limit; this is clearly not the user-intended behaviour, so what's the advantage in keeping it?

isaacs commented 12 years ago

This has landed in master, and it seems to be awesome.