strongloop / zone

Flow control and error handling for Node.js
http://strongloop.com/zone

Align Semantics with Zone.js #9

Open btford opened 10 years ago

btford commented 10 years ago

This is a WIP; I'm still digesting the docs in this repo.

Background

I'm working on a distinct ES5 implementation of zones for browsers. It would be nice if the semantics of both libraries were similar. It also might be good to develop a spec so that someday JS knows how to do this without monkeypatching the universe.

This is written mostly for @piscisaureus to begin discussing the differences in our libraries.

Summary

Conceptually, there are two broad classes of uses for zones:

  1. You want to do something before/after all async tasks
  2. You want to do something once all async things are done for some context

Zone.js

Zone.js is concerned with exposing a minimal set of hooks and the ability to compose behaviors.

Zone.js has a number of optional "zones" built on top of it that provide some of the same functionality as zone:

Counting Zone:

zone.fork(Zone.countingZone).fork({
  onFlush: function () {
    console.log('no pending tasks!');
  }
});
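
To make the counting idea concrete, here is a minimal sketch of what such a zone has to track. It does not use Zone.js or zone; `createCounter`, `track`, and `pending` are hypothetical names invented for this illustration, and the sketch assumes each tracked callback is invoked exactly once.

```javascript
// Hypothetical helper, not part of Zone.js or zone.
function createCounter(onFlush) {
  let pending = 0;
  return {
    // Wrap a callback so that scheduling it counts as one pending task.
    track(callback) {
      pending += 1;
      return function tracked(...args) {
        try {
          return callback.apply(this, args);
        } finally {
          // The task is finished, whether it returned or threw.
          pending -= 1;
          if (pending === 0) onFlush();
        }
      };
    },
    pending() {
      return pending;
    },
  };
}

const log = [];
const counter = createCounter(() => log.push('no pending tasks!'));

// Two tasks are scheduled before either one runs.
const first = counter.track(() => log.push('first'));
const second = counter.track(() => log.push('second'));

first();
second();
console.log(log); // [ 'first', 'second', 'no pending tasks!' ]
```

In real use the tracked function would be handed to an async API, e.g. `setTimeout(counter.track(fn), 10)`; it is called directly here so the order of events is easy to follow.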

zone

zone seems concerned with solving control-flow problems, resource management, and cleanup.

Zones are like asynchronous functions. From the outside perspective, they can return a single value or "throw" a single error.

It doesn't seem like zone has hooks for before/after tasks or for overriding functionality.

Zones

piscisaureus commented 10 years ago

What does it mean to enter/leave a zone?

  • The zone global becomes a reference to that zone
  • Newly created asynchronous actions (setTimeout, fs.stat) and resources (net.Server) are added to the zone.
  • You can't enter a zone arbitrarily. Of course when a zone is constructed you enter the zone. After that, from a zone you can only enter a parent/ancestor zone. A child zone can only be entered through a gate after construction.
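
As a rough illustration of the first bullet, entering and leaving can be pictured as swapping a global reference and restoring it afterwards. This is only a sketch: `current` and `runIn` are hypothetical names, not the zone library's API, and the sketch does not enforce the restrictions on which zones may be entered.

```javascript
// Hypothetical stand-in for the `zone` global described above.
let current = { name: 'root', parent: null };

// Enter `target` for the duration of `fn`, then leave it again.
function runIn(target, fn) {
  const previous = current;
  current = target;
  try {
    return fn();
  } finally {
    current = previous; // leaving restores the zone that was active before
  }
}

const trace = [];
const child = { name: 'child', parent: current };
const grandchild = { name: 'grandchild', parent: child };

trace.push(current.name);
runIn(child, () => {
  trace.push(current.name);
  runIn(grandchild, () => trace.push(current.name));
  trace.push(current.name);
});
trace.push(current.name);

console.log(trace.join(' > ')); // root > child > grandchild > child > root
```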

Do zones "begin" or "end" ?

  • Yes
  • A zone begins immediately when you call `new Zone(function() { /* in the zone */ })`.
  • A zone ends when all the asynchronous operations in a zone have ended and all resources have been closed. It always ends asynchronously (e.g. after the event loop 'tick' in which it was constructed).

"shared resources" within zones.

That's not a question :). Shared resources are not super common but sometimes they are inevitable. Some examples:

Zone.js allows you to attach arbitrary metadata to a zone by augmenting its properties. zone has Zone.data.

Arbitrarily attaching metadata to an object that has class methods of its own is dangerous and makes it difficult to expand the API in the future. Node suffers from this problem. The "mongoose" library (ORM for MongoDB) has decided that any additional properties that are added to an EventEmitter are treated as fields that can be inserted in the database. Therefore we couldn't, for example, add EventEmitter.prototype.listenerCount(), so we had to expose it as EventEmitter.listenerCount(event_emitter).
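
The general hazard can be shown in a few lines. This is a simplified sketch with a stand-in `Emitter` class, not Node's EventEmitter and not the mongoose case itself: metadata placed directly on an object collides with a method added later under the same name, whereas metadata kept in a dedicated `data` bag does not.

```javascript
// Stand-in class for illustration only.
class Emitter {
  constructor() {
    this.data = {}; // dedicated bag for user metadata
  }
}

const direct = new Emitter();
direct.listenerCount = 42; // metadata attached straight onto the object

const bagged = new Emitter();
bagged.data.listenerCount = 42; // metadata kept in its own namespace

// A later library release adds a method with the same name.
Emitter.prototype.listenerCount = function listenerCount() {
  return 0;
};

// The own property shadows the new method, so it cannot be called.
console.log(typeof direct.listenerCount); // number
// With the data bag, the new method is reachable and the metadata survives.
console.log(typeof bagged.listenerCount); // function
console.log(bagged.data.listenerCount); // 42
```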

Creating nested/child zones

What does it mean to nest a zone?

  • A parent zone does not end before all its nested zones have ended
  • If an unhandled exception occurs within the nested zone, it is routed to the parent zone. If the parent doesn't handle it, it's routed to the parent's parent, etc.
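
The error routing in the second bullet can be sketched as a walk up the parent chain. `SketchZone` and `raise` are hypothetical names for this illustration; the real library's error handling is not shown in this thread, so treat this as one possible reading of the rule.

```javascript
// Hypothetical model of a zone tree, for illustration only.
class SketchZone {
  constructor(name, parent = null, onError = null) {
    this.name = name;
    this.parent = parent;
    this.onError = onError;
  }

  // Route an error upwards; returns the name of the zone that handled it.
  raise(error) {
    if (this.onError) {
      this.onError(error);
      return this.name;
    }
    if (this.parent) {
      return this.parent.raise(error); // not handled here, try the parent
    }
    throw error; // no zone in the chain handled it
  }
}

const handled = [];
const root = new SketchZone('root', null, (err) => handled.push(err.message));
const request = new SketchZone('request', root); // no handler
const query = new SketchZone('query', request); // no handler

const handledBy = query.raise(new Error('connection lost'));
console.log(handledBy, handled); // root [ 'connection lost' ]
```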

What should this be called? Zone.js calls this fork.

  • When a zone is created with new Zone it always becomes a child of the active zone.
  • I don't mind making zone.fork an alias for that.

sam-github commented 10 years ago

In the Node context, zone.fork is likely to draw an analogy to fork(3), which would be misleading. Node already does that in child_process, and I have met people who have wrongly convinced themselves that child_process.fork() and exec do the same thing as the system calls. If it's entrenched, so be it, but if not, I'd suggest using some other name. A name like nest, which invites no analogy, might be an improvement.

btford commented 10 years ago

@piscisaureus – thanks for the writeup! I'll incorporate this into my original post when I get a moment.

@sam-github thanks for the feedback. I'm not married to fork. I agree fork is overloaded. nest might be good. I'm open to other suggestions as well.

btford commented 10 years ago

I've updated my OP. I think the biggest thing to resolve is the difference between target use cases as described in the summary.

piscisaureus commented 10 years ago

zone seems concerned with solving control-flow problems, resource management, and cleanup.

That's correct

It doesn't seem like zone has hooks for before/after tasks or for overriding functionality.

That's correct too. I'm not sure it's super necessary, although I don't mind giving it a thought. However, I like to keep use cases in mind when designing a library like this. What is the purpose of overwriting e.g. setTimeout()?

Also node has a plethora of asynchronous functions which are not exposed on the global object, and people can implement their own in libraries. So tacking everything on the zone object seems a little impractical, e.g. where would `require('stream').Stream.prototype.write(buffer, cb)` go?

"After" tasks are available via setCallback/then/catch, but they run outside of the zone (in the parent zone). It is also possible for libraries to define cleanup handling (basically destructors) that run just before a zone exits, but that functionality is currently hidden. I prefer people not to f*ck with this feature and rely on default behaviour as much as possible - one of the reasons for writing the zone library is to make node more robust, e.g. you can throw without leaking and error handling is reasonable. When people write their own destructor and screw it up, that safety is gone.

btford commented 10 years ago

So tacking everything on the zone object seems a little impractical, e.g. where would `require('stream').Stream.prototype.write(buffer, cb)` go?

Yeah, agreed. I don't think being able to patch every function makes sense. But I don't want it to be too cumbersome to patch things like console.log (like in Dart's implementation of Zones). I'll think on it.

piscisaureus commented 10 years ago

@btford

I think what we're discovering is that we are not solving exactly the same problem. At least, I am discovering it - you potentially already knew :)

Indeed what we are trying to do is come up with a concept that manages async "things" and resources and handles errors in a reasonable way. To do that we (as in the zone implementers) are monkey-patching all node APIs so their callbacks run in the correct zone and general behavior in a zone context is reasonable.
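
A hedged sketch of what that kind of monkey-patching involves: the patched function captures the zone that is active when it is called, and re-enters that zone when the callback fires. Everything here (`activeZone`, `runInZone`, `patchLastCallback`, `fakeApi`) is hypothetical and only stands in for the real node APIs; it assumes the callback is always the last argument.

```javascript
// Hypothetical global naming the active zone; a plain string for brevity.
let activeZone = 'root';

function runInZone(zone, fn) {
  const previous = activeZone;
  activeZone = zone;
  try {
    return fn();
  } finally {
    activeZone = previous;
  }
}

// Patch obj[method], which is assumed to take its callback as last argument.
function patchLastCallback(obj, method) {
  const original = obj[method];
  obj[method] = function patched(...args) {
    const callback = args[args.length - 1];
    if (typeof callback === 'function') {
      const zone = activeZone; // captured when the async action is created
      args[args.length - 1] = function wrapped(...cbArgs) {
        return runInZone(zone, () => callback.apply(this, cbArgs));
      };
    }
    return original.apply(this, args);
  };
}

// Stand-in async API: stores callbacks and fires them later on demand.
const fakeApi = {
  queue: [],
  later(cb) {
    this.queue.push(cb);
  },
  flush() {
    while (this.queue.length) this.queue.shift()();
  },
};

patchLastCallback(fakeApi, 'later');

const seen = [];
runInZone('request-1', () => fakeApi.later(() => seen.push(activeZone)));
runInZone('request-2', () => fakeApi.later(() => seen.push(activeZone)));

fakeApi.flush(); // fires while the active zone is 'root'
console.log(seen); // [ 'request-1', 'request-2' ]
```

Each callback observes the zone in which it was scheduled, even though both fire later from the root zone. Doing this for every callback-taking API is the "monkey-patching all node APIs" cost described above.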

You are trying to create a lightweight execution context that makes it possible for the user to instrument/hook particular functions. In node people typically do this with contextify, which creates a new v8 context with a user-specified global object.

So for our use case it is important that zones have a natural end to them so they don't leak and, more importantly, resources attached to a zone don't leak.

What you need is a way to hook any function, which I can imagine is possible in the browser but really hard on the server.

What we both need is a way to maintain an asynchronous context.

So what we could do is limit the scope of this discussion for now, and only specify the things that are needed for both our use cases. Things that come to mind:

I can even imagine that I rename the core concept of Zone to something else (e.g. "Task" - what you have called "counting zone"), which is basically a superset of a Zone. A non-counting Zone would have valid use cases in a node context too.

kraman commented 10 years ago

Based on our discussion in person yesterday, I think we are pretty close to being compatible now. zone.js looks like it offers a subset of the functionality in the zone library. Looking at the original use cases:

  • You want to do something before/after all async tasks

I split this out into #16 so we can discuss it there in more detail

  • You want to do something once all async things are done for some context

This would be the zone.then(...) function. This also implies that zones have a fixed lifetime and are alive as long as there is something to do within them or they are expecting an event to happen (I/O, timeout, EventEmitter/listener).