nodejs / node

Node.js JavaScript runtime βœ¨πŸ’πŸš€βœ¨
https://nodejs.org
Other
107.09k stars 29.33k forks source link

What will Domain be replaced with? #10843

Open isaacs opened 7 years ago

isaacs commented 7 years ago

I'm adding support for parallel tests to node-tap. https://github.com/tapjs/node-tap/issues/306

One sticky point is that, like most test frameworks, node-tap needs to be able to cleverly handle thrown errors and tie them to the test environment where the error was thrown from (especially since many existing assertion libs turn things like expect(foo).to.match(/regexp/) into a thrown error).

When tests were not run in parallel, there was no problem. Just walk down the tree of test.currentChild or whatever, and a single process.on('uncaughtException') is sufficient.

I'm using Domains to manage thrown errors in child tests, and it works great, but the bright red deprecation notice in the docs is scary.

What will it be replaced with? Will it be feature-equivalent? If so, is there any reason why the existing API has to be broken, and not re-implemented in terms of the new thing? Is there an expected time frame for the Domain api to be broken?

Thanks.

arciisine commented 7 years ago

While not official, something like async-listener or zonejs should generally be able to fit the bill, with some caveats.

jasnell commented 7 years ago

At this point it's not clear at all what will replace it. There is this zones proposal being discussed at the TC-39 level but it's not something that is far enough along for us to be able to make any reasonable decisions on. At the same time, if there is going to be something happening at the TC-39 level, it doesn't make much sense for us to come up with a likely incompatible replacement. So at this point, the deprecation/replacement of domain is in a holding pattern and will be for the foreseeable future.

Any discussion on a potential replacement needs to happen in the node-eps repository.

addaleax commented 7 years ago

My understanding was that async_hooks, when it’s ready, would be suitable for having domain re-implemented on top of it, with a similar or identical API?

/cc @nodejs/diagnostics

jasnell commented 7 years ago

That's a possibility but still runs up against the possibility of being inconsistent/incompatible with whatever language level construct is settled on by TC-39. It won't do us much good to put the effort into a domains v2 if we end up having to deprecate it in favor of a language level thing, etc

sam-github commented 7 years ago

Is there an expected time frame for the Domain api to be broken?

@isaacs I was under the impression that it is broken already. Besides https://github.com/nodejs/node/blob/master/doc/topics/domain-postmortem.md, promise users report problems with domains, and users of async/await have found them broken, too, IIRC.

If you don't use any of the broken bits, they seem to work fine, but its a bit of a mine field out there.

Qard commented 7 years ago

It's really been "broken" to some degree pretty much since day one. It just gets more and more broken as new language/runtime features are introduced and the necessary parts to support domains don't stay in sync. It never quite satisfied the needs and really just exposed a huge hole where people wanted features. That's a big part of why I pushed to start the diagnostics working group in the first place--to try to better deliver what was needed. The new 'async_wrap' stuff is definitely a step in the right direction, but I personally feel there's still a long way to go. I hope the zones effort grows into something more solid. TC39 stuff takes time though. πŸ˜•

isaacs commented 7 years ago

@Qard @sam-github It's not "broken already", it is what it is. I am using "broken" in a very technical non-normative sense here, meaning "change the API in a way that requires changes to the code using it".

You might not like the Domains API that exists today, and I'd probably agree with your criticisms of it. But it does a specific job very reliably, which I can't do any other way.

My fear is that, after spending 2 years in a "deprecated" state, someone will decide it's time to rip it out without giving me time to upgrade (or worse, without there even being anything to upgrade to). Could the documentation at least provide some clarity as to the requirements that have to be met before Domains will be removed? Or better yet, could we just deprecate the parts that are known to be harmful? (d.enter(), for example, should probably never be used in userland code, and it was a mistake to expose it.)

jasnell commented 7 years ago

There are no plans, nor has anyone suggested, nor would I expect anyone on @nodejs/ctc to suggest or even reasonably consider removing domain without a clear replacement in place. It is likely to be left alone as is without any modifications for the foreseeable future.

Fishrock123 commented 7 years ago

I'll try to read though this tomorrow, but the idea was to get async-hooks in in a state that would allow a user module to correctly replicate domains (and other variations of the async-error-handling idea), and then move towards removing it.

The resulting module would probably be more maintainable (perhaps even actually maintained!), and core's APIs for it a lot more platform-like and unoppinionated.

There are no plans, nor has anyone suggested, nor would I expect anyone on @nodejs/ctc to suggest or even reasonably consider removing domain without a clear replacement in place.

I would like to be clear that "the replacement" may not have the same level of API or do the same thing, but it will certainly be pushed towards allowing replication of the domain API as much as/if possible.

Qard commented 7 years ago

@isaacs I hear you on that, and I don't expect it'll just disappear immediately when something new comes along. It's possible the existing API might even just get ported to async-hooks in the future. It's really just the internal behaviour that's problematic: lack of support for promises and asyn/await, inconsistent timings between node versions, etc. My feeling is that the current deprecation is really more about it being potentially unreliable than there being any specific intention to fully replace it currently.

ORESoftware commented 7 years ago

I am in the same exact boat as @isaacs; every test runner in the Node.js ecosystem that is trying to parallelize tests will have the same exact problem - the problem being: the only reliable way to trap errors thrown in an arbitrary chunk of code* is domains.

As for native promises being broken with domains, you can try this:

const then = Promise.prototype.then;
Promise.prototype.then = function (fn1, fn2) {

  if (process.domain) {
    fn1 = fn1 && process.domain.bind(fn1);
    fn2 = fn2 && process.domain.bind(fn2);
  }
  return then.call(this, fn1, fn2);
};

Not super well-tested, just throwing that code out there.

Bluebird promises have had support for domains for more than 2 years TMK, so that's been a good thing for domain users for awhile.

*Where that code is user-provided or otherwise out of your control.

syrnick commented 7 years ago

Note on promises. Native v8 promises don't respect domains - the callbacks aren't called in the same domain. Polyfills work fine, because they are just library level constructs. Hence, domains mostly work with transpiled async/await and with most libraries that use promises if the global Promise is the polyfill, not the native Promise.

It's obviously a confusing state and using domains has all sorts of caveats and gotchas. Still, immensely useful creature.

misterdjules commented 7 years ago

@syrnick

Native v8 promises don't respect domains

Does https://github.com/nodejs/node/pull/12489 solve this problem for your use case? (note: it's only available in node 8.x).

syrnick commented 7 years ago

nice! I'll test that out on 8.x.

Trott commented 7 years ago

domain does have that funny "pending deprecation" notice. Now that async-hooks has arrived, are we anywhere close to actually giving domain a doc-only deprecation that doesn't have the "pending deprecation" ambiguity/confusion? Or not yet? I have no interest in pushing for a runtime deprecation or anything like that, but I would like the docs to be clear about the status.

BridgeAR commented 7 years ago

I think that is a good idea @Trott

owenallenaz commented 6 years ago

As I've been utilizing async and await, it seems apparent to me that with a proper execution of Promise chains, the domain module no longer seems necessary. Steps across the event loop can be maintained via the Promise chain, and thus errors can properly be try/catch no longer how deep and handled accordingly. Given correct implementation by Module implementors, which will take some time, the domain module no longer seems necessary. So far from my experimentation, I'm not aware of an edge case that I have used domains for which Promise chains do not handle and async and await provide the needed syntactic sugar to make it reasonable to utilize.

So then with that in mind, is the replacement, simply start using Promises.

misterdjules commented 6 years ago

@owenallenaz Just to make sure I understand what you suggest: do you mean that errors thrown from an async task not originally scheduled by a promise should be "wrapped" by a promise, such as for instance using the following to handle errors thrown from setTimeout callbacks?

const somePromise = new Promise((resolve, reject) => {
  setTimeout(function () {
    try {
      throw new Error('boom!');
    } catch (e) {
      reject(e)
    }
  })
})

async function mainAsync() {
  try {
    await somePromise;
  } catch (err) {
    console.error(err);
  }
}

mainAsync();
owenallenaz commented 6 years ago

With callback semantics we're currently forced to deal with errors in two ways. The first is via the callback and the second is thrown errors caught by domain. I spun up a simple site in a gist to showcase the problem.

https://gist.github.com/owenallenaz/545bacc327ff1969c56d69e1afb4cfa7

Visit / to see a list of the urls.

In CPS style if the error is cb'd we're fine. If there is a thrown error, nested within the CPS chain there is simply no way to handle that error other than a domain. With Promise and async/await, we can have one form of error handling, try/catch, and it is able to handle the error no matter how it came to be, nor how deeply nested. Looking at the various promise urls helps to showcase that capability. This capability requires the whole nested chain to be properly executed. But if module developers ensure that their calls are properly handling promise chains, then they should bubble up until someone applys a then in traditional syntax or a try/catch via await syntax.

Trott commented 5 years ago

My impression is that the answer to the "when will domains be removed?" question is "quite possibly never." And the answer to the "what will domains be replaced with?" question is "kinda sorta async_hooks but mostly kinda sorta." While perhaps not the most satisfying answers, I wonder if that's basically what we have, and if so, perhaps this can/should be closed?

misterdjules commented 5 years ago

@Trott I feel the current state of domain does not send a clear message to users though: on one hand, domain is deprecated, and the documentation hints at "a replacement API", but it doesn't seem like any replacement has been considered.

Some users have been very vocal about the confusion that this message generates, and it seems that we could at least provide user-land alternatives that would allow for a reasonable migration path for users while allowing core to fully deprecate domain.

I've submitted a PR that enables user-land implementations of domain at https://github.com/nodejs/node/pull/24279 (this is a work in progress, but I'd like to gather initial feedback on whether core would be open to merge something similar), and worked on a proof of concept of a user-land domain implementation that relies on those changes at https://github.com/misterdjules/domaine.

Any thoughts on that?

devsnek commented 5 years ago

lots of replacements have been considered, just nothing has been very compelling (imo because use case itself is uncompelling). I think the best direction would be to suggest that users migrate away from the domain model, and properly handle errors.

llm-labs-co commented 5 years ago

@devsnek how do you provide context for logs of your applications/servers?

devsnek commented 5 years ago

@AiDirex you could use async_hooks for that. (check out an example here). the thing that has been the problem isn't the scoped context, it's been the error handling.

llm-labs-co commented 5 years ago

@devsnek Thank you, but I know that modules like cls-hooked exist that use async-hooks to solve the issue of context. The problem is that some modules aren't compatible with async-hooks yet and their maintainers won't fix the compatibility issues because async-hooks module is still in an experimental state.

devsnek commented 5 years ago

@AiDirex out of curiosity, can you point me to a module that is incompatible with async_hooks? every async primitive in node (via js or c++) should be tracking async_hook context.

llm-labs-co commented 5 years ago

@devsnek one is bluebird which lots of other modules depend on it.

isaacs commented 4 years ago

Update on this: I've been using async-hook-domain for the last several versions of tap, and it mostly works.

However, in order to handle errors, it has to do some unsavory things, like patch process._fatalException. And due to https://github.com/nodejs/node/issues/26794, it fails to get the correct continuation flow when multiple promises are in use.

Domains were deprecated in node v4, after all. As of v13, still no answer as to what to use instead.

devsnek commented 4 years ago

26794 is unfortunate, but when it is fixed it should be possible to track promise rejection that originates from the context of your choosing. As to a replacement for domains, I don't think there will be one magic module to replace everything domain did. In particular, domain tied async context and error handling together, and it became increasingly contrived and complex trying to follow how js worked. Async context tracking is fairly inherent to node core, and we have async_hooks for that. Error handling though, is a very nuanced problem. People don't even agree on what constitutes exceptional behaviour (for example, the argument on unhandled rejection tracking), so leaving that to userland seems appropriate to me. Your async-hook-domain module feels to me like a good sign that we're on the right path. There are definitely kinks to be worked out, but you're tracking errors across async contexts, on your own terms.

jasnell commented 3 years ago

Going to transition this thread into a discussion at this point. Scratch that, looks like the option to convert an issue into a discussion has been temporarily disabled.

ORESoftware commented 11 months ago

there are things that @isaacs and I disagree on, but I agree 100% with his comments on this thread. Domains are a political/religious issue because most people don't seem to really understand how they work. I dont see why maintaining them in core would be any harder than maintaining the equivalent functionality with async-hooks - however there is no equivalent functionality with async-hooks? Some might be ok with having no such equivalent functionality - but many of us are not.

isaacs commented 11 months ago

@ORESoftware i did end up scratching my own itch with async-hook-domain https://www.npmjs.com/package/async-hook-domain

If you need the error handling behavior of domains, then it's worth taking a look at.

ORESoftware commented 11 months ago

@isaacs thanks - I learned a lot about domains by implementing npm package "domain-haven" -> before version 14, a lot of times, I would be touching these a lot:

process.on('unhandledRejection', (e, p) => {
   // reference p.domain if it exists
});

process.on('uncaughtException', (e) => {
  // reference process.domain if it exists
});

questions about your library belong there but someone will probably use this thread someday - and I could probably poke through your codebase to find out - but does your library need to reference the above global handlers? It certainly would be nice if things were implemented in core (etc etc) so that whatever is "acting like domain" is consistent enough to not need to deal with global things like those handlers.

isaacs commented 11 months ago

@ORESoftware It uses the newer process.setUncaughtExceptionCaptureCallback method to handle errors and the process.on('uncaughtExceptionMonitor') event to know when an error has occurred and look up the appropriate handler from the stack of Domains for the given async context. There is necessarily some "global" hooking that needs to be done, because errors that are not handled go unhandled to the global context, that's what "unhandled" means. But at least, the more recent interfaces are cleaner and less prone to interference from code that is doing other things.