promises and post-mortem debugging

misterdjules commented 8 years ago

A PR that aims at adding a promises-based API to most node core APIs was recently submitted to nodejs/node.

I have expressed my concerns regarding the impact of such a change on post-mortem debugging, and a separate issue was created by @Raynos to further discuss these concerns.

My concerns are more questions that I asked because I felt that the way promises handle errors is incompatible, at least in some ways, with post-mortem debugging. So far my investigations only confirmed these concerns, but I'm not done with them. One part of these investigations is also to find a solution to the problems that I found so far.

The discussion in nodejs/node has been productive so far, but I represent only one post-mortem debugging user/maintainer. I feel it would help if some people from this working group would share their opinion on these concerns with the rest of the group here in this issue, if only to confirm or infirm that they have the same questions/reserves as I do about the impact of https://github.com/nodejs/node/pull/5020 on post-mortem debugging. For instance, if some of you already worked on that problem and did a similar investigation, I would be interested if you could share your results.

In any case, I will share my findings with the group as soon as I have some more time to continue my investigations.

Thank you!

davepacheco commented 8 years ago

For what it's worth, I have all the same concerns and reservations, but nothing additional to add to the discussion that you didn't already mention. Thanks for weighing in on the PR.

mikeal commented 8 years ago

To what extent are these incompatibilities an implementation detail of V8's Promises vs. an issue with any Promise implementation based on the spec in ES2015?

I ask because we have to approach fixing them very different depending.

misterdjules commented 8 years ago

@mikeal That's a great question. I would think that we would try to come up with a solution similar to what has been done for --abort-on-uncaught-exception. That is: it would be behind a flag, and should be considered as a debugging tool specific to the Node.js runtime/platform.

I don't think aborting on an uncaught exception when passing --abort-on-uncaught-exception on the command line conforms to the ES 2015 spec, in the sense that I'm not sure if it's in the scope of the spec. If that's a valid interpretation of the spec, I would think the same reasoning could apply to a similar solution to this specific problem.

misterdjules commented 8 years ago

Also, see https://github.com/nodejs/node/issues/5084#issuecomment-180712712 for a quick experiment that I've done.

mikeal commented 8 years ago

Maybe we should ask @domenic about this.

domenic commented 8 years ago

What is the exact question?

jclulow commented 8 years ago

I suspect it mostly boils down to this: Given the way that promises are currently specified, is it possible for the interpreter to know a priori whether or not a thrown exception will be handled as a rejection later?

If it is possible to know, we can abort the process at the correct moment because we know the thrown exception will not be handled as a rejection at a later time. This preserves the state of the process at the moment the fault occurs; stack arguments will be available, and the heap will have been preserved in its entirety by the operating system.

If it is not possible to know for all cases, then the interpreter must necessarily unwind the stack and continue executing until a subsequent determination can be made as to whether the rejection is unhandled. If this is true, then promises are anathema to post mortem debugging -- the model prescribes the destruction of the very state we are trying so hard to preserve for later study.

domenic commented 8 years ago

I suspect it mostly boils down to this: Given the way that promises are currently specified, is it possible for the interpreter to know a priori whether or not a thrown exception will be handled as a rejection later?

No. Just like there's no way for the "interpreter" to know whether or not a thrown exception will be catch-ed later.

The entire point of promises is to be able to handle errors, similar to try/catch. If I understand you correctly, post-mortem debugging is anathema to the JavaScript language and its try/catch feature (as well as the promises feature that depends builds on that). A better approach would be to run a source transform to replace throw (as well as promise reject calls) with process.abort(), or use a language which does not have catchable exceptions.

jclulow commented 8 years ago

No. Just like there's no way for the "interpreter" to know whether or not a thrown exception will be catch-ed later.

This is manifestly false. Whether a throw will be caught by a try/catch or not depends entirely on whether the interpreter is handling code within a (potentially nested) try block or not. The call frame itself can make this coarse grained information available, and the interpreter can walk stack frames without dismantling them.

If nothing else, there is proof by construction: the Node --abort-on-uncaught-exception flag does this today.

The entire point of promises is to be able to handle errors, similar to try/catch. If I understand you correctly, post-mortem debugging is anathema to the JavaScript language and its try/catch feature (as well as the promises feature that depends builds on that).

It's certainly possible to make software less debuggable with blanket usage of try and catch, but you at least have to opt in to doing so. One of the more unfortunate aspects of Javascript is that catch is a binary construct: either you catch everything, or you catch nothing. On this basis, we tend to recommend that you avoid throwing or catching at all. Thrown exceptions should be left to serious programmer errors, where the only recourse is to abort an apparently logically inconsistent program.

Promises, like catch, do not appear to afford the programmer the ability to catch only a subset of expected exceptions. They also apparently lack any way to turn off the indiscriminate swallowing of all classes of exceptions. That is what makes them undesirable, at least to those of us working on systems software.

A better approach would be to run a source transform to replace throw (as well as promise reject calls) with process.abort(), or use a language which does not have catchable exceptions.

Even if reaching into arbitrary software and replacing random statements was a good idea, it would not work for a ReferenceError or a TypeError, or any other fault that arises implicitly due to an errant program. It would be similarly unhelpful with exceptions thrown from functions with a native implementation; e.g., JSON.parse().

This is not about aborting the process every time something is thrown, but rather when something is thrown but not handled.

domenic commented 8 years ago

Given the tone of this conversation, I do not feel comfortable continuing to participate in it, and am unsubscribing from this thread. I would ask that do you do not mention me again in it.

chrisdickinson commented 8 years ago

@jclulow: To answer your original question — "Given the way that promises are currently specified, is it possible for the interpreter to know a priori whether or not a thrown exception will be handled as a rejection later?" The answer is no — or at least, not in a way that is compatible with the behavior you're looking for. Rejections are not treated like panics because it is possible that the rejection will become handled later in execution, possibly on a subsequent tick. Note, this is not the same thing as "on error resume" behavior, like domains enable — a rejection cannot "jump out of" the dependency graph of async ops like domains can.

This is not to say promises are anathema to post mortem debugging: as Domenic suggested, relying on throws to panic means that authors need to avoid try and catch where possible. While this group recommends that approach in order to make existing post-mortem tools work optimally, others find value in using try/catch (and promises) for exceptional control flow; and the introduction of constructs like process.on('uncaughtException'), and domains themselves show that the "on error panic" approach is not universally acceptable for Node users — some users and applications will want to make different tradeoffs than systems programmers.

Given that, even absent promises, our existing post-mortem tools rely on specific approaches to writing JS, and given that it's unlikely that generalizing a single approach to writing JS will work for all Node users, it seems to me that finding a solution that addresses the majority of both audience's concerns would be a good outcome to work towards.

For example: the post-mortem working group currently requires panic behavior to happen at the apex of the stack, which means that any approach to writing JS that involves catch clauses reduces the efficacy of the post mortem tools we have available. This happens even if the error is rethrown from within the handler immediately. @misterdjules has noted that the reason for this invariant is the need to reduce the number of confounding factors when examining the state of the program at crash — to be able to rely on the fact that the content of the heap and stack sections did not change between entering the error state and the process exiting.

However, it seems like the most valuable ephemeral information is the stack information — the functions, receivers, and arguments that get destroyed by unwinding the stack. This is also the most difficult requirement imposed by our post-mortem tools — it means that Node users that choose to use catch or promises are opting out of post-mortem techniques, and also requires that constructs like domains are called in a specific manner that can be awkward to ensure.

The Error.captureStackTrace API is backed by a stack frame iterator that has access to all of the information discarded by stack unwinding, and it captures data at precisely the point we need it to. Right now it doesn't record all of the information we're interested in — namely, parameter values. I understand that it doesn't enforce the ideal assertion — that no operations happened between unexpected state and abort — but it brings us from "no information" on catch and rethrow (or promises) to "useful partial information," and allows the post-mortem WG to relax the requirement that panic must happen at top of stack somewhat.

With regards to promises, because the unhandledRejection hook is exposed at application level, users who wish to treat unhandled rejections as panics are free to do so, and may abort providing a useful error object on-stack for post-mortem debugging.

davepacheco commented 8 years ago

This is not to say promises are anathema to post mortem debugging: as Domenic suggested, relying on throws to panic means that authors need to avoid try and catch where possible.

This point has been repeated a few times in a few threads, so it's worth clarifying. It seems to imply that developers that care about postmortem debugging have to take special care in their code (and, by logical extension, the modules that they use). But the vast majority of operational errors in Node and Node modules are already asynchronous and so are not thrown and not caught with try/catch anyway. As a result, most of the time, things work regardless of whether a module author considered postmortem support in the way they designed an interface or whether a program author considered that when they added a new module dependency.

Unlike try/catch, promises are used for handling asynchronous operational errors. That's why a promises-based core API is significantly more problematic for postmortem debugging than try/catch is.

This is also the most difficult requirement imposed by our post-mortem tools ... For example: the post-mortem working group currently requires panic behavior to happen at the apex of the stack, which means that any approach to writing JS that involves catch clauses reduces the efficacy of the post mortem tools we have available.

I think it's fairer to say that these requirements derive from the unfortunate behavior of software that sometimes fails in ways that requires that data to understand it. For the reasons I mentioned above, I think in practice the way "catch" is used is not a major problem.

chrisdickinson commented 8 years ago

@davepacheco:

This point has been repeated a few times in a few threads, so it's worth clarifying. It seems to imply that developers that care about postmortem debugging have to take special care in their code (and, by logical extension, the modules that they use). But the vast majority of operational errors in Node and Node modules are already asynchronous and so are not thrown and not caught with try/catch anyway. As a result, most of the time, things work regardless of whether a module author considered postmortem support in the way they designed an interface or whether a program author considered that when they added a new module dependency.

Thanks for your response. This is a fair point: at the moment most errors generated by the core API are propagated through means other than throwing. However, I don't believe this precludes users from having to take special care in order to make post-mortem debugging viable for their programs.

When I say that special care has to be taken, I am referring to the recommendations on the Joyent error document as well as the recent recommendations from the Node error symposium, which stipulate that users should avoid using catch wherever possible in order to avoid losing stack information. In the case of promises, their existing use in the ecosystem would seem to preclude the application of post-mortem tools to those packages (& users of those packages.) Whether Core provides a promise-based API or not, this is a problem with the current state of post-mortem WG, if indeed we wish all Node users to be post-mortem WG users, as suggested elsewhere.

While I understand that core providing a promise-based API highlights this problem, I don't think it introduces it, and indeed it gives us an opportunity to find workable solutions for users that prefer to use catch + rethrow as well as users that prefer promises. Note that this isn't a suggestion to change the practices involved with callbacks — they remain as they are, and the existing tools and techniques used to debug them still work. This is a purely additive suggestion — that we start to look into instrumenting error objects more at stack trace capture time — that makes all programs, callback-based or not, produce more detailed information in core dumps. In other words, this is an opportunity to make the post-mortem debugging story a little better for all Node users, not just the ones that are able (or willing) to make the tradeoffs inherent in the recommendations we've given them thus far.

stevemao commented 8 years ago

Related issue about error handling https://github.com/mcollina/split2/pull/8

vkurchatkin commented 8 years ago

I suspect it mostly boils down to this: Given the way that promises are currently specified, is it possible for the interpreter to know a priori whether or not a thrown exception will be handled as a rejection later?

The answer is yes, it's possible and v8 is capable of doing it, at least in debug mode.

chrisdickinson commented 8 years ago

@vkurchatkin:

The answer is yes, it's possible and v8 is capable of doing it, at least in debug mode.

Hm, I don't think it's possible to know at error-generation time — with the stack that led to error generation — that the promise will be unhandled.

For example:

const A = createPromiseSomehow()
const B = A.then(handlerB)
const C = B.then(handlerC)

If handlerB throws an exception, it will call PromiseReject on the promise. At this point, PromiseDone is called, and since there are pending tasks (B.then(handlerC)), PromiseEnqueue is called. Once the microtask fires, it will run PromiseIdRejectHandler which re-throws the error. This causes C to trigger an unhandledRejection event, however the stack leading to the original error has already been unwound. That said, the information that stack contained could be made available on the err object by decorating that object with more (private-to-VM) information about the stack frames it must iterate to generate the err.stack property.

vkurchatkin commented 8 years ago

@chrisdickinson

If B rejects its possible to traverse it's children and find out that there is no catch in the chain, so we can immediately abort

chrisdickinson commented 8 years ago

@vkurchatkin: An excellent point — right now all handlers A, B, C have a catch, it's just the PromiseIdRejectHandler (Edit: but in a patch, we could check for that.) The only other case it doesn't handle is an "eventual" handle of C, where elsewhere code calls C.catch after the initial rejection event, though I think that's probably not a deal breaker since application-level users can decide to disable that behavior explicitly. Perhaps we should talk to someone from the V8 team that focuses on promises about the feasibility of a patch?

vkurchatkin commented 8 years ago

The only other case it doesn't handle is an "eventual" handle of C, where elsewhere code calls C.catch after the initial rejection event,

Well that's obviously not possible if you choose to abort on unhandled promise

An excellent point — right now all handlers A, B, C have a catch, it's just the PromiseIdRejectHandler (Edit: but in a patch, we could check for that)

This is actually already implemented in PromiseHasUserDefinedRejectHandlerRecursive function: https://github.com/v8/v8/blob/master/src/js/promise.js#L406

chrisdickinson commented 8 years ago

@vkurchatkin:

Well that's obviously not possible if you choose to abort on unhandled promise

Yep! I agree we can safely ignore that case.

This is actually already implemented in PromiseHasUserDefinedRejectHandlerRecursive function: https://github.com/v8/v8/blob/master/src/js/promise.js#L406

Oh wow, I totally missed this. That's definitely useful. A patch could make it so that the perf hit for checking is only imposed on programs that are run with --abort-on-unhandled-rejection. That's exciting!

groundwater commented 8 years ago

Just want to add that regardless of whether V8 can maintain the stack on an unhandled Promise rejection, promise rejection is meant as a control flow. Many promise APIs reject promises for operational errors, meaning errors must be inspected, and unknown errors rethrown. By the time we know if our error is unexpected, the stack is unwound.

If promises are rejecting for well-defined reasons, such as a resource not existing or a timeout occurring we want to separate those cases from ReferenceErrors where the program is just broken.

jclulow commented 8 years ago

If promises are rejecting for well-defined reasons, such as a resource not existing or a timeout occurring we want to separate those cases from ReferenceErrors where the program is just broken.

Right. This is important, and I think speaks to a critical flaw in the model: implicitly thrown exceptions (ReferenceError, etc) are shoe-horned into the same error-handling channel as any other source of rejection.

Promises would ideally not have a set of edge cases or weird behaviour when somebody enables whatever becomes the promise equivalent of --abort-on-uncaught-exception. One of the great parts about --abort-on-uncaught-exception is that it only aborts when Node would have otherwise exited in error. It doesn't exit because a callback was called with an Error as the first argument, rather only when an exception is thrown but not caught.

chrisdickinson commented 8 years ago

@groundwater A good point, and many promise-based programs will be doing this sort of "does this error apply to me?" check-and-rethrow. There are a couple of avenues of investigation, here:

Introduce an --abort-on-unhandled-rejection flag to V8, one that preserves stack when no user-defined handlers have been added.
Look into recording frame parameter info onto error objects in a private-to-VM fashion. At the very least this offsets the costs of receiving a core with an unwound stack but with an error object present on-stack — a case that, at least anecdotally, I've run into in the past even in callback heavy codebases.
On the language-level track, look into what it would take to add checked .catch handlers to promises, as an analogue (or as a part of) a proposal to add checked catch to the language to address this concern in the long run.

chrisdickinson commented 8 years ago

@jclulow:

This is a lever with which to turn the language. We can't change promises' existing behavior with regard to catching ReferenceErrors, since userland code relies on it. We can't prevent Node users from choosing to use them — they are a lot of good reasons to use them. Leaving them out of the Node core API won't fix the problem — some subset of users already want to use promises, leaving promises out won't change that.

However, we can press for better support from the language for being able to tell if a thrown error will simply be rethrown — both for catch clauses as well as for Promise .catch() methods. This should make the cores they produce suitable for use with the existing post-mortem tools. Leaving objections with the model aside for the moment, we can make it so that another package author's choice to use promises somewhere in your dependency chain doesn't unduly affect your ability to debug your application.

While we do that, we can improve the story for short stacks by recording more information on the Error object, making the existing cases where cores are produced with partially unwound stacks more workable — if not ideal — and making existing promise use less of an issue.

From what I've heard it seems like this should address most of the concerns voiced by the post-mortem WG. I am curious if you would find this acceptable — it is a compromise, but it is a compromise that expands the post-mortem story to include promises which means a larger user base is covered, even if not ideally.

benjamingr commented 8 years ago

@jclulow

Right. This is important, and I think speaks to a critical flaw in the model: implicitly thrown exceptions (ReferenceError, etc) are shoe-horned into the same error-handling channel as any other source of rejection.

Right, but those are still reported - this is exactly why we did https://github.com/nodejs/node/issues/256 you simply don't catch ReferenceErrors in the handler and then report them in the global hook.

I don't understand something, the use cases post-mortem debugging applies to like: "out of memory", "C++ module did something bad" or "I'm out of DB handles because something leaks" are all things you can inspect synchronously - they get thrown outside the chain and you can keep using the same techniques to debug them.

The other stuff - promises make things a lot easier to debug. You have throw safety, cascading error management like in synchronous code and when async/await lands it'll look synchronous too. You can have a good stack trace where the promise is thrown (userland libraries are better at this ATM though) and you can debug your code much like you would debug synchronous code.

Basically, every asynchronous case would have this problem with callbacks anyway and every synchronous case would not be affected.

davepacheco commented 8 years ago

@chrisdickinson

Thanks for your response. This is a fair point: at the moment most errors generated by the core API are propagated through means other than throwing. However, I don't believe this precludes users from having to take special care in order to make post-mortem debugging viable for their programs.

When I say that special care has to be taken, I am referring to the recommendations on the Joyent error document as well as the recent recommendations from the Node error symposium, which stipulate that users should avoid using catch wherever possible in order to avoid losing stack information.

It feels like my point has been acknowledged, but ignored. That document you linked to says:

In fact, the only commonly-used case where you'd use try/catch is JSON.parse and other user-input validation functions. ... The general rule is that a function may deliver operational errors synchronously (e.g., by throwing) or asynchronously (by passing them to a callback or emitting error on an EventEmitter), but it should not do both. This way, a user can handle errors by either handling them in the callback or using try/catch, but they never need to do both. ... In general, using throw and expecting a caller to use try/catch is pretty rare, since it's not common in Node.js for synchronous functions to have operational errors. (The main exception are user input validation functions like JSON.parse.)

I don't see anything that says that one should never use try/catch on the grounds that it interferes with postmortem debugging, because indeed: using try/catch to handle operational errors has no impact on postmortem debugging. Many of us have argued against using try/catch for handling programmer errors. But you don't usually have to worry about how modules handle programmer errors because modules generally don't attempt to deal with programmer errors at all, let alone those that arise from outside themselves. The results of the recent symposium do suggest avoiding try/catch -- but again, it's clearly talking about operational errors.

In the case of promises, their existing use in the ecosystem would seem to preclude the application of post-mortem tools to those packages (& users of those packages.) Whether Core provides a promise-based API or not, this is a problem with the current state of post-mortem WG, if indeed we wish all Node users to be post-mortem WG users, as suggested elsewhere.

The working group surely cannot prevent people from making design choices in their own modules that interfere with their ability to debug their own software postmortem (nor is that part of their charge, except perhaps to educate people on the issue). There's a big difference between that and landing changes to Node core APIs which, if used (and quite possibly used without understanding the impact), interfere with postmortem debugging.

Here and elsewhere, a false dichotomy is presented between accepting the PR and opposing a complete solution for postmortem debugging with promises. I don't believe anybody opposes such a solution, but many of us believe that before a new public API is created in platform, particularly one for which API stability is a stated goal, the implications of the API should be well-understood, and the known problems with it should be worked out. Landing an API is not the way to solve problems with it.

davepacheco commented 8 years ago

@benjamingr

I don't understand something, the use cases post-mortem debugging applies to like: "out of memory", "C++ module did something bad" or "I'm out of DB handles because something leaks" are all things you can inspect synchronously - they get thrown outside the chain and you can keep using the same techniques to debug them. ... Basically, every asynchronous case would have this problem with callbacks anyway and every synchronous case would not be affected.

I'm not sure what you mean by by this -- what's "outside the chain" and what are "the same techniques"?
The use-cases are much broader than the ones you listed. A prototypical use-case for postmortem debugging would be: during an http 'request' handler that calls through several different modules while handling the request, the program ends up calling a JavaScript function that attempts to access property "foo" of a value that's undefined. The program crashes and dumps core. The core file allows the developer to see all of the information about the http request (method, url, headers, bytes read), all of the handlers on the stack, the arguments to each of these handlers, the "this" argument to each function on the stack, and so on. The value is that sometimes, the stack trace just isn't enough, and this other information helps nail the problem. Besides that, all this works even if the problem is not reproducible even once, and it does not require you to stop the program in a debugger (so you can use this in production).

chrisdickinson commented 8 years ago

@davepacheco:

Thanks for your response. I'd like to highlight a couple points:

But you don't usually have to worry about how modules handle programmer errors because modules generally don't attempt to deal with programmer errors at all, let alone those that arise from outside themselves.

The working group surely cannot prevent people from making design choices in their own modules that interfere with their ability to debug their own software postmortem (nor is that part of their charge, except perhaps to educate people on the issue).

I think the above points represent where we start to diverge. Specifically, I believe that the primary consumers of PMWG tooling are a subset of application developers — folks working directly on or in service of applications. Not all application developers consume PMWG tooling — indeed, though I have used the PMWG tooling in the past, in my current position I do not. Using myself as an example, as a member of the set of application developers, this would imply that the set of all application devs is not the set of all PMWG tooling consumers. PMWG tooling consumers use packages from the larger ecosystem — that is, PMWG users use packages authored by non-PMWG-using authors.

Some subset of non-PMWG-using package authors prefer to use promises. Those promises may cross package boundaries — that is, they may be returned to calling packages. This is an existing state in the ecosystem — see knex or sequelize for examples of this. By using and returning promises, these packages do deal with programmer errors. PMWG consumers may indirectly consume these packages as part of their application's dependency chain. It is likely that more package authors will start using promises as async/await lands sometime this year. Thus, the promises in the ecosystem already pose a problem for PMWG tooling consumers — this is a problem that exists now, and is likely to get worse over time if not addressed. As you said, I do not think that this group can prevent this problem via recommendation.

In that respect, I believe that introducing a promise API to core is a separate concern from the existing problem, and is unlikely to affect the presence or absence of the problem. I realize that not everyone agrees with me on this point — but if we agree on the prior chain of reasoning, that seems to imply that the post mortem WG will need to address promise use in tooling sooner rather than later. The timeline on the promise API is long, with go/no-go checkpoints along the way — nearly a year before it is officially supported and unflagged — which means that a solution can be developed in the interim. We can use the information we gather while this API is flagged as empirical evidence of problems with promises in practice, to be given to TC39, to help move that spec in a direction that is more amenable to PMWG.

Again, the API proposed does not expose the promise API in a cavalier fashion — it is a measured, long-term feature, with two major versions before official support + a flag for initial use, which gives us two major versions worth of information and avoids landing it in LTS until we know that the solution is tenable. We can and should solve the interoperation with promises and PMWG tooling, but as that problem already exists, I don't see why we can't do this in parallel, while using information from the flagged experiment to inform our research (as well as the standards bodies that can affect lasting change.)

davepacheco commented 8 years ago

@chrisdickinson Indeed, not all Node users use postmortem debugging tooling. We agree on that. I think there's agreement as well that some form of postmortem debugging support is something the Node community wants to keep.

Thus, the promises in the ecosystem already pose a problem for PMWG tooling consumers — this is a problem that exists now, and is likely to get worse over time if not addressed.... In that respect, I believe that introducing a promise API to core is a separate concern from the existing problem, and is unlikely to affect the presence or absence of the problem.

I believe several of us have repeatedly reported that we have not experienced this issue at all. That's not to say that it doesn't exist, but it appears to not yet be a widespread issue. (I'd welcome data points from other members of the working group on this.) Do you believe that the presence of a promise-based core API in Node will not contribute to the more widespread of use promises, and particularly in situations where module authors may have used the API without even knowing that it would impact other users' ability to use postmortem debugging?

Raynos commented 8 years ago

@chrisdickson

Promises in the ecosystem create zero problems for enterprises.

If you have more then 200 engineers doing nodejs at a large company then you simply have an npm module whitelist and you never whitelist promises.

Once promises are in core we need to introduce excessive tooling to once again ban them and explain why this feature of node core should not be used. This will be the first feature in node core that cannot be used.

Let me repeat that, you are introducing the first feature that can not be used (ignore domains, that is officially deprecated by core and community)

Not only that but core and npm are supposed to be different. Core has always had a history of being lean (a small core), performant (everything has to be fast) and debuggable (amazing post mortem tooling out of the box)

It's almost like nodejs was run by a kernel company.

I would really really regret it if we added things to core that do not align, promises are not lean (two choices), not performant and not debuggable. I mean it's literally be definition not nodejs.

chrisdickinson commented 8 years ago

@davepacheco:

Do you believe that the presence of a promise-based core API in Node will not contribute to the more widespread of use promises, and particularly in situations where module authors may have used the API without even knowing that it would impact other users' ability to use postmortem debugging?

I do not believe landing a flagged promises implementation will have a noticeable effect on the growth rate of this problem. My suspicion is that the issue is more likely to be forced by async/await than core promises. However, I could be wrong! It may be that the set of users that want to use PMWG tooling and the set of users that want to use promises are entirely disjoint, in which case, there should be no problem. PMWG tooling users (and callback users) can continue to use their preferred async primitives, while promise users can use theirs — the solutions can coexist peacefully.

@raynos:

To disable promises, you would be able to set global.Promise = null (or a suitable constructor that throws an exception stating that promise use is banned.) It shouldn't pose a problem, especially with the preload modules flag — you should be able to delete the promise reference before anyone else is able to. This could even be a flag that Node provides for application authors that want to explicitly disable promises.

mikeal commented 8 years ago

@raynos also, we already ship promises w/ Node.js and you have to disable them now the same way @chrisdickinson just suggested if you want to keep them out of your code base.

kriskowal commented 8 years ago

@Raynos Let’s not forget EventEmitter. Too late to back that out of core, so we have to ban that in tooling too.

Raynos commented 8 years ago

@mikeal @chrisdickson

I think promises already exist is not a fair comparison to adding new promise based APIs. Node simply ships v8 unmodified.

Function.prototype.bind is terrible for both performance and debugging reasons but I would never ask node core to remove it.

I would never ask node core to remove global.Promise.

The existence of promises in v8 has nothing to do with my previous comment and is orthogonal.

chrisdickinson commented 8 years ago

@raynos Ah, I intended to address "Once promises are in core we need to introduce excessive tooling to once again ban them" — specifically, it should just be a one-liner that can be placed at app-level, introduced as a single-line preload module, or standardized as a command line flag to make that easy to implement at an operational level. That is, turning off the core promise API will be easy to do at an application level — which seems like it would address the root of your concern.

davepacheco commented 8 years ago

@chrisdickinson That one-liner only guarantees that the program will crash when promises are used, right? If the code path using promises is not executed during program startup, you might not find out for a long time? If true, this doesn't allow one to identify modules using promises unless you also exercise all code paths.

chrisdickinson commented 8 years ago

@davepacheco That is true, though the failure should be within reach of most test suites. Additionally, --abort-on-uncaught-exception should produce a core that allows easy identification of the offending API.

benjamingr commented 8 years ago

@davepacheco great use case example at:

A prototypical use-case for postmortem debugging would be: during an http 'request' handler that calls through several different modules while handling the request, the program ends up calling a JavaScript function that attempts to access property "foo" of a value that's undefined. The program crashes and dumps core. The core file allows the developer to see all of the information about the http request (method, url, headers, bytes read), all of the handlers on the stack, the arguments to each of these handlers, the "this" argument to each function on the stack, and so on. The value is that sometimes, the stack trace just isn't enough, and this other information helps nail the problem. Besides that, all this works even if the problem is not reproducible even once, and it does not require you to stop the program in a debugger (so you can use this in production).

I would never personally use core dumps to debug these sort of errors (proper error handling means that the Error instance itself carries the parts of that information that are relevant) but I can see the merit.

Does this work across the asynchronous call stack? That is, is there a way with tooling to get access to the entire asynchronous call chain?

With promises, the unhandledRejection event contains (in addition to the rejection reason) the promise itself that rejected. This means that whatever is performing the asynchronous tracking can probably keep a track of that chain (it's available in the debugger for instance) - so that should be possible if it is with callbacks (although I assume it doesn't follow variables in both since there is no reason for them to stay allocated in the callstack without async debugging turned on with slows things down considerably).

Does the following example generate an acceptable core dump?

Promise.resolve().then(x => {y = 5; throw Error(); }).catch(e => {
    process.exit(1);
});

rnchamberlain commented 8 years ago

Not sure how relevant this is, but in Java you can catch anything (eg Throwable). Applications can for example catch OutOfMemoryError, and attempt to continue. Likewise NullPointerException or ClassNotFoundException. So we often lose the original failure context and stack in the post-mortem diagnostics. The Java solution is to have trace and dump triggers on 'throw' and 'catch' as well as on 'uncaught', plus some selection/filters on those. The proposed --abort-on-unhandled-rejection looks a bit similar.

davepacheco commented 8 years ago

Does this work across the asynchronous call stack? That is, is there a way with tooling to get access to the entire asynchronous call chain?

If you mean: does the callback for the end of an async operation include some reference the stack that kicked off the operation -- I don't know a way to get that. But it's easy to add that yourself on an ad-hoc basis, and if the runtime were to start tracking it as you alluded to, then that information would be available in the core file.

bmeck commented 8 years ago

That is, is there a way with tooling to get access to the entire asynchronous call chain?

@davepacheco @benjamingr async_wrap can be used to tool this pretty easily, but I don't think it works with the Promise Job Queue. It by default does not keep reference to the stack it created the async operation from, but it does have the hooks if you wished too. Hard to avoid leaks if you start keeping a chain of stacks though, needs to be enabled manually.

benjamingr commented 8 years ago

But it's easy to add that yourself on an ad-hoc basis, and if the runtime were to start tracking it as you alluded to, then that information would be available in the core file.

I don't understand, you made a very compelling argument about inspecting the this and arguments of every function and now you're telling me it doesn't follow the async stack.

With promises I get a complete async call stack (like event emitters) but not the arguments, I'd love to be able to get the arguments and this value, but if I can't get that with callbacks either I'd rather pick the solution that lets me give me the async call stack than the one that does not. I'd love to be able to have a solution that gives me not only the async call stack but also the arguments and this value - if callbacks can do that and promises prevent them from doing so that's definitely something to consider.

bmeck commented 8 years ago

@benjamingr can you explain **complete** async call stack / how you are getting it. I think we might be talking about slightly different things.

benjamingr commented 8 years ago

@bmeck sure, I have something like:

function foo(cb) {
    setTimeout(cb, 100);
}
function bar(cb) {
    setTimeout(cb, 15);
}
function baz() {
    bar(function() { foo(bak); });
}
function bak(data) {
     throw new Error();
}
baz();

I need a call stack here that says bak was called inside foo that was called inside an anonymous function that was called inside bar that was called inside baz.

bmeck commented 8 years ago

@benjamingr and you said that Promises give you this without you turning anything on?

With promises I get a complete async call stack

I am wondering how, I know even in Chrome async stacks are only enabled manually due to memory leaks.

benjamingr commented 8 years ago

@bmeck I run most of my servers happily with bluebird with longStackTraces turned on in production, it's a lot faster than Chrome's async stack traces but it does incur a slight performance penalty. It doesn't give me everything I want but it's reasonably fast (and doesn't leak memory).

bmeck commented 8 years ago

@benjamingr

longStackTraces are not available to native promises, on same level as using something like https://www.npmjs.com/package/longjohn
built in event loop hooks work for everything except the Promise Job Queue, this would need to be changed if we want to get arguments. The way bluebird does this does not allow for getting arguments safely (async stacks in chrome actually capture the whole scope, not just arguments)
Bluebird does pruning and does not give a complete stack trace (unlike chrome), and so misses recursive operations:

"use strict";
Promise = require('bluebird');
Promise.longStackTraces();
var promprom = Promise.resolve(-1);
var max = 11;
function next(i) {
  console.log('nesting');
  if (i > max) {
    throw new Error('long error');
  }
  else {
    return Promise.resolve(i+1).then(next);
  }
}
next(0).catch(e => {
  console.error(e.stack);
});

Raynos commented 8 years ago

Whenever I debug an ubcaught I do not use long stack traces.

I just look at the stack that got me where I am in the current tick. For performance reasons I avoid all closures where possible and allocate named constructors for every "async.waterfall/series" type of code.

I never need a long stack trace, all the information and state needed is on "this". This type of closure less programming is very debuggable and I assume @davepacheo does similar things.

vkurchatkin commented 8 years ago

This will be the first feature in node core that cannot be used.

Why? It can be used and will be used by some people, will be ignored by some other people. Like everything else.

amazing post mortem tooling out of the box

Kind of not true. "Out of the box" tooling is supposed to work on major OSes at least.

Promises are a part of the language. If you write "high performance javascript", then you probably have tooling that bans anything that is not ES3 in place already.

benjamingr commented 8 years ago

@bmeck sorry, I missed that comment.

longStackTraces are not available to native promises

Yes, this is something native promises will have to fix first. I was merely talking about my debugging flow. The long stack traces provided by promise libraries are better than tools like longjohn because they are aware of how you combine promises (and not just entry).

built in event loop hooks work for everything except the Promise Job Queue, this would need to be changed if we want to get arguments. The way bluebird does this does not allow for getting arguments safely (async stacks in chrome actually capture the whole scope, not just arguments)

Would you mind elaborating on that? I think Node doesn't use v8's automatic promise queue thing - but @vkurchatkin is here and he wrote that code IIRC so maybe he can elaborate on how much work it'd take to instrument promises in Node like this.

nodejs / post-mortem

promises and post-mortem debugging #16