nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

stdio buffered writes (chunked) issues & process.exit() truncation #6456

Closed eljefedelrodeodeljefe closed 3 years ago

eljefedelrodeodeljefe commented 8 years ago

If this is currently breaking your program, please use this temporary fix:

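// Switch stdout/stderr to blocking writes when they are attached to a TTY.
// Note: this relies on the private _handle API.
[process.stdout, process.stderr].forEach((s) => {
  s && s.isTTY && s._handle && s._handle.setBlocking &&
    s._handle.setBlocking(true);
});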

As noted in #6297, async stdio will not be flushed upon an immediate process.exit(). This may expose general deficiencies around C exit() being called from C++ functions without the stack being properly unwound, and it was probably not introduced solely by the latest libuv updates. We should consider adding flushing, providing a graceful exit, and/or improving the unwinding of C++ stacks.

cc @jasnell, @kzc, @Qix-, @bnoordhuis

Issues

Discussion has already taken place in several places, e.g. #6297, #6456, #6379

Summaries of Proposals

Proposals are not mutually exclusive and could lead to semantically unrelated contributions.

  • aid with process.stdout.flush()
  • process.setBlocking(true)
  • node --blocking-stdio
  • longjmp() towards main at exit in C++
  • move parts of process.exit() / process.reallyExit() to new method os.exit()
  • golang panic()- or c++ throw-like stack unwinding

    Discussions by Author (with content)


@ChALkeR I tried to discuss this some time ago at IRC, but postponed it for quite a long time. Also I started the discussion of this in #1741, but I would like to extract the more specific discussion to a separate issue.

I could miss some details, but will try to give a quick overview here.

Several issues here:

  1. Many calls to console.log (e.g. calling it in a loop) could chew up all the memory and die — #1741, #2970, #3171.
  2. console.log has different behavior while printing to a terminal and being redirected to a file. — https://github.com/nodejs/node/issues/1741#issuecomment-105333932.
  3. Output is sometimes truncated — #6297, there were other ones as far as I remember.
  4. The behaviour seems to differ across platforms.

As I understand it — the output has an implicit write buffer (as it's non-blocking) of unlimited size.
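For code that writes to a stream directly rather than through console.log, the unbounded buffering described here can be avoided by honouring backpressure. A minimal sketch, assuming only Node's standard stream contract (write() returning false and the 'drain' event) and events.once; the helper name writeLines is made up for illustration:

```javascript
const { once } = require('events');

// Write lines one by one, pausing whenever the stream reports that its
// internal buffer is full, so queued data cannot grow without bound.
async function writeLines(stream, lines) {
  for (const line of lines) {
    const ok = stream.write(line + '\n');
    if (!ok) {
      // write() returned false: wait for the buffer to empty before continuing.
      await once(stream, 'drain');
    }
  }
}
```

console.log itself offers no such signal, which is why a tight loop of console.log calls can still pile up data in memory.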

One approach to fixing this would be to:

  1. Introduce an explicit cyclic write buffer.
  2. Make writes to that cyclic buffer blocking.
  3. Make writes from the buffer to the actual output non-blocking.
  4. When the cyclic buffer reaches its maximum size (e.g. 10 MiB), block further writes to the buffer until a corresponding part of it is freed.
  5. On (normal) exit, make sure the buffer is flushed.

For almost all cases, except for the ones that are currently broken, this would behave as a non-blocking buffer (because writes to the buffer are considerably faster than writes from the buffer to file/terminal).

For cases when the data is being piped to the output too quickly and when the output file/terminal does not manage to output it at the same rate — the write would turn into a blocking operation. It would also be blocking at the exit until all the data is written.

Another approach would be to monitor (and limit) the size of data that is contained in the implicit buffer coming from the async queue, and make the operations block when that limit is reached.
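The first approach can be modelled in plain JavaScript. This is only a sketch of the idea, not how Node or libuv implement stdio: the class name BoundedWriteBuffer and the sink callback are hypothetical, and a simple chunk queue stands in for a true cyclic buffer.

```javascript
// Hypothetical model: writes are queued and drained asynchronously, but once
// the queue would exceed maxBytes the caller is blocked by draining in place.
class BoundedWriteBuffer {
  constructor(sink, maxBytes) {
    this.sink = sink;          // function(chunk): consumes one chunk synchronously
    this.maxBytes = maxBytes;  // upper bound on buffered bytes
    this.queue = [];
    this.size = 0;
    this.scheduled = false;
  }

  write(chunk) {
    const buf = Buffer.from(chunk);
    // Buffer full: drain synchronously until the new chunk fits (step 4).
    while (this.queue.length > 0 && this.size + buf.length > this.maxBytes) {
      this.drainOne();
    }
    this.queue.push(buf);
    this.size += buf.length;
    this.scheduleDrain();
  }

  scheduleDrain() {
    if (this.scheduled) return;
    this.scheduled = true;
    // Normal case: drain later, off the caller's stack (step 3).
    setImmediate(() => {
      this.scheduled = false;
      if (this.queue.length === 0) return;
      this.drainOne();
      if (this.queue.length > 0) this.scheduleDrain();
    });
  }

  drainOne() {
    const buf = this.queue.shift();
    this.size -= buf.length;
    this.sink(buf);
  }

  // Called on normal exit so nothing is left behind (step 5).
  flushSync() {
    while (this.queue.length > 0) this.drainOne();
  }
}
```

A real sink could be built on fs.writeSync(1, chunk), with the caveat that a synchronous write to a descriptor in non-blocking mode may fail with EAGAIN.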

Qix- commented 8 years ago

Perhaps a list of issues this would address and/or close would be helpful to include since this seems to be a sprawling issue with a lot of fragmented discussion.

eljefedelrodeodeljefe commented 8 years ago

Yes, it's just a little late in Europe :( Keep 'em coming and I'll add them above.

vsemozhetbyt commented 8 years ago

Also see #6410

vsemozhetbyt commented 8 years ago

Considering all the clarification in #6410, is there also a theoretical possibility that not just several I/O calls to stdout might be lost, but that even one simple console.log() before process.exit() could be truncated or discarded?

Qix- commented 8 years ago

@vsemozhetbyt that is especially correct if I'm understanding your question correctly.

addaleax commented 8 years ago

@vsemozhetbyt If it’s big enough, definitely. See e.g. test/known_issues/test-stdout-buffer-flush-on-exit.js.

eljefedelrodeodeljefe commented 8 years ago

To reproduce you can do

require('crypto').randomBytes(100000000, function(err, buffer) {
  var token = buffer.toString('hex');
  console.log(token);
  process.exit(0)
});

Edit: @addaleax's hint: test does a similar thing. Sorry @addaleax

kzc commented 8 years ago

@vsemozhetbyt This output is truncated with node 6.0.0 on Mac after approx 40 lines:

node -e 'console.log("The quick brown fox jumps.\n".repeat(40000)); process.exit(7);'

node 5.x and earlier output all 40000 lines on Mac.

vsemozhetbyt commented 8 years ago

So now, if a user does not want to restructure the code, all one can do is write something like

const err = {name: 'Error', message: 'something wrong'};
throw err;

instead of

console.log('Error: something wrong');
process.exit(1);

and to deal with all the uncontrolled clutter of debug output?

kzc commented 8 years ago

throw err; and to deal with all the uncontrolled clutter of debug output?

For dev code, sure, but uncaught exceptions in production code are not very elegant or professional.

kzc commented 8 years ago

Related: #6379

Also discusses process.stdio.setBlocking(Boolean)

eljefedelrodeodeljefe commented 8 years ago

added @chalkers thread and updated this issue with some summaries and stuff.

Qix- commented 8 years ago

@kzc

node 5.x and earlier output all 40000 lines on Mac.

Not sure what you're talking about.

#!/usr/bin/env bash
. ~/.nvm/nvm.sh

uname -a
echo

function do_buffer_test {
    node <<< 'console.log((new Array(40000)).join("Hello! this is a test!\n"));' | wc -l
    node <<< 'console.log((new Array(40000)).join("Hello! this is a test!\n")); process.exit(1)' | wc -l
    node <<< 'console.log((new Array(40000)).join("Hello! this is a test!\n")); process.reallyExit(1)' | wc -l
    node <<< 'console.log((new Array(40000)).join("Hello! this is a test!\n")); process.abort()' | wc -l
    node <<< 'for (var i = 0; i < 40000; i++) console.log("Hello! this is a test!");' | wc -l
    node <<< 'for (var i = 0; i < 40000; i++) console.log("Hello! this is a test!"); process.exit(1)' | wc -l
    node <<< 'for (var i = 0; i < 40000; i++) console.log("Hello! this is a test!"); process.reallyExit(1)' | wc -l
    node <<< 'for (var i = 0; i < 40000; i++) console.log("Hello! this is a test!"); process.abort()' | wc -l
}

nvm install 0.10
do_buffer_test

nvm install 0.12
do_buffer_test

nvm install 1
do_buffer_test

nvm install 2
do_buffer_test

nvm install 3
do_buffer_test

nvm install 4
do_buffer_test

nvm install 5
do_buffer_test

nvm install 6
do_buffer_test

$ ./test-buffers.sh
Darwin JunonBox.local 15.4.0 Darwin Kernel Version 15.4.0: Fri Feb 26 22:08:05 PST 2016; root:xnu-3248.40.184~3/RELEASE_X86_64 x86_64

v0.10.44 is already installed.
Now using node v0.10.44 (npm v2.15.0)
   40000
   40000
   40000
   40000
   40000
   40000
   40000
   40000
v0.12.13 is already installed.
Now using node v0.12.13 (npm v2.15.0)
   40000
   40000
   40000
   40000
   40000
   40000
   40000
   40000
iojs-v1.8.4 is already installed.
Now using io.js v1.8.4 (npm v2.9.0)
   40000
    2849
    2849
    2849
   40000
   40000
   40000
   40000
iojs-v2.5.0 is already installed.
Now using io.js v2.5.0 (npm v2.13.2)
   40000
    2849
    2849
    2849
   40000
   40000
   40000
   40000
iojs-v3.3.1 is already installed.
Now using io.js v3.3.1 (npm v2.14.3)
   40000
    2849
    2849
    2849
   40000
   40000
   40000
   40000
v4.4.3 is already installed.
Now using node v4.4.3 (npm v2.15.1)
   40000
    2849
    2849
    2849
   40000
   40000
   40000
   40000
v5.11.0 is already installed.
Now using node v5.11.0 (npm v3.8.6)
   40000
    2849
    2849
    2849
   40000
   40000
   40000
   40000
v6.0.0 is already installed.
Now using node v6.0.0 (npm v3.8.6)
   40000
    2849
    2849
    2849
   40000
   40000
   40000
   40000

$ ./test-buffers.sh
Linux -snip- 3.18.27 #1 SMP Wed Feb 17 01:14:23 UTC 2016 x86_64 GNU/Linux

######################################################################## 100.0%
Now using node v0.10.44 (npm v2.15.0)
Creating default alias: default -> 0.10 (-> v0.10.44)
40000
40000
40000
40000
40000
40000
40000
40000
######################################################################## 100.0%
Now using node v0.12.13 (npm v2.15.0)
40000
40000
40000
40000
40000
40000
40000
40000
Downloading https://iojs.org/dist/v1.8.4/iojs-v1.8.4-linux-x64.tar.gz...
######################################################################## 100.0%
Now using io.js v1.8.4 (npm v2.9.0)
40000
2849
2849
2849
40000
40000
40000
40000
Downloading https://iojs.org/dist/v2.5.0/iojs-v2.5.0-linux-x64.tar.xz...
######################################################################## 100.0%
Now using io.js v2.5.0 (npm v2.13.2)
40000
2849
2849
2849
40000
40000
40000
40000
Downloading https://iojs.org/dist/v3.3.1/iojs-v3.3.1-linux-x64.tar.xz...
######################################################################## 100.0%
Now using io.js v3.3.1 (npm v2.14.3)
40000
2849
2849
2849
40000
40000
40000
40000
Downloading https://nodejs.org/dist/v4.4.3/node-v4.4.3-linux-x64.tar.xz...
######################################################################## 100.0%
Now using node v4.4.3 (npm v2.15.1)
40000
2849
2849
2849
40000
40000
40000
40000
Downloading https://nodejs.org/dist/v5.11.0/node-v5.11.0-linux-x64.tar.xz...
######################################################################## 100.0%
Now using node v5.11.0 (npm v3.8.6)
40000
2849
2849
2849
40000
40000
40000
40000
Downloading https://nodejs.org/dist/v6.0.0/node-v6.0.0-linux-x64.tar.xz...
######################################################################## 100.0%
Now using node v6.0.0 (npm v3.8.6)
40000
2849
2849
2849
40000
40000
40000
40000

It looks to me like this started happening when io.js forked. Perhaps @indutny can shed some light on the subject.

eljefedelrodeodeljefe commented 8 years ago

Adding two other possibilities.

kzc commented 8 years ago

@Qix- @eljefedelrodeodeljefe Please understand that when you pipe the results you are changing the test. It follows a different code path in node and libuv. You have to observe it on the terminal. So in that regard, the test is more difficult to automate.

I am observing this behavior on Mac OS X 10.9.5. The behavior is different on Mac and Linux. Mac stdout appears to have had blocking writes to the tty historically. See https://github.com/nodejs/node/issues/6297#issuecomment-213964747
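A quick way to see which of these code paths a given run takes is to check what kind of stream stdout is. A minimal sketch, assuming only the documented isTTY property and net.Socket; the function name stdioKind and its labels are made up for illustration:

```javascript
const net = require('net');

// Classify a stdio stream by the kind of object Node created for it.
function stdioKind(stream) {
  if (stream.isTTY) return 'tty';                          // attached to a terminal
  if (stream instanceof net.Socket) return 'pipe or socket';
  return 'file or other';                                  // e.g. redirected to a file
}

// Report on stderr so the message stays visible when stdout is piped.
console.error('stdout is:', stdioKind(process.stdout));
```

Running the script once directly in a terminal and once as `node script.js | cat` should report different kinds, which is why a piped test does not exercise the tty path.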

kzc commented 8 years ago

+1 for new function os.exit(Boolean) that drains stdout/stderr upon exit and leave process.exit() as is.

Actually may have to leave process.exit as is because of the prevalence of workarounds to the stdout flushing problem such as node-exit which might break if the behavior of process.exit changes.

saghul commented 8 years ago

If memory serves right, and by looking at the results posted by @Qix-, I think this is where things started to change: https://github.com/libuv/libuv/commit/b197515367d1a996dca9009483d202b306f9474e because we started to open writable TTYs in non-blocking mode. Follow the commit trail for the reasoning; reverting is not an option.

kzc commented 8 years ago

reverting is not an option.

Yes, and that particular commit also introduced the tty redirection bug in src/unix/tty.c that was fixed in libuv 1.9.0.

saghul commented 8 years ago

@kzc your point being?

kzc commented 8 years ago

Just adding weight to reverting is not an option.

kzc commented 8 years ago

@saghul I will add that prior to the tty redirection fix, Mac stdout appeared to be blocking based on my observations with process.exit tests never truncating tty output on Mac. As of that tty redirect fix it is now non-blocking to make it on par with Linux behavior.

bnoordhuis commented 8 years ago

Can I suggest closing out all other issues with a "discussion continues in #6456" comment?

jasnell commented 8 years ago

+1 ... we don't need multiple issues covering the same thing.

jasnell commented 8 years ago

See: https://github.com/nodejs/node/pull/6477

chalkers commented 8 years ago

@eljefedelrodeodeljefe you got the wrong @ChALkeR :)

bcoe commented 8 years ago

This is causing some fairly wonky behavior with yargs, two questions:

  1. should we be classifying this as a bug (the new flushing behavior seems unintuitive), if so is there a separate tracking ticket I should be following?
  2. is there a recommended workaround, or should I hold my horses for a patch?

Between commander, yargs, and optimist (all of which now exhibit broken behavior) this is going to be hitting a lot of people (about 1,600,000 installs a day).

kzc commented 8 years ago

@bcoe You won't find consensus on the "process.exit() not flushing stdio" issue because many node devs don't think it's a problem.

If you must use process.exit(), the only known workaround is to have stdout and stderr block at application start:

process.stdout._handle.setBlocking(true);
process.stderr._handle.setBlocking(true);

Cue the "it's not supported" and "that's not the node way" rebuttals...

eljefedelrodeodeljefe commented 8 years ago

@bcoe this is the tracking ticket. If possible, don't do ._handle.setBlocking, since this will affect the whole process that users start with yargs. There will definitely be no revert on the libuv side, and this shouldn't be considered a bug. No one has come up with a decent workaround for flushing and properly unwinding on exit. I think it's gonna take a while.

There are workarounds, however, that would be immediately possible but would require refactoring, namely avoiding the programmatic use of exit handlers. Where exactly in the code base is that a problem?

bcoe commented 8 years ago

@eljefedelrodeodeljefe the problem is less with yargs itself, and more with the consuming library. With optimist, yargs, and (I would guess) commander, there are commands that force an exit preventing the program consuming the library from attempting to handle the parser output:

var argv = require('yargs')(['--help'])
  .help()
  .argv

console.log('we should never get here');

The above code would never hit the console.log line, and would process.exit(0);. Perhaps this would be an acceptable workaround?

if (shouldExit) {
  process.stdout._handle.setBlocking(true);
  process.stderr._handle.setBlocking(true);
  console.log(yargs.help());
  process.exit(0);
}

avoid setting stdout and stderr to blocking until we already know we are about to exit?

eljefedelrodeodeljefe commented 8 years ago

dragging in @bnoordhuis. Have you worked on this in the meantime? Would that be an acceptable hotfix until we come up with a proper solution?

sindresorhus commented 8 years ago

Using process.exit() is common convention in CLI tools. The change in Node.js 6 has pretty much broken everything CLI related... I use process.exit() in meow which a lot of packages depend on (5,312,249 downloads in the last month).

process.exit() will be especially useful when ES2015 modules come to Node.js, as we can then no longer return in the top-scope, so short-circuiting will be effectively impossible without a nesting mess.

eljefedelrodeodeljefe commented 8 years ago

Yeah, I know. However, this is really bad, sorry. It's something we need to live with now. Proper handling there would have been "returning from main" or using event emitters. The problems with synchronous and async behaviours are documented, though.

Ah, the last point is interesting

eljefedelrodeodeljefe commented 8 years ago

The solution probably will be to have a function forcing the flush. From a style point of view this whole exit handler business seems bad still though :(
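Until such a function exists, a forced flush can only be approximated from user land. One common approximation, sketched here under the assumption that a write callback fires only after earlier queued writes on the same stream have been handed off, is to queue an empty write on each stream and exit from the last callback. The name exitAfterFlush and its parameters are hypothetical:

```javascript
// Exit only after every listed stream has called back on a final empty write.
// `streams` and `exit` are parameters so the helper can be tried out safely.
function exitAfterFlush(
  code,
  streams = [process.stdout, process.stderr],
  exit = (c) => process.exit(c)
) {
  let pending = streams.length;
  if (pending === 0) return exit(code);
  for (const stream of streams) {
    // The empty chunk is queued behind anything already written.
    stream.write('', () => {
      pending -= 1;
      if (pending === 0) exit(code);
    });
  }
}
```

Unlike process.exit(), this returns to the caller immediately, so code after the call keeps running until the callbacks fire.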

eljefedelrodeodeljefe commented 8 years ago

@Fishrock123 can you pick up @sindresorhus's comment on not being able to return from top-scope in ES2015 modules? At least we'll probably need documentation about this.

kzc commented 8 years ago

Perhaps this would be an acceptable workaround?

Only if you can guarantee that nothing else was output previously.

node 6.0.0 on Mac terminal:

$ node -e "console.log('The quick brown fox jumps.\n'.repeat(40000)); process.stdout._handle.setBlocking(true); console.log('Usage: ...'); process.exit(1);"
The quick brown fox jumps.
The quick brown fox jumps.
The quick brown fox jumps.
... 30 lines deleted ...
The quick brown fox jumps.
The quick brown fox jumps.
The quick brown $

kzc commented 8 years ago

Patch to flush process.stdout and process.stderr upon process.exit() on unix:

https://github.com/kzc/node/commit/92fc9e0d992f043a4b92d9d286514328f5df1b6d

Tested successfully on Mac. Should work on Linux as well.

No attempt made at a Windows fix, but if one is needed it would follow the same idea in libuv. Not sure if this issue exists on Windows, as a few comments in the code suggest stdout/stderr blocks on that platform.

If someone wants to refine this patch and get it merged into node, go for it.

eljefedelrodeodeljefe commented 8 years ago

I might have found something less intrusive: https://github.com/nodejs/node/pull/6735 basically it's my favorite: a no-op :) Should work fine there though, passes tests and is backwards compatible.

eljefedelrodeodeljefe commented 8 years ago

Scratch that. Need to refine...

addaleax commented 8 years ago

I think @kzc’s suggestion is definitely worth pursuing, but I don’t know the situation on Windows either.

Fishrock123 commented 8 years ago

Whoops, only seeing this now. There are some things missing here, standby.

Fishrock123 commented 8 years ago

This is the original issue: https://github.com/nodejs/node/issues/784

In it, @vkurchatkin found that this patch "fixes" the issue:

diff --git a/lib/net.js b/lib/net.js
index 030083d..efebd03 100644
--- a/lib/net.js
+++ b/lib/net.js
@@ -135,8 +135,7 @@ function Socket(options) {
     this._handle = createHandle(options.fd);
     this._handle.open(options.fd);
     if ((options.fd == 1 || options.fd == 2) &&
-        (this._handle instanceof Pipe) &&
-        process.platform === 'win32') {
+        (this._handle instanceof Pipe)) {
       // Make stdout and stderr blocking on Windows
       var err = this._handle.setBlocking(true);
       if (err)

There is also significant background in my attempted patch, using the above code: https://github.com/nodejs/node/pull/1771

Namely this, by @bnoordhuis:

A bit of background: some years ago, I think it was in v0.7, it was decided to make stdout and stderr blocking. Turns out it doesn't work so well for pipes; ttys and files are usually very fast (local ones anyway) but pipes tend to fill up rapidly.

A number of people complained about it so we made stdio-to-pipe non-blocking again (except on Windows, where it's not supported.) I forgot the exact bug reports but the theme was that stdio was too slow; on OS X, the kernel pipe buffer is only about 4 kB, so it's easy to max out.

I believe the issue is now that people complain that output sometimes goes missing at program exit. Ideally, we'd have some way to tell libuv "flush only stdio writes, don't do other I/O" but that may not be straightforward to implement.

As an interim solution, this PR seems fine to me, although I can't predict if or how much it will break existing applications.

The last bit is why my patch didn't land. Smoke testing did not exist at the time, and the patch is not ideal and may break countless things downstream.

There are also links to @bnoordhuis's proposal to fix this in libuv (https://github.com/libuv/libuv/issues/428); however, it was decided that it is probably better for node to handle this.

I don't seem to recall us ever finding where exactly it appeared though.

If memory serves right, and by looking at the results posted by @Qix-, I think this is where things started to change: https://github.com/libuv/libuv/commit/b197515367d1a996dca9009483d202b306f9474e because we started to open writable TTYs in non-blocking mode. Follow the commit trail for the reasoning; reverting is not an option.

Parts of this issue should go back to pre-1.0.0 .. perhaps it was amplified recently but this sounds like a conflation of multiple issues now.


This is causing some fairly wonky behavior with yargs, two questions:

  1. should we be classifying this as a bug (the new flushing behavior seems unintuitive), if so is there a separate tracking ticket I should be following?
  2. is there a recommended workaround, or should I hold my horses for a patch?

@bcoe 1. Yes. 2. Avoid process.exit() to preserve chunked stdio writes.
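One way to follow that advice is to set process.exitCode and let the process end on its own once pending writes have drained. A minimal sketch; process.exitCode is a documented Node property, while the helper name fail is made up for illustration:

```javascript
// Report an error and record the exit status without calling process.exit().
// The process ends by itself once the event loop has nothing left to do,
// which gives queued stdio writes a chance to complete.
function fail(message, code) {
  console.error(message);
  process.exitCode = code;
}

// Typical use in a CLI entry point:
//   if (!valid) { fail('Error: something wrong', 1); return; }
```

This only helps when nothing else keeps the event loop alive (open servers, timers, and so on), and the caller still has to stop its own work, for example by returning from the enclosing function.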

The change in Node.js 6 has pretty much broken everything CLI related...

@sindresorhus This goes back to v1.0.0?

Again, sounds like multiple issues, or amplification of the existing one?

+1 for new function os.exit(Boolean) that drains stdout/stderr upon exit and leave process.exit() as is.

Strongly disagree. This is a bug that ought to be fixed.


My suggestion from "process: add process.exitSoon()" (https://github.com/nodejs/node/pull/6477) is as follows:

Make process.exit() (or rather, void Exit()):

  • uv_stop() (I think) the event loop
    • or whatever to stop anything new from happening but while keeping the threads alive to do writes
  • attempt to flush any data
  • exit

At the same time, I don't think we should alter process.abort().

Edit: it is possible that this is out of scope for process.exit(), but even if we add something new (which should be the fallback), it should have that behavior.

I'm now pretty sure it is within scope, although I'm not sure how possible my idea is.

Note: I have not yet had time to look at @kzc's patch.

saghul commented 8 years ago

uv_stop() (I think) the event loop

or whatever to stop anything new from happening but while keeping the threads alive to do writes

Writes happen in the loop thread, there are no other threads doing the writes. So if the loop is stopped no data will be written.

Note: I have not yet had time to look at @kzc's patch.

The patch is basically what @bnoordhuis proposed but at the handle level instead of a single global function.

Fishrock123 commented 8 years ago

Writes happen in the loop thread, there are no other threads doing the writes. So if the loop is stopped no data will be written.

Hmmm. What I mean by that may be more useful then: Shut down as much as possible so no other JS code runs.

saghul commented 8 years ago

I see. So some uv_walk + uv_close all handles except the ones in use for stdio + one last uv_run then.

eljefedelrodeodeljefe commented 8 years ago

I was thinking about this too. Seems legit. Is it not possible to do this in streamwrap, too? Attach process.exit to the last write of the stream that is currently happening?

jasnell commented 8 years ago

@saghul ... I would describe it slightly differently: What we need is essentially a uv_graceful_stop() that:

  1. Puts the loop into a 'stopping' mode that disallows any new requests on handles,
  2. Closes all handles that do not have existing pending requests,
  3. Allows existing pending requests on handles to complete,
  4. Closes the remaining handles when all requests are complete,
  5. Optionally creates a timer to force handles to close / requests to cancel if requests take too long to complete.

Fishrock123 commented 8 years ago

  • Allows existing pending requests on handles to complete,
  • Closes the remaining handles when all requests are complete,

I'm not sure these are within scope, if you're calling process.exit() you are telling the process to ignore other connections and shut down?

  • Optionally creates a timer to force handles to close / requests to cancel if requests take too long to complete.

That's definitely going somewhere beyond this imo

jasnell commented 8 years ago

I respectfully disagree. Allowing the existing requests to complete is what this discussion is about, yes? For me the requirement here is to have a graceful exit option. There are times when what you want is to shutdown immediately without completing the pending tasks and there are times when what you want is to shutdown cleanly with pending tasks completed or given a chance to clean up. Personally I do not want (nor do I believe it is necessary) to change the existing behavior of process.exit(). What I want is the ability to simply say, "Hey, we're shutting things down now, please finish what you're doing".
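The graceful-exit behaviour argued for here, together with the numbered uv_graceful_stop() steps listed earlier, can be illustrated with a toy model. This is plain JavaScript with made-up names (ToyLoop, addHandle, request, gracefulStop); it does not touch libuv, leaves out the optional timeout step, and ignores handles that never carry requests:

```javascript
// Toy model: "handles" are plain objects tracking a count of pending requests.
class ToyLoop {
  constructor() {
    this.handles = new Set();
    this.stopping = false;
  }

  addHandle(name) {
    const handle = { name, pending: 0, closed: false };
    this.handles.add(handle);
    return handle;
  }

  // Start a request; `work` receives a `done` callback to call on completion.
  request(handle, work) {
    if (this.stopping || handle.closed) return false; // step 1: refuse new work
    handle.pending += 1;
    work(() => {
      handle.pending -= 1;                            // step 3: request completes
      if (this.stopping && handle.pending === 0) {
        this.close(handle);                           // step 4: close when idle
      }
    });
    return true;
  }

  close(handle) {
    handle.closed = true;
    this.handles.delete(handle);
  }

  gracefulStop() {
    this.stopping = true;
    for (const handle of [...this.handles]) {
      if (handle.pending === 0) this.close(handle);   // step 2: close idle handles
    }
  }
}
```

In this model a handle with an in-flight write survives gracefulStop() until its request calls done, while every attempt to start new work is rejected.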

Fishrock123 commented 8 years ago

Allowing the existing requests to complete is what this discussion is about, yes?

This is about having stdio finish chunked writes, mostly.

There are times when what you want is to shutdown immediately without completing the pending tasks and there are times when what you want is to shutdown cleanly with pending tasks completed or given a chance to clean up.

I don't disagree, but that isn't process.exit()'s worry.

Personally I do not want (nor do I believe it is necessary) to change the existing behavior of process.exit(). What I want is the ability to simply say, "Hey, we're shutting things down now, please finish what you're doing".

Sure, but this is actually a bug. This used to work, why it doesn't currently is complex and awkward, but we should still fix it. (Also it violates users expectations far beyond just exiting before other connections.)

saghul commented 8 years ago

What we need is essentially a uv_graceful_stop() that:

I'm not sure how useful this is for the general public, seems very Node specific, but let's see:

Puts the loop into a 'stopping' mode that disallows any new requests on handles,

Doable with a flag on the loop.

Closes all handles that do not have existing pending requests,

Not all handles have requests associated with them, and there are also standalone requests. If we close all handles which don't have requests we'd also close the idle and check handles used for process.nextTick, it would turn things into a royal mess.

Allows existing pending requests on handles to complete,

Sure.

Closes the remaining handles when all requests are complete,

We'd need something new here, since currently, if you close a handle with pending requests, they are cancelled (if possible).

Optionally creates a timer to force handles to close / requests to cancel if requests take too long to complete.

Not all requests are cancellable.

Overall I think this is an overkill approach for the problem at hand and I don't see it happening any time soon (unless someone wants to volunteer the time to come up with a thorough design proposal and implementation). This is about flushing stdio streams on exit, which is a relatively specific task IMHO.