aight8 closed this issue 7 years ago
Can you provide some example code that demonstrates the problem?
I cannot reproduce this on Windows 7 x64:
const fs = require('fs');
const cp = require('child_process');
console.log(process.version);
fs.writeFileSync('data.txt', 'x'.repeat(8192 + 100));
console.log(fs.statSync('data.txt').size);
cp.exec('cat data.txt', (error, stdout, stderr) => {
  console.log(stdout.length);
});
v7.10.0
8292
8292
Example:
const child_process = require('child_process')
const stdout = child_process.execFileSync(
  'yarn',
  ['info', 'joi', '--json'],
  {
    encoding: 'utf8',
    maxBuffer: 500 * 1024
  }
)
console.log('Stdout length:', stdout.length)
try {
  const stdoutObject = JSON.parse(stdout)
  console.log('stdout is a valid JSON object.')
} catch (err) {
  console.log('stdout is an invalid JSON object.')
  console.error(err.toString())
}
Output:
$ node ./stdout-8192.js
Stdout length: 8192
stdout is an invalid JSON object.
SyntaxError: Unexpected token m in JSON at position 8188
# bytes
$ yarn info joi --json | wc -c
11222
# lines
$ yarn info joi --json | wc -l
1
edit:
On my virtual machine running Ubuntu 16.04.2 LTS it works, by the way:
Stdout length: 11222
stdout is a valid JSON object.
but that is with Node.js v7.4.0.
edit 2:
Updated Node.js to v7.10.0 on the Ubuntu VM; the output is the same as with v7.4.0.
edit 3: Your example:
const fs = require('fs');
const cp = require('child_process');
console.log(process.version);
fs.writeFileSync('data.txt', 'x'.repeat(8192 + 100));
console.log(fs.statSync('data.txt').size);
cp.exec('cat data.txt', (error, stdout, stderr) => {
  console.log(stdout.length);
});
also works correctly on my macOS.
edit 4:
The same code, but using npm (which takes the same arguments in this case), also works well on macOS. I don't know what else yarn is doing, but this problem should be investigated, since the behaviour is inconsistent across operating systems.
Here is the test case isolated from yarn, npm, etc.
const cp = require('child_process');
if (process.argv[2] === 'child') {
  console.log('g'.repeat(8193).toString());
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  child.stdout.on('data', function(d) {
    console.log(d.toString());
  });
}
On macOS with v7.10, a newline is inserted between the 8192nd and 8193rd characters; buffering possibly happens at 8 KB. This does not happen on Linux.
Because of this, the incoming JSON gets corrupted.
@gireeshpunathil Can you write that script as a test case that makes clear what the expected outcome is (i.e. that crashes otherwise)? I’ve tried a variant of your script in a Travis CI run and it seems to work fine on OS X.
On macOS with v7.10, a newline is inserted between the 8192nd and 8193rd characters; buffering possibly happens at 8 KB

Isn't that expected, as the data event is fired twice? Changing that to child.stdout.pipe(process.stdout); seems to print the data correctly.
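As an illustration of that point, here is a minimal sketch adapted from the test case above: console.log prints a newline per 'data' event, so accumulating the chunks first shows that no data is actually lost:
const cp = require('child_process');
if (process.argv[2] === 'child') {
  console.log('g'.repeat(8193));
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  let out = '';
  child.stdout.on('data', (d) => { out += d; }); // two chunks on macOS: 8192 + the rest
  // 'end' fires once the child's side of the pipe is closed
  child.stdout.on('end', () => console.log(out.length)); // prints 8194 (8193 g's + console.log's newline)
}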
Shouldn't the sync functions return all the received data? It's inconsistent behaviour anyway.
I can reproduce the issue on OS X with the first example in https://github.com/nodejs/node/issues/12921#issuecomment-300207668 and with an async version:
const child_process = require('child_process')
const child = child_process.spawn(
  'yarn',
  ['info', 'joi', '--json']
);
let stdout = '';
child.stdout.on('data', (d) => {
  stdout += d;
});
child.stdout.on('close', () => {
  try {
    const stdoutObject = JSON.parse(stdout)
    console.log('stdout is a valid JSON object.')
  } catch (err) {
    console.log('stdout is an invalid JSON object.')
    console.error(err.toString())
  }
});
Running the code with dtruss shows that the child process only writes 8192 bytes. Later on it exits, so could it be a specific issue with yarn interoperating with the node child_process module 🤷‍♂️?
dtruss output (parent PID: 31586, child PID: 31587):
31587/0x61cba: write(0x1, "{\"type\":\"inspect\",\"data\":{\"name\":\"joi\",\"description\":\"Object schema validation\",\"dist-tags\":{\"latest\":\"10.4.1\",\"node-0.x\":\"6.10.1\",\"9-beta\":\"9.0.0-9\"},\"versions\":[\"0.0.1\",\"0.0.2\",\"0.0.3\",\"0.0.4\",\"0.0.5\",\"0.0.6-1\",\"0.0.6\",\"0.0.7\",\"0.0.8\",\"0.0.9\",\"0.0.10\",\"0", 0x2BD6) = 8192 0
31586/0x61cac: kevent(0x5, 0x7FFF5FBF6C30, 0x5) = 1 0
31586/0x61cac: read(0xD, "{\"type\":\"inspect\",\"data\":{\"name\":\"joi\",\"description\":\"Object schema validation\",\"dist-tags\":{\"latest\":\"10.4.1\",\"node-0.x\":\"6.10.1\",\"9-beta\":\"9.0.0-9\"},\"versions\":[\"0.0.1\",\"0.0.2\",\"0.0.3\",\"0.0.4\",\"0.0.5\",\"0.0.6-1\",\"0.0.6\",\"0.0.7\",\"0.0.8\",\"0.0.9\",\"0.0.10\",\"0", 0x10000) = 8192 0
31587/0x61cba: open("/Users/sgimeno/Library/Caches/Yarn/.roadrunner.json\0", 0x1000601, 0x1B6) = 14 0
31587/0x61cba: mmap(0x187DABEB8000, 0x139000, 0x0, 0x1042, 0xFF000000, 0x0) = 0x187DABEB8000 0
31587/0x61cba: munmap(0x187DABEB8000, 0x48000) = 0 0
31587/0x61cba: munmap(0x187DABFB9000, 0x38000) = 0 0
31587/0x61cba: mmap(0x187DABF00000, 0xB9000, 0x3, 0x1012, 0xFF000000, 0x0) = 0x187DABF00000 0
31587/0x61cba: pwrite(0xE, "{\n \"path\": {\n \"{\\\"request\\\":\\\"../lib/cli/index.js\\\",\\\"paths\\\":[\\\"/Users/sgimeno/.nave/installed/7.10.0/lib/node_modules/yarn/bin\\\"]}\": \"/Users/sgimeno/.nave/installed/7.10.0/lib/node_modules/yarn/lib/cli/index.js\",\n \"{\\\"request\\\":\\\"chalk\\\",\\\"paths\\\"", 0xB6166, 0x0) = 745830 0
31587/0x61cba: close(0xE) = 0 0
31587/0x61cba: psynch_cvsignal(0x101582D70, 0x390000003D00, 0x3900) = 256 0
31587/0x61cc1: psynch_cvwait(0x101582D70, 0x360000003A00, 0x3600) = 0 0
31587/0x61cc1: psynch_cvsignal(0x101582D70, 0x3A0000003D00, 0x3A00) = 256 0
31587/0x61cc3: psynch_cvwait(0x101582D70, 0x370000003B00, 0x3700) = 0 0
31587/0x61cc1: __disable_threadsignal(0x1, 0x3A0000003D00, 0x3A00) = 0 0
31587/0x61cc3: psynch_cvsignal(0x101582D70, 0x3B0000003D00, 0x3B00) = 256 0
31587/0x61cc3: __disable_threadsignal(0x1, 0x3B0000003D00, 0x3B00) = 0 0
31587/0x61cc0: psynch_cvwait(0x101582D70, 0x380000003C00, 0x3800) = 0 0
31587/0x61cc0: psynch_cvsignal(0x101582D70, 0x3C0000003D00, 0x3C00) = 257 0
31587/0x61cc2: psynch_cvwait(0x101582D70, 0x390000003D00, 0x3900) = 0 0
31587/0x61cc0: __disable_threadsignal(0x1, 0x3C0000003D00, 0x3C00) = 0 0
31587/0x61cc2: __disable_threadsignal(0x1, 0x390000003D00, 0x3900) = 0 0
31587/0x61cba: __semwait_signal(0x2407, 0x0, 0x0) = 0 0
31587/0x61cba: __semwait_signal(0x2703, 0x0, 0x0) = 0 0
31587/0x61cba: __semwait_signal(0x1D07, 0x0, 0x0) = 0 0
31587/0x61cba: __semwait_signal(0x2803, 0x0, 0x0) = 0 0
31587/0x61cba: psynch_cvwait(0x101582D70, 0x3D0100003E00, 0x3D00) = -1 Err#316
...
31586/0x61cac: wait4(0x7B63, 0x7FFF5FBF68BC, 0x1) = 31587 0
31587/0x61cba: write(0x1, "…", 0x2BD6) = 8192 0
Okay, that's curious. stdio should be blocking, so why doesn't libuv write the rest of the 0x2bd6 bytes immediately after the first write()? Should write() even do that for blocking I/O?
Alternatively, does Node somehow mess up making FD 1 blocking, and consequently uv_try_write doesn’t write the whole block (16 kB is Node’s current limit for using try_write, and if OS X is the only platform where the buffer is smaller than that by default, it might explain platform-specific weirdness) and “forgets” to write the rest?
(It seems links to comments have been broken on GitHub lately: hashes are stripped. I've reported the issue to GitHub. Be careful.)
The problem seems to be that stdout is non-blocking. From the process.exit() documentation:
It is important to note that calling process.exit() will force the process to exit as quickly as possible even if there are still asynchronous operations pending that have not yet completed fully, including I/O operations to process.stdout and process.stderr.
This is a test case that reproduces the error:
const cp = require('child_process');
if (process.argv[2] === 'child') {
  console.log('g'.repeat(81933).toString());
  process.exit();
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  child.stdout.pipe(process.stdout);
}
I guess yarn is calling process.exit() before waiting for the data to be written out.
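A minimal sketch of the pattern that would avoid this, assuming the diagnosis above is right: pass a callback to process.stdout.write() and only exit once it fires, i.e. once the data has been handed off:
// Hypothetical fix for the child in the test case above: wait for the
// write callback before exiting, so no buffered data is dropped.
process.stdout.write('g'.repeat(81933), () => {
  process.exit();
});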
@santigimeno What kind of stream is process.stdout here?
Nvm, found it in the docs. I don’t actually see the point in being so inconsistent about stdio – it doesn’t seem to me like there’s any sufficiently good reason to have async stdio. I’ll try to put a PR together and see if I’m proven wrong by our tests :smile:
@addaleax, @santigimeno - sorry, my test case was misleading. Agreed, that is a manifestation of two callbacks and the newline inserted by console.log.
Having said that - and again, sorry to divert the attention - I guess there is something else happening other than the process.exit(). Here is why:
In continuation of my previous comment, here is a test case which fails even without process.exit(). Note, however, that this fails only on macOS. dtruss showed nothing unnatural, other than failing psynch_cvwait calls.
48131/0xb70b9: write(0x1, "ggg...", 0x10000) = 8192 0
48131/0xb70b9: madvise(0x10229A000, 0x41000, 0x9) = 0 0
48131/0xb70b9: kevent(0x5, 0x7FFF5FBF6C30, 0x2) = 0 0
48131/0xb70b9: kevent(0x5, 0x7FFF5FBF6C30, 0x0) = 0 0
48131/0xb70b9: psynch_cvwait(0x103800830, 0x100000100, 0x0) = -1 Err#316
48131/0xb70b9: psynch_cvwait(0x10230AC40, 0x100000100, 0x0) = -1 Err#316
...
48129/0xb709e: wait4(0xBC03, 0x7FFF5FBF68FC, 0x1) = 48131 0
In the course of its journey to a natural exit, it does not look like the child prepared or attempted to write again - at least from the system's perspective.
const cp = require('child_process');
const fs = require('fs');
if (process.argv[2] === 'child') {
  fs.writeSync(process.stdout.fd, 'g'.repeat(0x10000).toString());
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  child.stdout.on('data', function(d) {
    console.log(d.length);
  });
}
expected: 65536 actual: 8192
A more faithful replica of the yarn scenario is as follows:
const cp = require('child_process');
const fs = require('fs');
if (process.argv[2] === 'child') {
  var str = new fs.SyncWriteStream(process.stdout.fd);
  new Promise(function(res, rej) {
    str.write('g'.repeat(0x10000).toString());
    res();
  }).then(function(gval) {
    console.log('success!');
    process.exit();
  }, function(eval) {
    console.log('error!');
    process.exit();
  });
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  child.stdout.on('data', function(d) {
    console.log(d.toString());
  });
}
expected: 65536 actual: 8192
While 'blocking pipes' can be used to justify why stdout was made non-blocking, this raises more concerns about multiple API specs:
@gireeshpunathil process.stdout.fd is non-blocking in your example, so it's not surprising you see partial writes. Replace:
fs.writeSync(process.stdout.fd, 'g'.repeat(0x10000).toString());
With:
fs.writeSync(fs.openSync('/dev/stdout', 'w'), 'g'.repeat(0x10000).toString());
And the parent will receive the full output.
I don't see a bug here, just consequences of the trade-offs node makes.
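For reference, a runnable version of the full test with that one-line substitution applied (a sketch, assuming macOS or Linux, where /dev/stdout exists):
const cp = require('child_process');
const fs = require('fs');
if (process.argv[2] === 'child') {
  // Opening /dev/stdout yields a fresh, blocking file description,
  // unlike the non-blocking process.stdout.fd.
  fs.writeSync(fs.openSync('/dev/stdout', 'w'), 'g'.repeat(0x10000));
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  let total = 0;
  child.stdout.on('data', (d) => { total += d.length; });
  child.stdout.on('end', () => console.log(total)); // expected and actual: 65536
}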
@bnoordhuis - thanks, the proposed code performs a full write. Agreed, the issues I pointed out are mere manifestations of the fact that stdout is non-blocking, and may warrant only minimal doc changes as caveats.
But what about point 1? Why would the child - in the absence of an explicit exit - exit without writing fully? Isn't that a bug?
@gireeshpunathil Point 1 is this?
write: non-blocking does not imply non-completion.
IIUYC, node normally waits until it's been able to write all backlogged data to the pipe. (The data having been written does not mean the reader consumed it, though.)
Example:
$ node -e 'process.stdout.write("x".repeat(1<<20), () => console.error("done"))' | \
sh -c 'sleep 2 && cat' | \
wc -c
# 2 seconds pass
done
1048576
The caveat here probably is that the pipe buffer on macOS is 64 kB (try the example with 65536 and 65537), which means quite a bit of data can get lost in transit if the reader exits before consuming everything.
@bnoordhuis, yes - point 1 was about exactly that.
Are you hinting that the reader (in my case, the parent node process) is exiting prematurely? With a listener on the child's data, I thought it would wait until the child's end of the pipe is closed - isn't that the case?
With your example, I (the cat command) am able to get all the data - with or without sleeping - for any amount.
#node -e 'process.stdout.write("x".repeat(65537), () => console.error("done"))' | sh -c 'cat' | wc -c
done
65537
#node -e 'process.stdout.write("x".repeat(0x100000), () => console.error("done"))' | sh -c 'cat' | wc -c
done
1048576
#node -e 'process.stdout.write("x".repeat(0x1000000), () => console.error("done"))' | sh -c 'cat' | wc -c
done
16777216
#
On the other hand, in my example, even if I add a delay to the reader (the parent), I don't get more data.
const cp = require('child_process');
const fs = require('fs');
if (process.argv[2] === 'child') {
  fs.writeSync(process.stdout.fd, 'g'.repeat(0x10000).toString());
} else {
  const child = cp.spawn(process.execPath, [__filename, 'child']);
  child.stdout.on('data', function(d) {
    console.log(d.length);
  });
  setTimeout(function() {
    console.error('slept!');
  }, 10000);
}
#node foo.js
8192
slept!
#
Evidently, the child is exiting prematurely. (Hint: process.stdout is a TTY in your case, and a paired-up socket in mine.)
Am I missing anything?
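One quick way to verify that hint (a small sketch; process.stdout.isTTY is set only when stdout is a terminal):
// Run directly in a terminal: prints 'tty'. Run as the spawned child
// above, where stdout is a socket pair: prints 'pipe/socket'.
console.log(process.stdout.isTTY ? 'tty' : 'pipe/socket');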
@bnoordhuis - I reproduced the same issue on Linux as well, though with a larger data volume (1 MB):
[pid 24092] write(1, "gggggggggggggggggggggggggggggggg"..., 1048576 <unfinished ...>
[pid 24086] <... epoll_wait resumed> {{EPOLLIN, {u32=12, u64=12}}}, 1024, 4294967295) = 1
[pid 24086] read(12, <unfinished ...>
[pid 24092] <... write resumed> ) = 128000
...
[pid 24092] munmap(0x7fb52c504000, 3149824 <unfinished ...>
[pid 24086] <... read resumed> "gggggggggggggggggggggggggggggggg"..., 65536) = 65536
...
[pid 24086] read(12, "gggggggggggggggggggggggggggggggg"..., 65536) = 62464
As you can see, the write from the child happened only once, and it was read by the parent in two iterations. However, note that the single write by the child is still incomplete. The child seems to have attempted the write only once, ignoring how much it actually wrote.
The manual page of write(2) on macOS says:
When using non-blocking I/O on objects, such as sockets, that are subject to flow control, write() and writev() may write fewer bytes than requested; the return value must be noted, and the remainder of the operation should be retried when possible.
and I don't see any evidence of node retrying after a partial write.
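For illustration, a minimal sketch of what retrying would look like in userland, using the byte count that fs.writeSync() returns (the EAGAIN handling is an assumption about how a non-blocking fd surfaces a full pipe; a real fix would belong in node/libuv, not here):
const fs = require('fs');
function writeFully(fd, str) {
  const buf = Buffer.from(str);
  let written = 0;
  while (written < buf.length) {
    try {
      // writeSync returns how many bytes actually made it out
      written += fs.writeSync(fd, buf, written, buf.length - written);
    } catch (err) {
      // a non-blocking fd throws EAGAIN when the pipe buffer is full;
      // busy-waiting like this is wasteful, but it shows the retry idea
      if (err.code === 'EAGAIN') continue;
      throw err;
    }
  }
  return written;
}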
@gireeshpunathil Use process.stdout.write() instead of fs.writeSync(process.stdout.fd) and it should work.
@bnoordhuis, yes - that works fine. But to be honest, that looks to me like a circumvention of the original issue which yarn brought forth: the fact that stdio is now non-blocking is not handled well in sync methods or fs methods which take native fds as input.
This is the route through which the data was written:
#0 uv__fs_write (req=0x7fffe7c1b7c0) at ../deps/uv/src/unix/fs.c:710
#1 0x0000000001576d08 in uv__fs_buf_iter (w=0x7fffe7c1b910) at ../deps/uv/src/unix/fs.c:902
#2 uv__fs_work (w=0x7fffe7c1b910) at ../deps/uv/src/unix/fs.c:974
#3 0x000000000157966c in uv_fs_write (loop=0x21f5ae0, req=0x7fffe7c1b7c0, file=Unhandled dwarf expression opcode 0xf3) at ../deps/uv/src/unix/fs.c:1349
#4 0x00000000012eaba3 in node::WriteString(v8::FunctionCallbackInfo<v8::Value> const&) ()
#5 0x0000000000ad1cc4 in v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) ()
#6 0x0000000000b5423c in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) ()
#7 0x0000000000b55075 in v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) ()
#8 0x000017aabf7843a7 in ?? ()
Though I tested only on Linux and macOS, this looks applicable to all platforms, as there is no platform specificity in the write logic.
Evidently, no second attempt is made as in a typical non-blocking fd case: the fd in question is assumed to be a file device, and the write is performed in one shot. I would expect this write to be performed the way uv__write does it - catering to both blocking and non-blocking descriptors.
Thinking about solutions - one of:
1. In the writeSync methods, if the fd belongs to a standard stream, perform true async writes until completion. Drawback: we cannot associate the async writes with a callback, as there is no callback in the first place. Also, the "Returns the number of bytes written." spec of the writeSync method cannot be met.
2. Inhibit standard streams from being passed to writeSync methods, as it cannot be reliably implemented. Change the docs to that effect.
diff --git a/lib/fs.js b/lib/fs.js
index c05e9e5..5334070 100644
--- a/lib/fs.js
+++ b/lib/fs.js
@@ -651,6 +651,13 @@ fs.write = function(fd, buffer, offset, length, position, callback) {
// OR
// fs.writeSync(fd, string[, position[, encoding]]);
fs.writeSync = function(fd, buffer, offset, length, position) {
+ if(fd === process.stdout.fd ||
+ fd === process.stderr.fd ||
+ fd === process.stdin.fd)
+ throw new Error('fd ' + fd + ' belongs to non-blocking standard stream,' +
+ ' and cannot be used in a sync call');
+
+
if (isUint8Array(buffer)) {
if (position === undefined)
position = null;
Please suggest.
Inhibit standard streams from being passed to writeSync methods, as it cannot be reliably implemented. Change the docs to that effect.
There is no documentation to update because the .fd property is undocumented. I added it in 7bdd05b as a debugging tool.
@bnoordhuis - thanks. Thinking further, I guess the scope of the above patch should expand to all the sync methods which receive arbitrary fds from userland and deal with kilobytes of data. Example: fs.writeSync and fs.readSync; non-example: fs.fchownSync.
I will work on a PR.
@gireeshpunathil What exactly are you proposing? I don't think anything needs to change.
The check mentioned in this comment should be applied to readSync and writeSync - to make sure that non-blocking fds do not propagate into sync calls, as those cannot handle them reliably and completely.
If we don't do that, the API description for fs.writeSync will not match the actual behaviour, and the docs do not explain any limitation to that effect.
It does exactly what it says. If your assumption is that fs.writeSync() always writes the exact number of bytes requested, that's wrong. Partial writes can and do happen, even with file descriptors in blocking mode; that's why the function returns the number of bytes written.
@bnoordhuis - OK, thanks for the clarification; I did not think of it that way. However, I have a doubt: the reason for returning fewer bytes than requested on a blocking write is, I guess, due to extraneous reasons outside of I/O buffer limitations? Otherwise, if both blocking and non-blocking descriptors return the same way on buffer exhaustion, what is the difference between them? I cannot find any documented evidence in OS manuals about returning fewer bytes on blocking writes, other than the fact that they return the number of bytes written.
The scenario of returning fewer bytes in the present case is clearly due to the fact that it is a non-blocking socket; replacing it with a TTY or a blocking socket writes huge amounts of data in one shot and never returns with fewer bytes. I guess this is exactly the reason the stdio streams were made non-blocking?
The API doc looks right in its grammatical form, but one would expect (as in the yarn case) the data to be written in full through the passed fd, albeit blocking the thread.
the reason for returning fewer bytes than requested on a blocking write is, I guess, due to extraneous reasons outside of I/O buffer limitations?
Yes. For example, a write to an almost-full disk.
(Disclaimer: some file systems will reject the request with a ENOSPC error instead of doing a partial write. A robust program should handle both scenarios.)
@bnoordhuis - thanks, understood. I don't have more comments, and am fine with your assertions.
Okay, good. I'll remove the confirmed-bug label and close the issue.
Using fork, spawn, exec in async and sync mode, the stdout stream only returns up to 8192 characters (note that the output is without any newlines). Note that maxBuffer does not allow increasing this maximum, but it is possible to lower it by setting it to a value below 8192.