Closed ruyadorno closed 2 years ago
I'm really not sure how feasible it is and what an useful API would look like but I just want to make sure we have a place to follow up with the discussion, feel free to contribute more userland alternatives, desired outcomes and correct me if anything sounds wrong in the original description.
For reference, I was creating a script that inflates the output of a git object on the command line. The idea was I could pipe the output of cat .git/objects/<hash>
to node -e 'zlib.inflateSync(process.stdin.read()).pipe(process.stdout)'
for a quick inflate script as I test some code i'm writing.
The solution for my original problem was utilizing zlib.createInflate()
node -e 'process.stdin.pipe(zlib.createInflate()).pipe(process.stdout)'
This isn't a pressing feature for me either, but it is an interesting use case and benefit to the userland. So I'm happy to champion if we think it is a useful addition
Could Buffer.fromStream(stream, (error, buffer) => {})
be a reasonable API? This could be promisified to support const buffer = await fromStream(stream);
.
Additional possibly related prior art: https://github.com/maxogden/concat-stream/ (lacks an isTTY
check).
Isn't the pipe solution both more succinct, and will not load the entire git object into memory (exploding if its larger than available memory)?
Note that reading a stream of unknown length (stdin, for example) into a memory buffer involves double copying (at least) all the data, as well as potentially unconstrained memory allocation (streams don't have to have an end, cat /dev/zero | node -e '....'
).
Why would concat-stream need a isTTY check?
I agree the pipe solution is probably best for what @Ethan-Arrowood was doing and reading an entire stream into memory is inefficient / has risks. I have run into situations where "just get me the complete data" has been very useful (admittedly always in tests).
I don't think concat-stream needs a isTTY check, I was just commenting for the purpose of prior art you would have to guard against process.stdin.isTTY
to use it as a way to get the entire stdin buffer.
After reading @sam-github's comment and my own realization that I've never used this sort of functionality in production I have to ask is this something that even belongs in the node.js core?
After reading @sam-github's comment and my own realization that I've never used this sort of functionality in production I have to ask is this something that even belongs in the node.js core?
That is a great point @coreyfarrell I doubt I'd use the script for anything other than debugging/testing in my terminal.
now that TLA landed on master, combined with the async/await stream support we have a more reasonable interface for consuming process.stdin
in such eval oneliners:
-cat README.md | node -e "inp=''; process.stdin.on('data',c=>inp+=c).on('end',()=>process.stdout.write(inp.replace(/[a-z]/g, 'x')))"
+cat README.md | node -e "for await (const c of process.stdin) process.stdout.write(c.toString().replace(/[a-z]/g, 'x'))"
Note: Requires --experimental-top-level-await
and --input-type=module
flags
I'm pretty sure I heard someone in one of the meetings in the past mention it before but I couldn't find any issue so I'm filling this one so that we can track/discuss it.
oh, it sounds like that person may be @bengl (ref: https://twitter.com/bengl/status/1402014875870765057)
@ruyadorno
It looks like the future looks something like this:
cat README.md | node -e "process.stdout.write(Buffer.concat(await process.stdin.toArray()).toString().replace(/[a-z]/g, 'x'))"
It would be nice to get rid of the Buffer.concat
with some version of flat()
that works interoperably between Arrays and TypedArrays.
Alternatively, some kind of async .concat()
or .consume()
on buffer streams would be pretty cool. On non-buffer streams it could give an array of all the objects.
If there's a worry that the promise would never resolve if the stream doesn't end: Consider that we already have this effect synchronously with [...iteratorThatNeverEnds]
.
oh thanks @bengl ! following your PR I learned about the Stream consumers API, that might be what I was looking for when I first opened this issue originally:
const { text } = require('stream/consumers')
await text(process.stdin)
I'm pretty sure I heard someone in one of the meetings in the past mention it before but I couldn't find any issue so I'm filling this one so that we can track/discuss it.
Description
A different interface for reading from standard input, making it simpler to create scripts that just need to consume a blob of data from stdin at once.
Desired behaviors
Userland alternatives
/cc @Ethan-Arrowood