Closed guest271314 closed 3 months ago
I did a little write-up on this: JavaScript Standard Input/Output: Unspecified.
I'm not sure I understand the part
Each message is serialized using JSON, UTF-8 encoded
will this allow reading binary data? Because node.js, at least, totally messes up trying to read my simple image file:
$ <<<'UDYKMiAyCjI1NQr///////////////8=' base64 -d >white.ppm
$ file white.ppm
white.ppm: Netpbm image data, size = 2 x 2, rawbits, pixmap
$ <white.ppm nodejs -p '
process.stdin.setEncoding("binary");
var input = require("read-all-stdin-sync")();
Buffer.from(input).toString("base64")'
UDYKMiAyCjI1NQrvv73vv73vv73vv73vv73vv73vv73vv73vv73vv73vv73vv70=
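(The replacement-character sequences (EF BF BD) in that output come from a UTF-8 decode: 0xFF is not valid UTF-8, so each pixel byte becomes U+FFFD. Keeping the data in a Buffer and never decoding to a string preserves it byte-for-byte; a minimal Node.js sketch using the same white.ppm bytes:)

```javascript
// The white.ppm bytes: "P6\n2 2\n255\n" header followed by 0xFF pixel data.
const raw = Buffer.from("UDYKMiAyCjI1NQr///////////////8=", "base64");

// Decoding to a UTF-8 string and back mangles every 0xFF into EF BF BD:
const mangled = Buffer.from(raw.toString("utf8"), "utf8");

// A Buffer-to-Buffer copy is byte-exact, no string decode involved:
const copied = Buffer.from(raw);

console.log(mangled.equals(raw)); // false: 0xFF bytes were replaced
console.log(copied.equals(raw));  // true: bytes survived intact
```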
So each time I want to run a JS-coded image algo, I have to first put a base64 encoder in front in the shell pipeline and then a base64 decoder wrapper script as the node.js program. Maybe this would be an opportunity to fix the mess.
Each message is serialized using JSON, UTF-8 encoded
The Native Messaging protocol, which I used to illustrate the differences between STDIO implementations of JavaScript runtimes, first sends the length of the message, then the message itself.
In JavaScript we typically read that length by writing the first 4 bytes of stdin to a Uint32Array, e.g., using the node executable:
import { open } from "node:fs/promises";
// Read from stdin until `length` elements have been collected
async function readFullAsync(length, buffer = new Uint8Array(65536)) {
  const data = [];
  while (data.length < length) {
    // Reopening /dev/stdin continues from the pipe's current position
    const input = await open("/dev/stdin");
    const { bytesRead } = await input.read({
      buffer
    });
    await input.close();
    if (bytesRead === 0) {
      break; // EOF
    }
    data.push(...buffer.subarray(0, bytesRead));
  }
  return new Uint8Array(data);
}
async function getMessage() {
  // Read the 4-byte header into a Uint32Array; header[0] is the message length
  const header = new Uint32Array(1);
  await readFullAsync(1, header);
  // Read exactly header[0] bytes of message content
  const content = await readFullAsync(header[0]);
  return content;
}
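The write side of the same protocol can be sketched similarly: prefix the UTF-8 payload with its byte length as 4 bytes. The function name here is hypothetical, not part of any runtime's API:

```javascript
// Build a length-prefixed Native Messaging message: a 4-byte native-endian
// length header followed by the JSON payload, ready to write to stdout.
function encodeMessage(value) {
  const payload = new TextEncoder().encode(JSON.stringify(value));
  const header = new Uint8Array(new Uint32Array([payload.length]).buffer);
  const out = new Uint8Array(header.length + payload.length);
  out.set(header, 0);
  out.set(payload, header.length);
  return out;
}
```

e.g., in Node.js: process.stdout.write(encodeMessage({ ok: true })).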
So each time I want to run a JS-coded image algo, I have to first put a base64 encoder in front in the shell pipeline and then a base64 decoder wrapper script as the node.js program. Maybe this would be an opportunity to fix the mess.
There are ways to process (any) data without using base64. A Uint8Array representation of the data, which you can get in various ways, can be spread to an Array and serialized to JSON, streamed in chunks of the form [0, 255, ...], then reconstructed into a Uint8Array and written to a real-time stream.
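A minimal sketch of that round trip:

```javascript
// Spread the bytes into a plain Array, serialize to JSON, then rebuild.
const bytes = new Uint8Array([80, 54, 10, 0, 255]);
const json = JSON.stringify([...bytes]); // '[80,54,10,0,255]'
const restored = new Uint8Array(JSON.parse(json));
console.log(restored); // byte-for-byte identical to bytes
```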
What I propose here is for maintainers of JavaScript engines and runtimes, and the JavaScript specification itself via ECMA-262, to spell out how to read and write:

- stdin synchronously
- stdin asynchronously
- stdout synchronously
- stdout asynchronously
- stderr synchronously and asynchronously

Something like
// Proposed API sketch; std is a hypothetical namespace
let buffer = new Uint32Array(1);
// Read the 4-byte length header synchronously
std.read({buffer, sync: true});
let data = new Uint8Array(buffer[0]);
// Read the message content asynchronously
await std.read({buffer: data, async: true});
// Write the 4-byte length header for an outgoing message
let len = new Uint8Array(new Uint32Array([message.length]).buffer);
await std.write({async: true, buffer: len});
The idea is for those classes to be implemented uniformly by JavaScript engines and runtimes, so that the same STDIO code runs the same in every JavaScript engine and runtime.
Right now that's not possible. We have to write different STDIO code for each JavaScript runtime.
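To illustrate, today a portable module has to start by sniffing the runtime before it can even pick a stdin implementation. A sketch, assuming the globals each runtime is known to expose:

```javascript
// Detect the current runtime from its characteristic globals so the
// appropriate STDIO implementation can be selected.
function detectRuntime() {
  if (typeof Deno !== "undefined") return "deno";
  if (typeof Bun !== "undefined") return "bun";
  if (typeof process !== "undefined" && process.versions?.node) return "node";
  return "unknown";
}
```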
Typically the above can take the form of a Uint8Array. However, that's not the only way to process data. An ArrayBuffer could be used. WHATWG Streams could be used. All of the above can be used. The details can be sorted out. Options can be thought about and included in the deliverable. We are only talking about STDIO.
That might be an omission that is not easily observable for people who are just running node, or just running deno. I experiment with and test multiple JavaScript runtimes and engines, including but not limited to node, deno, bun, qjs, tjs, V8's d8, and SpiderMonkey's js, among others.
Circa 2023 there are more JavaScript engines and runtimes that do not target the browser than JavaScript runtimes that do target the browser.
However, there is no compatibility, no uniformity.
If you hack on multiple JavaScript engines and runtimes, that omission glares. At least it does to me when writing code intended to be used commonly among JavaScript engines and runtimes.
So with the Uint8Array read example and my file above, the second half of items in the array should turn out as 255 then?
No. That's just an example showing the lowest and highest integers that a Uint8Array will have as an element. See TypedArrays.
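Concretely, Uint8Array stores unsigned bytes, so out-of-range values wrap modulo 256:

```javascript
// Uint8Array elements are unsigned bytes; values wrap modulo 256.
const a = new Uint8Array([0, 255, 256, -1]);
console.log([...a]); // [ 0, 255, 0, 255 ]
```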
There's no action to take here.
Take a look at these Native Messaging hosts written in JavaScript; Node.js, Deno, Bun, QuickJS, txiki.js. They implement reading stdin and writing stdout differently.
The last time I checked, d8 (V8) and the js shell (SpiderMonkey) provide no means to read stdin and write stdout using TypedArrays (buffers). Node.js does not write more than 65536 bytes to stdout without process.stdout._handle.setBlocking(true), at least not during my testing. Deno, Node.js, Bun, and txiki.js all require multiple reads to read 1 MB from stdin after reading the first 4 bytes; QuickJS reads the full 1 MB in one read.

A common stdin/stdout/stderr module that can be imported (CommonJS, ECMAScript Modules, whatever) and is capable of assuming responsibility for writing a string or buffer at author/application discretion would be very helpful as a common specification that JavaScript implementations can implement.