uhop / stream-json

The micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory streaming individual primitives using a SAX-inspired API.

Stream an object to reconstruct it on the frontend #138

Closed davidfou closed 4 months ago

davidfou commented 1 year ago

Hello :wave:

First of all, thanks for the hard work done on this library. Understanding how to use it properly can take time, though it is worth the investment.

With @justinberiot, we wanted to use it for a particular use case. In our context, the JSON is not large (the complete data fits in memory without any issue). The difficulty is rather that the data can take a long time to arrive in full, and that it can be a deeply nested structure.

Our objective is to have a frontend that builds up incrementally as soon as some information is available. We managed to do so thanks to your library. Our code is available in the repo davidfou/stream-json-object.

Code snippet

```js
// Available at https://github.com/davidfou/stream-json-object/blob/main/gif/index.mjs
import Stream from "node:stream";
import timers from "node:timers/promises";
import util from "node:util";
import StreamJSON from "stream-json";
import fetch from "node-fetch";
import _ from "lodash";

// After a value is emitted inside an array, move on to the next index.
const updatePath = (path) => {
  const lastPathValue = path[path.length - 1];
  if (typeof lastPathValue === "number") {
    return [...path.slice(0, -1), lastPathValue + 1];
  }
  return path;
};

await Stream.promises.pipeline(
  async function* () {
    const response = await fetch(
      "https://raw.githubusercontent.com/davidfou/stream-json-object/main/demo/solar_system.json"
    );
    if (!response.ok) {
      throw new Error("Oups");
    }
    yield* response.body;
  },
  StreamJSON.parser({
    streamKeys: false,
    streamValues: false,
  }),
  // Simulate some latency
  async function* (source) {
    for await (const chunk of source) {
      await timers.setTimeout(50);
      yield chunk;
    }
  },
  // Turn parser tokens into { key: path, value } events
  async function* (source) {
    let path = [];
    for await (const chunk of source) {
      const lastPathValue = path[path.length - 1];
      switch (chunk.name) {
        case "startArray":
          path = [...path, 0];
          break;
        case "endArray":
          path = path.slice(0, -1);
          if (lastPathValue === 0) {
            yield { key: path, value: [] };
          }
          path = updatePath(path);
          break;
        case "startObject":
          path = [...path, null];
          break;
        case "endObject":
          path = path.slice(0, -1);
          if (lastPathValue === null) {
            yield { key: path, value: {} };
          }
          path = updatePath(path);
          break;
        case "keyValue":
          path = [...path.slice(0, -1), chunk.value];
          break;
        default:
          yield {
            key: path,
            value:
              chunk.name === "numberValue"
                ? parseFloat(chunk.value)
                : chunk.value,
          };
          path = updatePath(path);
      }
    }
  },
  // events sent by a backend and reconstruction of the object on the frontend
  async function* (source) {
    let out = null;
    let i = 0;
    for await (const chunk of source) {
      if (out === null) {
        out = typeof chunk.key[0] === "number" ? [] : {};
      }
      out = _.set(out, chunk.key, chunk.value);
      i += 1;
      console.clear();
      console.log(
        util.inspect(out, { depth: null, colors: true, breakLength: 171 })
      );
    }
    console.log("object recreated in %i steps", i);
  }
);
```
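The last stage of the snippet relies on lodash's `_.set` to apply each `{ key, value }` event. For a frontend that would rather not depend on lodash, the same step could be sketched roughly as below. `setAtPath` and the sample `events` are hypothetical names and data made up for illustration; the only assumption is that events have the shape produced by the path-tracking stage (string segments for object keys, numeric segments for array indices).

```javascript
// Hypothetical helper (not part of stream-json or lodash): write `value` at
// `path` inside `root`, creating intermediate containers as needed.
function setAtPath(root, path, value) {
  if (path.length === 0) return value; // the event describes the root itself
  // Pick the root container from the first path segment, as the snippet does.
  const out = root ?? (typeof path[0] === "number" ? [] : {});
  let node = out;
  for (let i = 0; i < path.length - 1; i++) {
    const segment = path[i];
    if (node[segment] === undefined) {
      // A numeric next segment means the missing container is an array.
      node[segment] = typeof path[i + 1] === "number" ? [] : {};
    }
    node = node[segment];
  }
  node[path[path.length - 1]] = value;
  return out;
}

// Sample events, shaped like the ones yielded by the path-tracking stage.
const events = [
  { key: ["name"], value: "Sun" },
  { key: ["planets", 0, "name"], value: "Mercury" },
  { key: ["planets", 0, "moons"], value: [] },
  { key: ["planets", 1, "name"], value: "Venus" },
];

let state = null;
for (const { key, value } of events) {
  // A real frontend would re-render after each event.
  state = setAtPath(state, key, value);
}
console.log(JSON.stringify(state));
```

The helper mutates the object in place, which keeps it short; a UI framework that relies on immutable updates would need a copying variant instead.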

GIF

Here is what we wonder about:

sneko commented 4 months ago

In my case I have a response body that cannot fit into a Node.js string, so I wanted to feed the stream from response.body.getReader() into a "stream-to-json" library. stream-json seems to be one of the last ones still maintained, so I chose it.

But it's not obvious at all how to make it parse the whole stream and build the complete object, without using a filter or similar. Because my object is deeply nested (it can be 100+ levels), I cannot rely on the array or object helpers presented in the README.md.

Did I miss a way to do "passthrough parsing" with stream-json, like https://www.npmjs.com/package/bfj could do (even though it has been archived for a few months)?

It feels weird to me to reimplement https://github.com/davidfou/stream-json-object/blob/main/src/streamObjectTransformer.mjs just to rebuild the object. Am I really the only one doing this?

cc @davidfou @uhop
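
For reference, rebuilding a whole value from the token stream does not need recursion, so nesting depth should not be a problem in itself. Below is a minimal sketch that folds tokens into one value with an explicit stack. It assumes tokens shaped like the ones the parser emits with `streamKeys: false` and `streamValues: false` (as in the snippet earlier in this thread); `assembleTokens` and the sample `tokens` are hypothetical, not part of the library. stream-json also ships an `Assembler` utility (`stream-json/Assembler`) that appears to be meant for this purpose, so its documentation is probably worth checking first.

```javascript
// Hypothetical helper: fold stream-json-style tokens into one JavaScript value.
function assembleTokens(tokens) {
  const stack = []; // open containers, innermost last
  let pendingKey = null; // key waiting for its value in the current object
  let root;

  // Attach a finished value (or a freshly opened container) to its parent.
  const place = (value) => {
    if (stack.length === 0) {
      root = value;
      return;
    }
    const parent = stack[stack.length - 1];
    if (Array.isArray(parent)) {
      parent.push(value);
    } else {
      parent[pendingKey] = value;
      pendingKey = null;
    }
  };

  for (const token of tokens) {
    switch (token.name) {
      case "startObject": { const obj = {}; place(obj); stack.push(obj); break; }
      case "startArray": { const arr = []; place(arr); stack.push(arr); break; }
      case "endObject":
      case "endArray": stack.pop(); break;
      case "keyValue": pendingKey = token.value; break;
      case "stringValue": place(token.value); break;
      // Numbers arrive as strings, hence the conversion.
      case "numberValue": place(parseFloat(token.value)); break;
      case "nullValue": place(null); break;
      case "trueValue": place(true); break;
      case "falseValue": place(false); break;
      default: break; // ignore any other token types
    }
  }
  return root;
}

// Sample tokens for {"a":[1,{"b":null}],"ok":true}
const tokens = [
  { name: "startObject" },
  { name: "keyValue", value: "a" },
  { name: "startArray" },
  { name: "numberValue", value: "1" },
  { name: "startObject" },
  { name: "keyValue", value: "b" },
  { name: "nullValue", value: null },
  { name: "endObject" },
  { name: "endArray" },
  { name: "keyValue", value: "ok" },
  { name: "trueValue", value: true },
  { name: "endObject" },
];

const result = assembleTokens(tokens);
console.log(JSON.stringify(result));
```

In a real pipeline the `for` loop would become a `for await` over the parser's output. The assembled object still has to fit in memory, even when the raw response text does not fit in a single string.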