moleculerjs / moleculer-web

:earth_africa: Official API Gateway service for Moleculer framework
http://moleculer.services/docs/moleculer-web.html
MIT License

Support for streaming large JSON object #248

Closed rishighan closed 3 years ago

rishighan commented 3 years ago

I have a large JSON payload, potentially tens of thousands of objects, coming from an API. I have a Moleculer-based microservice that reads from a source directory and writes metadata to a JSON object.

The relevant action in my ComicBook service:

//...

getComicCovers: {
    rest: "POST /getComicCovers",
    params: {
        extractionOptions: "object",
        walkedFolders: "array",

    },
    async handler(
        ctx: Context < {
            extractionOptions: IExtractionOptions;
            walkedFolders: IFolderData[];
        } >
    ) {
        return await getCovers(
            ctx.params.extractionOptions,
            ctx.params.walkedFolders
        );

    },
},

//...

which returns JSON structured like so:

[
// ... tens of thousands of these
   {
      "fileSize":642292,
      "name":"before-watchmen-comedian-02-01.jpg",
      "path":"./userdata/expanded/Comedian 002"
   }
//...
]

On my client-side I have this in the Express-based facade:

router.route("/getComicCovers").post(async (req: Request, res: Response) => {
    // Fall back to an empty object when extractionOptions is not an object.
    const extractionOptions =
        typeof req.body.extractionOptions === "object" ?
        req.body.extractionOptions :
        {};
    const comicBookCoversData = await axios({
        url: "http://localhost:3000/api/import/getComicCovers", // <--- moleculer microservice endpoint
        method: "POST",
        data: {
            extractionOptions,
            walkedFolders: req.body.walkedFolders,
        },
    });
    const stream = new Readable({
        objectMode: true,
        highWaterMark: 1,
        read() {},
    });

    const ndjsonStream = through2({
            objectMode: true,
            highWaterMark: 1
        },
        (data, enc, cb) => {
            cb(null, JSON.stringify(data) + "\n");
        },
    );

    stream.pipe(ndjsonStream).pipe(res);
    stream.push({
        data: comicBookCoversData.data
    });
    stream.push(null);
});

In summary, on the Express side, the response is modified in a duplex transform stream, which converts it into ndjson and pipes that into the response stream.

My problem is that the response is not streamed; it is buffered, and then I get the entire array of objects at once. I expected the JSON to be streamed a few objects at a time, as they become available.

I am not sure what I am missing on either the Moleculer side or the Express side.

icebob commented 3 years ago

Moleculer services send back whatever the action returns. If it's a big JavaScript object, it will be serialized and transferred as a whole. If you want to respond with a stream, you should create the stream instance in the action and return it.
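
For illustration, here is a minimal sketch of that pattern using only Node's built-in `stream` module. The `CoverEntry` shape is taken from the sample JSON earlier in the thread; `toNdjsonLines` and `createCoversStream` are hypothetical helper names, not part of Moleculer. It assumes the array already exists in memory, so it only streams the transfer; the entries themselves are still produced all at once.

```typescript
import { Readable } from "stream";

// Shape of one entry, copied from the sample JSON in the question.
interface CoverEntry {
  fileSize: number;
  name: string;
  path: string;
}

// Hypothetical helper: one NDJSON line (JSON text plus "\n") per entry.
function toNdjsonLines(entries: CoverEntry[]): string[] {
  return entries.map((entry) => JSON.stringify(entry) + "\n");
}

// Hypothetical helper: wraps the lines in a native Readable.
// objectMode is switched off so consumers receive plain byte chunks.
function createCoversStream(entries: CoverEntry[]): Readable {
  return Readable.from(toNdjsonLines(entries), { objectMode: false });
}
```

Under these assumptions, the action handler would return `createCoversStream(...)` with the result of `getCovers(...)` instead of returning the array itself.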

rishighan commented 3 years ago

So you suggest moving my Express transform stream code to the action in Moleculer?

rishighan commented 3 years ago

I am using a highland stream like so:

getComicCovers: {
    rest: "POST /getComicCovers",
    params: {
        extractionOptions: "object",
        walkedFolders: "array",

    },
    async handler(
        ctx: Context < {
            extractionOptions: IExtractionOptions;
            walkedFolders: IFolderData[];
        } >
    ) {
        const comicBookCoversData = await getCovers(
            ctx.params.extractionOptions,
            ctx.params.walkedFolders
        );
        return H(comicBookCoversData) // highland stream
            .through(stringify);
    },
},

Is it sufficient to return it this way in order to stream the JSON objects contained in it?

More generally, what is the pattern to stream a JSON response from an action? Are there any examples in the moleculer world?

icebob commented 3 years ago

I don't know Highland, but if it's a native Node.js stream, it will work.

rishighan commented 3 years ago

Even if I were to use a node stream, how would I pipe it into the response?

icebob commented 3 years ago

Just return the stream. E.g.

getFile(ctx) {
  return fs.createReadStream("./my-file.json");
}
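
The same idea applies to the Express facade from the question: awaiting the axios call buffers the whole upstream body before anything is written to the client. Below is a hedged sketch that uses Node's built-in `http` module in place of axios and pipes the upstream response through as it arrives. `pipeUpstream` is a hypothetical helper name, and the `application/x-ndjson` content type is an assumption about what the action emits.

```typescript
import * as http from "http";

// Hypothetical helper: POSTs `payload` as JSON to `upstreamUrl` and pipes the
// upstream response into `res` chunk by chunk instead of awaiting it.
function pipeUpstream(
  upstreamUrl: string,
  payload: unknown,
  res: http.ServerResponse
): http.ClientRequest {
  const body = JSON.stringify(payload);
  const upstream = http.request(
    upstreamUrl,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(body),
      },
    },
    (upstreamRes) => {
      // Assumed content type; adjust to whatever the action really returns.
      res.writeHead(upstreamRes.statusCode ?? 502, {
        "Content-Type": "application/x-ndjson",
      });
      upstreamRes.pipe(res);
    }
  );
  // If the upstream call fails, end the client response instead of hanging.
  upstream.on("error", () => {
    if (!res.headersSent) res.writeHead(502);
    res.end();
  });
  upstream.end(body);
  return upstream;
}
```

In the Express route this could be called as `pipeUpstream("http://localhost:3000/api/import/getComicCovers", req.body, res)`, since Express's `res` extends `http.ServerResponse`.
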

rishighan commented 3 years ago

Gotcha! Thanks, @icebob. Closing this issue out!