dominictarr / JSONStream

rawStream.pipe(JSONStream.parse()).pipe(streamOfObjects)

Bad performance on JSONStream.stringify #131

Closed (nejcokorn closed this issue 6 years ago)

nejcokorn commented 7 years ago

I have tried using JSONStream.stringify() and noticed that the performance is actually bad. It takes about 100x as long as doing the same thing without streaming.

Please see the attached code below using JSONStream.

const fileSystem = require( "fs" );
const JSONStream = require( "JSONStream" );
const zlib = require('zlib');
const gzip = zlib.createGzip();

// Set timer
console.time("Timer");

var transformStream = JSONStream.stringify('[\n', ',\n', '\n]\n');
var outputStream = fileSystem.createWriteStream( __dirname + "/JSONStream.json" );

transformStream.pipe( outputStream );

for(var i=0; i<(250 * 18); i++){
    transformStream.write({
        child: Math.random().toString(36).substring(7),
        parent: Math.random().toString(36).substring(7),
        propertyName: Math.random().toString(36).substring(7),
        provertyValue: Math.random().toString(36).substring(7)
    });
}
transformStream.end();

outputStream.on(
    "finish",
    function handleFinish() {
        console.timeEnd("Timer");
        // Timer: 115267.364ms
    }
);

Using memory (no streaming):

const fs = require( "fs" );

// Set timer
console.time("Timer");

var outputStream = fs.createWriteStream( __dirname + "/nostream.json" );

var members = [];
for(var i=0; i<(25000 * 18); i++){
    members.push({
        child: Math.random().toString(36).substring(7),
        parent: Math.random().toString(36).substring(7),
        propertyName: Math.random().toString(36).substring(7),
        provertyValue: Math.random().toString(36).substring(7)
    });
}

outputStream.write(JSON.stringify(members));
outputStream.end();

outputStream.on(
    "finish",
    function handleFinish() {
        console.timeEnd("Timer");
        // Timer: 1267.696ms
    }
);

Is this expected performance?

dominictarr commented 7 years ago

Vary the parameters and see what affects it. Are lots of small stringifies, concatenated, faster than one big one? It wouldn't surprise me if one big one was faster, but it also wouldn't surprise me if you got a non-linear slowdown once the input got too big for JSON.stringify().
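One way to run the comparison suggested above is a small harness that times a single large JSON.stringify call against many per-object calls joined together, for several input sizes. This is only a sketch: makeRecords, stringifyWhole, stringifyEach and timeMs are hypothetical helper names, the sizes are arbitrary, and the timings will differ between machines and Node versions. Note that the two snippets as posted use different loop counts (250 * 18 and 25000 * 18), so it is worth keeping n identical across the variants being compared.

```javascript
// Hypothetical helper: builds n records shaped like the ones in the report.
function makeRecords(n) {
  const records = [];
  for (let i = 0; i < n; i++) {
    records.push({
      child: Math.random().toString(36).substring(7),
      parent: Math.random().toString(36).substring(7),
      propertyName: Math.random().toString(36).substring(7),
      provertyValue: Math.random().toString(36).substring(7), // key spelled as in the report
    });
  }
  return records;
}

// Variant A: one big JSON.stringify call over the whole array.
function stringifyWhole(records) {
  return JSON.stringify(records);
}

// Variant B: one JSON.stringify call per record, concatenated by hand.
function stringifyEach(records) {
  return "[" + records.map((r) => JSON.stringify(r)).join(",") + "]";
}

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
function timeMs(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { ms, result };
}

// Vary the input size and print both timings side by side.
for (const n of [1000, 10000, 100000]) {
  const records = makeRecords(n);
  const whole = timeMs(() => stringifyWhole(records));
  const each = timeMs(() => stringifyEach(records));
  console.log(
    `n=${n} whole=${whole.ms.toFixed(1)}ms each=${each.ms.toFixed(1)}ms`
  );
}
```

A single run per size is noisy, so repeating each measurement a few times and comparing medians would give a more trustworthy picture.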

Streaming isn't actually about being faster; it's about using fewer resources at any one time. For example, JSONStream.parse is much slower than JSON.parse, but you can parse objects with it that would cause out-of-memory errors with JSON.parse.

Possibly you could do something clever here that uses JSON.stringify in batches and gets the best of both worlds (I would be very happy to merge a PR for this), but we don't even have any performance tests currently.
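The batching idea above could look roughly like the following. This is a minimal sketch built on Node's core stream.Transform, not JSONStream's actual implementation or API: createBatchedStringify, encodeBatch and the batchSize default are all hypothetical. It buffers incoming objects, serializes each full batch with one JSON.stringify call, and emits the pieces so that the concatenated output is a single JSON array using the same '[\n', ',\n', '\n]\n' separators as the report.

```javascript
const { Transform } = require("stream");

// Hypothetical helper: serializes one batch with a single JSON.stringify
// call. JSON.stringify(batch) yields "[a,b,c]", so the outer brackets are
// stripped and the array opener or a separator is prepended instead.
function encodeBatch(batch, isFirst) {
  const body = JSON.stringify(batch).slice(1, -1);
  return (isFirst ? "[\n" : ",\n") + body;
}

// Hypothetical factory: returns a Transform that accepts objects and emits
// text chunks that together form one JSON array.
function createBatchedStringify(batchSize = 1000) {
  let batch = [];
  let first = true;

  // Pushes the buffered objects downstream as one chunk, if there are any.
  function drain(stream) {
    if (batch.length === 0) return;
    stream.push(encodeBatch(batch, first));
    first = false;
    batch = [];
  }

  return new Transform({
    writableObjectMode: true, // objects in, strings out
    transform(obj, _encoding, callback) {
      batch.push(obj);
      if (batch.length >= batchSize) drain(this);
      callback();
    },
    flush(callback) {
      drain(this); // emit the final partial batch
      this.push(first ? "[]\n" : "\n]\n"); // close (or emit an empty) array
      callback();
    },
  });
}
```

It would be used in place of the JSONStream.stringify(...) call in the report, for example transformStream = createBatchedStringify(1000), then piped into the file write stream as before. At most batchSize objects are held in memory at a time, which is the trade-off between the per-object and the all-at-once approach; whether it is actually faster would need to be measured.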

dhruvdutt commented 6 years ago

I think this has been fixed. For me, the streaming version takes about 40ms and the non-streaming version takes more than 1000ms.