dominictarr / JSONStream

rawStream.pipe(JSONStream.parse()).pipe(streamOfObjects)

JSONStream.parse() not working on large files #151

Closed martalopes closed 6 years ago

martalopes commented 6 years ago

I am using this to read a JSON file and print it to the command line. With a small file it works, but with a really big file nothing is printed and no error is reported. Is it because I have to wait for the parse to finish? How do I make this work?

var fs = require('fs'),
    JSONStream = require('JSONStream');

var stream = fs.createReadStream('test.json', {encoding: 'utf8'}),
    parser = JSONStream.parse();

stream.pipe(parser);

parser.on('data', function(data) {
  console.log('received:', data);
});
doowb commented 6 years ago

@martalopes What does the data in the JSON file look like?

You'll need to pass a pattern into .parse() that will pull out individual items from your json file. The examples in the README basically have an array called rows that has objects with a doc property:

{
  "rows": [
    {"id": 1, "doc": { ... }},
    {"id": 2, "doc": { ... }},
    {"id": 3, "doc": { ... }},
    {"id": 4, "doc": { ... }}
  ]
}

The pattern used is JSONStream.parse('rows.*.doc'), which says, "give me the property doc from each row". Each matching doc value is what data contains in the parser.on('data', ...) callback.
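
To make the pattern concrete, here is a minimal plain-JavaScript sketch of which values 'rows.*.doc' selects. It works on an object that is already fully parsed, so it illustrates only the selection, not the streaming. The selectDocs helper and the sample data are hypothetical and are not part of JSONStream.

```javascript
// Hypothetical helper: roughly what the 'rows.*.doc' pattern selects,
// applied to an object that is already in memory (no streaming involved).
function selectDocs(parsed) {
  return parsed.rows.map(function (row) {
    return row.doc;
  });
}

// Made-up sample data in the same shape as the README example.
var sample = {
  rows: [
    { id: 1, doc: { name: 'a' } },
    { id: 2, doc: { name: 'b' } }
  ]
};

// Each selected doc corresponds to one 'data' event in the streaming version.
selectDocs(sample).forEach(function (doc) {
  console.log('received:', doc);
});
```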

Since you're not providing a pattern, JSONStream parses the entire file and only emits it, as a single object, once parsing has finished. This is similar to doing var data = require('./test.json');. If the file is really big, this will take a long time and you may run out of memory.

I hope this helps.