jimhigson / oboe.js

A streaming approach to JSON. Oboe.js speeds up web applications by providing parsed objects before the response completes.
http://oboejs.com

Possible to stream chunks of JSON data? #222

Open timendez opened 3 years ago

timendez commented 3 years ago

Hi, I was consuming a 25 GB JSON file (an array of objects) and profiling my code to try to improve performance:

// Measure the time between the first and second parsed objects
let second = false;
oboe(myDataStream)
    .node('!.*', myObject => {
      if (second) {
        console.timeEnd('stream'); // stop the timer on the second object
        process.exit();
      }
      // business logic in here...
      second = true;
      console.time('stream'); // start the timer after the first object
      return oboe.drop;
    })

and it takes about 1ms on average per object, which is the bottleneck of my code.

Is there a way to have .node send me a chunk of, say, 500 objects at a time? Or would that not improve performance anyway, since technically oboe has already done the bulk of the I/O?

Thanks!
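One way to batch the per-object work is to buffer parsed objects and hand them off N at a time. A minimal sketch of the idea (makeBatcher and processBatch are hypothetical names, not oboe API):

```javascript
// Buffer items and invoke processBatch once `size` items have accumulated.
function makeBatcher(size, processBatch) {
  let buffer = [];
  return {
    push(obj) {
      buffer.push(obj);
      if (buffer.length >= size) {
        processBatch(buffer);
        buffer = [];
      }
    },
    // Flush whatever is left (the final, partially filled batch).
    flush() {
      if (buffer.length > 0) {
        processBatch(buffer);
        buffer = [];
      }
    },
  };
}

// Inside an oboe pipeline this might look like:
//   const batcher = makeBatcher(500, batch => { /* business logic */ });
//   oboe(myDataStream)
//     .node('!.*', obj => { batcher.push(obj); return oboe.drop; })
//     .done(() => batcher.flush());
```

Note this only amortizes the cost of your own callback logic; oboe still parses and emits each node individually.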

paulsmithkc commented 3 years ago

I would be interested in any chunking methods as well.

PS: Although they should be identical, "![*]" and "!.*" seem to process nodes at different speeds.

paulsmithkc commented 3 years ago

Managed to implement chunking, and it did make things much faster.

let chunk = [];
const showChunk = () => {
  for (const item of chunk) {
    // show the item
  }
  chunk = []; // reset the buffer once the items are shown
};

oboe('/api/stream')
  .node('![*]', item => {
    if (item) {
      chunk.push(item);
      if (chunk.length >= 1000) {
        showChunk(); // flush a full batch of 1000 items
      }
    }
    return oboe.drop; // release the parsed node so memory stays flat
  })
  .done(_ => {
    // show the last, partially filled chunk
    showChunk();
  })
  .fail(res => {
    // show the error
  });