trygve-lie / through-batch

Transform stream that buffers up objects in the stream and pushes the buffers as batches further down the stream.

Example writing to the same batch file #116

Open marcgreenstock opened 7 years ago

marcgreenstock commented 7 years ago

Hi, I would like to try this library, but I am failing to understand how to propagate multiple write streams.

The example:

const fs = require('fs');
const JSONStream = require('JSONStream');
const batch = require('through-batch');

fs.createReadStream('objects.json')
  .pipe(JSONStream.parse('*'))
  .pipe(batch(4))
  .pipe(JSONStream.stringify())
  .pipe(fs.createWriteStream('batchedObjects.json'));

will write multiple arrays to the same file, batchedObjects.json, so the result is several top-level arrays concatenated together and therefore not valid JSON.

Would it be possible to create an example with multiple write streams? I don't see anything in batch.js that retains a record of the number of batches.

trygve-lie commented 7 years ago

Looking at it, the main example is not the best. I think I'll write a better one.

This module will not handle multiple streams. The main purpose of this module is to take a stream where each event in the stream is an object, and buffer these up into batches of objects.

Example: CouchDB has a feature where one can write multiple documents (JSON) in batches. It's much faster to POST, say, 100 documents in one batch to CouchDB than to post 100 documents one by one. If you have a stream of documents you want to post to CouchDB, you can use this module to buffer up documents and send them to CouchDB in batches. The code will look something like this (not a fully working example, but that's the outline):

const fs = require('fs');
const stream = require('stream');
const JSONStream = require('JSONStream');
const batch = require('through-batch');
const nano = require('nano')('http://localhost:5984');

const db = nano.db.use('myDb');

fs.createReadStream('objects.json')
  .pipe(JSONStream.parse('*'))
  .pipe(batch(100))
  .pipe(new stream.Writable({
    objectMode: true, // each chunk is an array of documents
    write(docs, encoding, cb) {
      db.bulk({ docs }, null, cb); // nano's bulk API takes { docs: [...] }
    }
  }));

That's the outline of it.

What is it you are trying to solve?