hildjj / node-cbor

Encode and decode CBOR documents, with easy mode, streaming mode, and SAX-style evented mode.
MIT License

Question: How does one use stream based approach? #154

Closed: DaveStein closed this issue 3 years ago

DaveStein commented 3 years ago

I have a very large dataset and encodeAsync is now failing for me. I see the docs say "As with the other static encode functions, this will still use a large amount of memory. Use a stream-based approach directly if you need to process large and complicated inputs."

I am not sure what that entails. Are there any samples I can look at? All the samples seem to use .encode or .encodeOne directly. Couldn't find any async examples.

Thanks for any help!

hildjj commented 3 years ago

Something like:

const { Encoder } = require('cbor');

const e = new Encoder();
e.pipe(process.stdout);
e.write({
  someObject: "Encoder is in object mode on the input side"
});
e.end(); // finish the stream when you are done writing
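A minimal sketch, assuming the cbor package from npm: the Encoder is a Transform stream, so instead of piping you can also listen for 'data' events, and the encoded bytes arrive as Buffer chunks as soon as they are produced rather than all at once.

const { Encoder } = require('cbor');

const e = new Encoder();
e.on('data', chunk => {
  // each chunk is a Buffer of CBOR bytes, emitted as soon as it is encoded
  console.log(chunk.toString('hex'));
});
e.write({ someObject: "hello" }); // placeholder object
e.end();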
DaveStein commented 3 years ago

Thanks for the quick reply! So basically I just write my data object and use pipe to send the output stream to another location? How is that better for performance? I feel like I'd just pipe to a buffer, which is essentially what encodeAsync was doing.

DaveStein commented 3 years ago

PS: I am fairly clueless here, so I am genuinely curious about the performance difference.

hildjj commented 3 years ago

The goal is to avoid having it all in memory at the same time. If the destination stream is pulling data off as quickly as it can, you shouldn't hit the high-water mark and have writes block. That reminds me: if you're writing big chunks of data, you probably want this as well:

const e = new Encoder({
  highWaterMark: 65536 // some number that is large enough to hold your largest chunk, plus a little for the cbor wrapper.
})
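For very large inputs, a sketch along these lines (hypothetical names like records and destination, assuming the cbor package) keeps memory bounded by respecting backpressure: pause whenever write() returns false and resume on 'drain'.

const { Encoder } = require('cbor');

async function encodeAll (records, destination) {
  const e = new Encoder({ highWaterMark: 65536 });
  e.pipe(destination);

  for (const record of records) {
    // write() returns false once the internal buffer is full;
    // wait for 'drain' before writing more so memory stays bounded
    if (!e.write(record)) {
      await new Promise(resolve => e.once('drain', resolve));
    }
  }
  e.end(); // signal that no more input is coming
}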
DaveStein commented 3 years ago

Ah okay, that makes sense actually. Right now I'm writing to a file at the end, but you're saying to write to the file as I go, so I only ever have one chunk in memory at a time. That makes sense to me.

hildjj commented 3 years ago

Yes.

const fs = require('fs')

const output = fs.createWriteStream('myFile')
encoder.pipe(output)
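Putting the pieces together, a minimal end-to-end sketch (the file name and sample objects are placeholders) that waits for the file stream to finish so you know all encoded bytes have been flushed to disk:

const fs = require('fs')
const { once } = require('events')
const { Encoder } = require('cbor')

async function main () {
  const encoder = new Encoder({ highWaterMark: 65536 })
  const output = fs.createWriteStream('myFile')
  encoder.pipe(output)

  encoder.write({ id: 1, payload: 'first record' })  // placeholder data
  encoder.write({ id: 2, payload: 'second record' })
  encoder.end()

  await once(output, 'finish') // everything has been written to disk
}

main()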
DaveStein commented 3 years ago

Will be trying when I get into work tomorrow :) I’ll close out when I verify.

DaveStein commented 3 years ago

I have to close because I'm in a weird environment that doesn't have that method. But I trust this would work.