CyrusOfEden / CSV.js

A simple, blazing-fast CSV parser and encoder. Full RFC 4180 compliance.
MIT License

Do an actual stream #5

Open calvinmetcalf opened 10 years ago

calvinmetcalf commented 10 years ago

So you can do fs.createReadStream().pipe(new Csv).pipe(outStream)
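
A minimal sketch of what that could look like with Node's Transform class (the streams2 API). Here, parseRow is a hypothetical stand-in for the library's actual row parser, and the naive newline split ignores the quoted newlines that RFC 4180 allows:

var Transform = require("stream").Transform;
var util = require("util");

function Csv(options) {
  Transform.call(this, options);
  this._remainder = ""; // partial line carried between chunks
}
util.inherits(Csv, Transform);

Csv.prototype._transform = function (chunk, encoding, done) {
  var lines = (this._remainder + chunk.toString()).split("\n");
  this._remainder = lines.pop(); // hold back the trailing partial line
  for (var i = 0; i < lines.length; i++) {
    // parseRow is hypothetical; emit each parsed row as a JSON line
    this.push(JSON.stringify(parseRow(lines[i])) + "\n");
  }
  done();
};

Csv.prototype._flush = function (done) {
  if (this._remainder) this.push(JSON.stringify(parseRow(this._remainder)) + "\n");
  done();
};

With something like that in place, fs.createReadStream("in.csv").pipe(new Csv()).pipe(outStream) would work as described above.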

tburette commented 10 years ago

I thought the same. It would also be good for it to accept both string and Buffer input.

calvinmetcalf commented 10 years ago

Streams by default will coerce all strings to buffers.
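
For the string/Buffer question, Node's built-in string_decoder module is the usual answer. A small sketch that accepts either, without splitting multi-byte UTF-8 characters that land on a chunk boundary:

var StringDecoder = require("string_decoder").StringDecoder;
var decoder = new StringDecoder("utf8");

function toText(chunk) {
  // Strings pass through; Buffers are decoded, with any incomplete
  // multi-byte sequence held back until the next chunk arrives.
  return typeof chunk === "string" ? chunk : decoder.write(chunk);
}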

CyrusOfEden commented 10 years ago

To be honest, I never expected this; I've never worked with Node.js and came at this from the client-side perspective.

Looking into streams now.

tburette commented 10 years ago

I recommend this to learn Node streams: http://nodeschool.io/#stream-adventure

calvinmetcalf commented 10 years ago

I would also recommend making sure to use the streams2 API, though, as it handles buffering much better than the streams1 API.
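
For what it's worth, a common way to get streams2 on older Node versions is to fall back to the readable-stream module (assuming it's declared as a dependency):

// Use the built-in streams2 Transform where it exists (Node 0.10+),
// otherwise fall back to the readable-stream backport.
var Transform = require("stream").Transform || require("readable-stream").Transform;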

sindresorhus commented 10 years ago

https://www.npmjs.org/search?q=csv+stream

evanplaice commented 10 years ago

FYI, using streams will break compatibility with browsers, as they don't implement stream reading.

calvinmetcalf commented 10 years ago

It depends on how you implement it. If you use readable-stream and browserify, it won't break compatibility. You could preserve the same API and just have it be a transform stream when an input isn't provided.
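
A sketch of that idea, assuming a hypothetical CSV.parse synchronous entry point; browserify resolves readable-stream the same way in the browser:

var Transform = require("readable-stream").Transform;

function CSV(input, options) {
  // With an input, behave exactly as before (hypothetical sync path).
  if (input !== undefined) return CSV.parse(input, options);
  // Without one, hand back a transform stream to pipe through.
  var csvStream = new Transform();
  csvStream._transform = function (chunk, encoding, done) {
    // ...tokenize the chunk and push parsed rows, as sketched earlier...
    done();
  };
  return csvStream;
}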

CyrusOfEden commented 10 years ago

Would it be possible to enable/disable features depending on the environment?

Similar to how we set up the exports at the end of the file (see below), I'm thinking we could enable/disable features programmatically.

if (typeof define === "function" && define.amd) {
  define(CSV); // AMD
} else if (typeof module === "object" && module.exports) {
  module.exports = CSV; // CommonJS / Node
} else {
  this.CSV = CSV; // browser global
}
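
One possible shape for that (a sketch, not the library's actual code; CSV.stream is a hypothetical name): detect stream support at load time and only attach the streaming API where it exists:

var Transform = null;
if (typeof require === "function") {
  try {
    Transform = require("stream").Transform; // Node, or a browserify shim
  } catch (e) {}
}

if (Transform) {
  // Only expose the streaming API in environments that support it.
  CSV.stream = function (options) {
    // ...return a Transform-based parser as sketched above...
  };
}
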
evanplaice commented 10 years ago

Yes, but the two implementations will be drastically different. Where streams are surprisingly easy in Node.js, they're surprisingly difficult in the browser.

Node.js is capable of piping large inputs through the parser as a stream. The HTML5 File API is only capable of loading binary blobs in chunks*.

* Reading local files in JavaScript

That means at the end of each chunk, the parser will need to pause mid-stream, load another chunk, and then continue parsing from an arbitrary point in the input data.
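
For illustration, chunked reading with the File API looks roughly like this; resumableParse is hypothetical, and it is exactly the part that needs a parser able to suspend and resume at arbitrary offsets:

function parseFileInChunks(file, chunkSize, onRow) {
  var offset = 0;
  var reader = new FileReader();

  reader.onload = function () {
    // The parser must be able to stop mid-token here and pick up
    // again when the next chunk is fed in; that is the hard part.
    resumableParse(reader.result, onRow); // hypothetical
    offset += chunkSize;
    if (offset < file.size) readNext();
  };

  function readNext() {
    reader.readAsText(file.slice(offset, offset + chunkSize));
  }
  readNext();
}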

I explored this option in great depth when I was writing jquery-csv, but the regex tokenizer I used can't be paused mid-stream. I'd basically have to do a full rewrite of the parser core to make it work.