ironSource / parquetjs

fully asynchronous, pure JavaScript implementation of the Parquet file format
MIT License

Streaming read or from Buffer #111

Closed: muratcorlu closed this issue 4 years ago

muratcorlu commented 4 years ago

I need to read many Parquet files from an S3 bucket in an AWS Lambda function. I see that it's possible to write files to streams, but not to read from them. Is there a workaround for reading files from S3 buckets without writing them to local disk?

I saw #28 about reading as base64, and the comments there point to ParquetTransformer as a solution, but as far as I can see it's only for writing, not for reading.

muratcorlu commented 4 years ago

I started using @ZJONSSON's fork: https://github.com/ZJONSSON/parquetjs It has a reader for S3 and can also read from a Buffer.
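
For anyone landing here with the same problem, the in-memory half can be sketched with Node's standard library alone. The helper names `streamToBuffer` and `looksLikeParquet` are hypothetical, not part of parquetjs or the fork. The sketch assumes the S3 object body is available as a Node readable stream; it collects that stream into a single Buffer and checks for the `PAR1` magic bytes that a Parquet file starts and ends with. The resulting Buffer could then be passed to a Buffer-based reader such as the one the fork provides.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: collect any readable stream (for example, an S3
// object body) into one in-memory Buffer, without touching local disk.
async function streamToBuffer(readable) {
  const chunks = [];
  for await (const chunk of readable) {
    // Streams may yield strings or Buffers; normalize to Buffer.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hypothetical sanity check: a Parquet file begins and ends with the
// 4-byte magic "PAR1", and a 4-byte footer length precedes the trailing
// magic, so anything shorter than 12 bytes cannot be valid.
function looksLikeParquet(buffer) {
  const magic = 'PAR1';
  return (
    buffer.length >= 12 &&
    buffer.toString('latin1', 0, 4) === magic &&
    buffer.toString('latin1', buffer.length - 4) === magic
  );
}
```

This only makes sense when each file fits comfortably in the Lambda's memory; for larger files, a reader that fetches byte ranges from S3 directly would be the better fit.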