ZJONSSON / parquetjs

fully asynchronous, pure JavaScript implementation of the Parquet file format
MIT License

Is it possible to read first n rows? #46

Open kevincfz opened 4 years ago

kevincfz commented 4 years ago

I store my parquet files in S3 and would like to read the first n rows without having to read the files in their entirety.

I understand that parquet is column-oriented, but is it possible to read row-wise?

muratcorlu commented 4 years ago

You are already reading with a cursor, so the entire file is not read unless you iterate over all of the records. Stopping the loop early should be enough for your needs:

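let count = 0;
let record = null; // declared explicitly so the loop does not create an implicit global

// cursor is assumed to come from reader.getCursor()
while ((record = await cursor.next())) {
  count++;
  if (count === 100) {
    break; // stop after the first 100 records
  }
}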
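
The same idea can be wrapped in a small reusable helper. This is only a sketch: `takeFirst` and `makeFakeCursor` are hypothetical names that are not part of parquetjs, and the helper assumes only that the cursor exposes an async `next()` that resolves to `null` (or `undefined`) once the records run out. The fake cursor stands in for a real one so the snippet runs without a parquet file.

```javascript
// Hypothetical helper (not part of parquetjs): collect at most n records
// from any cursor-like object.
// Assumption: cursor.next() returns a Promise that resolves to a record,
// or to null/undefined when there are no more records.
async function takeFirst(cursor, n) {
  const records = [];
  while (records.length < n) {
    const record = await cursor.next();
    if (record === null || record === undefined) {
      break; // cursor exhausted before n records were read
    }
    records.push(record);
  }
  return records;
}

// Stand-in cursor for illustration only; it also counts how many times
// next() was called so early termination can be observed.
function makeFakeCursor(rows) {
  let index = 0;
  const cursor = {
    calls: 0,
    next: async () => {
      cursor.calls++;
      return index < rows.length ? rows[index++] : null;
    },
  };
  return cursor;
}
```

With a real reader, the call would presumably look like `const rows = await takeFirst(reader.getCursor(), 100);` followed by `await reader.close();`. Unlike the loop above, the helper checks the count before calling `next()`, so it never requests more records than it returns.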