voltrondata-labs / arrowbench

R package for benchmarking

Test impact of row group size on large dataset #86

Open alistaire47 opened 2 years ago

alistaire47 commented 2 years ago

#74 shows that row group size in Parquet files has an impact, but the effect is not large for files at the scale of the available sources. This task is to resave the full taxi dataset with several different row group sizes and measure the impact when running a set of filters that pull from some row groups but not others.
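
To make the expected effect concrete, here is a rough sketch of why selective filters should be sensitive to row group size. It is a pure-Python model, not arrowbench or Arrow code. It assumes the filter column is sorted and that the reader skips any row group whose min/max statistics fall outside the filter range. The helper names (`row_group_stats`, `rows_scanned`) and the integer data standing in for the taxi dataset are hypothetical.

```python
def row_group_stats(values, row_group_size):
    """Split values into consecutive row groups; return (min, max, n_rows) per group."""
    stats = []
    for start in range(0, len(values), row_group_size):
        chunk = values[start:start + row_group_size]
        stats.append((min(chunk), max(chunk), len(chunk)))
    return stats


def rows_scanned(stats, lo, hi):
    """Total rows in row groups whose [min, max] overlaps the filter lo <= x <= hi."""
    return sum(n for mn, mx, n in stats if mx >= lo and mn <= hi)


# Hypothetical stand-in for a sorted column (assumption: not the real taxi data).
values = list(range(1_000_000))
lo, hi = 250_000, 250_999  # a filter that matches exactly 1,000 rows

for size in (1_000, 10_000, 100_000, 1_000_000):
    stats = row_group_stats(values, size)
    scanned = rows_scanned(stats, lo, hi)
    print(f"row_group_size={size:>9,}  groups={len(stats):>5,}  rows_scanned={scanned:>9,}")
```

In this toy model the filter always matches 1,000 rows, but the number of rows that have to be read grows with the row group size, up to the whole file when there is only one group. Real results will also depend on per-group overhead, compression, and how the data is actually ordered, which is what the benchmark would need to measure.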