DyfanJones / RAthena

Connect R to Athena using Boto3 SDK (DBI Interface)
https://dyfanjones.github.io/RAthena/

Added snappy compression to parquet uploaded files #26

Closed: DyfanJones closed 4 years ago

DyfanJones commented 4 years ago

Parquet snappy compression: https://arrow.apache.org/docs/r/reference/write_parquet.html

To reduce the cost of AWS Athena queries, uploaded files should be compressed. From my current understanding, snappy is the standard compression for parquet.

Whereas gzip compression might give smaller files (better for cost), snappy compression is better for performance. This should give the best of both worlds: files that are compressed but can still be queried efficiently.
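
As a rough illustration of the trade-off, here is a minimal sketch that uses the `compression` argument of `arrow::write_parquet` (documented at the link above) to write the same data with both codecs. The data frame and temp file names are made up for the example, and this is not necessarily how RAthena does it internally. It assumes the installed arrow build includes both the snappy and gzip codecs.

```r
library(arrow)

# Illustrative data only; any data.frame would do.
df <- data.frame(
  id = 1:1000,
  value = rnorm(1000),
  label = sample(letters, 1000, replace = TRUE)
)

# Hypothetical temp files, one per codec.
snappy_file <- tempfile(fileext = ".parquet")
gzip_file <- tempfile(fileext = ".parquet")

# Same data, two compression codecs.
write_parquet(df, snappy_file, compression = "snappy")
write_parquet(df, gzip_file, compression = "gzip")

# Compare the on-disk sizes in bytes.
file.size(snappy_file)
file.size(gzip_file)
```

The relative sizes depend heavily on the data, so it is worth checking with a sample of the real table before settling on a codec.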

DyfanJones commented 4 years ago

Dev version 1.4.0.9000 now supports snappy compression