goodwillpunning / hyperleaup

Create and manipulate Tableau Hyper files from Apache Spark DataFrames and Spark SQL

Add creation_mode for LARGEFILE #36

Closed dustinvannoy-db closed 1 year ago

dustinvannoy-db commented 1 year ago

Add a LARGEFILE creation mode option that improves performance when creating a Hyper file from a large dataset on Apache Spark. Instead of writing a single file, the data is saved as multiple Parquet files, and the COPY command is given an ARRAY of those Parquet file paths. Note: This only works on Databricks using DBFS.
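For illustration, here is a minimal sketch of how a COPY statement with an ARRAY of Parquet paths could be assembled. This is not the code from this PR: the function name `build_copy_command`, the `"Extract"."Extract"` target table, and the example paths are all assumptions, and the `WITH (FORMAT PARQUET)` clause should be checked against the Hyper SQL documentation for the Hyper API version in use.

```python
from typing import Sequence


def build_copy_command(
    parquet_paths: Sequence[str],
    schema: str = "Extract",  # assumed target schema name
    table: str = "Extract",   # assumed target table name
) -> str:
    """Build a Hyper SQL COPY statement for one or more Parquet files.

    Hypothetical helper for illustration only; not part of hyperleaup's API.
    """
    if not parquet_paths:
        raise ValueError("at least one Parquet path is required")

    def quote_literal(value: str) -> str:
        # SQL string literal: double any embedded single quotes.
        return "'" + value.replace("'", "''") + "'"

    def quote_ident(name: str) -> str:
        # SQL identifier: double any embedded double quotes.
        return '"' + name.replace('"', '""') + '"'

    target = f"{quote_ident(schema)}.{quote_ident(table)}"

    if len(parquet_paths) == 1:
        # Single-file case: a plain string literal as the source.
        source = quote_literal(parquet_paths[0])
    else:
        # Multi-file case: pass all part files as one ARRAY.
        source = "ARRAY[" + ", ".join(quote_literal(p) for p in parquet_paths) + "]"

    return f"COPY {target} FROM {source} WITH (FORMAT PARQUET)"


if __name__ == "__main__":
    # Hypothetical part files written by Spark under a DBFS mount.
    paths = [
        "/dbfs/tmp/hyperleaup/part-00000.parquet",
        "/dbfs/tmp/hyperleaup/part-00001.parquet",
    ]
    print(build_copy_command(paths))
```

The resulting string would then be passed to the Hyper connection's command execution call. Keeping the statement builder separate from the connection makes it easy to check the generated SQL without a running Hyper process.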

goodwillpunning commented 1 year ago

LGTM! Thank you for the code contrib, @dustinvannoy-db!