delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0
1.98k stars 365 forks source link

generate more sensible row group size #2545

Open djouallah opened 1 month ago

djouallah commented 1 month ago

delta_rs seems to generate very small row group around 65K rows, it will be nice to have a bigger size by default, I understand there is no standard for that but the current behavior is too conservative.

sherlockbeard commented 1 day ago

you can control this behavior for pyarrow engine using min_rows_per_group, max_rows_per_group