Open colin-ho opened 11 hours ago
Comparing colin/gen-parquet (85fd788) with main (ec39dc0)
✅ 17 untouched benchmarks
All modified and coverable lines are covered by tests ✅
Project coverage is 77.35%. Comparing base (3394a66) to head (85fd788). Report is 4 commits behind head on main.
1 reviewer was added to this PR based on Andrew Gazelka's automation.
When you specify a `num_parts` parameter when generating TPC-H files, the generator first produces `num_parts` CSVs, then reads those CSVs and writes them to Parquet using Daft.

However, `write_parquet` will not respect the number of input files, e.g. even if there are 16 input files there might only be 1 output file.

The fix here is to read and write 1 file at a time.
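
For reviewers, a minimal sketch of the one-file-at-a-time approach. This is not the code in this PR: the function names, the `*.csv` glob, and the directory layout are hypothetical, and `daft.read_csv` is called with its default options, which may not match how the generated files are actually parsed. The converter is passed in as a parameter so the loop can be exercised without Daft installed.

```python
from pathlib import Path
from typing import Callable, List


def daft_csv_to_parquet(csv_path: Path, parquet_dir: Path) -> None:
    """Read a single CSV with Daft and write it out as Parquet."""
    import daft  # lazy import: only needed when this default converter runs

    # One input file per DataFrame, so each write covers exactly one input.
    daft.read_csv(str(csv_path)).write_parquet(str(parquet_dir))


def convert_one_at_a_time(
    csv_dir,
    parquet_dir,
    convert: Callable[[Path, Path], None] = daft_csv_to_parquet,
) -> List[Path]:
    """Convert every CSV in csv_dir separately instead of in one bulk read."""
    csv_paths = sorted(Path(csv_dir).glob("*.csv"))  # assumed naming pattern
    out_dir = Path(parquet_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for csv_path in csv_paths:
        convert(csv_path, out_dir)
    return csv_paths
```

Because each CSV is written by its own `write_parquet` call, the output should contain at least one Parquet file per input file, rather than a number chosen by the writer for the combined data.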