delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

Rust writer not encoding correct URL for partitions in delta table #2634

Open gprashmi opened 4 days ago

gprashmi commented 4 days ago

Environment

Delta-rs version: 0.17.4


Bug

We write data to a Delta table using delta-rs with the PyArrow engine, with DayHour as the partition column. However, when we run optimize.compact() on the table, it does not properly encode the partition URLs: as shown in the image below, the newly written files (.zstd.parquet) are placed under partition paths that contain spaces.

image

Can you please let me know how we can run optimize.compact() without it creating partitions with spaces?
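
To make the two path forms described above concrete, here is a minimal sketch using only the Python standard library. It does not reproduce delta-rs internals; the DayHour value and the `partition_dir` helper are hypothetical and only illustrate how a partition directory looks with and without percent-encoding.

```python
from urllib.parse import quote, unquote


def partition_dir(column: str, value: str) -> str:
    """Hypothetical helper: build a Hive-style partition directory name
    with the partition value percent-encoded."""
    return f"{column}={quote(value, safe='')}"


# Assumed example value; the real DayHour format in the table may differ.
raw_value = "2024-06-27 10"

encoded = partition_dir("DayHour", raw_value)  # space becomes %20
unencoded = f"DayHour={raw_value}"             # space kept as-is

print(encoded)    # DayHour=2024-06-27%2010
print(unencoded)  # DayHour=2024-06-27 10

# Decoding the encoded form recovers the original value.
assert unquote(encoded) == unencoded
```

If the original writer produces the encoded form and compaction produces the unencoded one, a reader listing the table directory would see two different directories for the same partition value.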

gprashmi commented 4 days ago

@g12-al

g12-al commented 2 days ago

Confirmed that this also seems to happen in 0.18.0.

This breaks compatibility with our Trino connector, which we use to power dashboard visualizations. Ideally, Trino wouldn't care about spaces (it has a lot of other issues, like not being compatible with timezone-aware timestamps).