delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

Arrow: Parquet does not support writing empty structs when creating checkpoint #2622

Open echai58 opened 4 days ago

echai58 commented 4 days ago

Environment

Delta-rs version: 0.18.1

Binding: python


Bug

What happened: Creating a checkpoint on a table whose only non-partition column is of binary type fails with the error:

OSError: Arrow: Parquet does not support writing empty structs

I found PR https://github.com/delta-io/delta-rs/pull/2125, which seems to have been intended to fix exactly this, but the error still occurs.

What you expected to happen: To be able to create the checkpoint.

How to reproduce it:

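from deltalake import DeltaTable, write_deltalake
import pandas as pd

# Write a table partitioned by "p"; "k" (binary) is the only non-partition column.
write_deltalake(
    "test",
    pd.DataFrame.from_dict(
        {
            "p": [1],
            "k": [b'a'],
        }
    ),
    partition_by=["p"],
)

# Fails with: OSError: Arrow: Parquet does not support writing empty structs
DeltaTable("test").create_checkpoint()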
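
One possible reading of the error, offered as an assumption and not a confirmed diagnosis: if the checkpoint writer builds a per-column statistics struct and leaves out both partition columns and binary columns, then the table above would end up with a struct that has no fields at all. The sketch below models only that idea in plain Python. `stats_fields` and `STATS_UNSUPPORTED` are hypothetical names for illustration, not delta-rs APIs.

```python
# Hypothetical model, not delta-rs code: which columns would appear in a
# min/max statistics struct if partition and binary columns were skipped.
STATS_UNSUPPORTED = {"binary"}  # assumption made for illustration only


def stats_fields(schema, partition_by):
    """Return the column names that would remain in the stats struct."""
    return [
        name
        for name, dtype in schema.items()
        if name not in partition_by and dtype not in STATS_UNSUPPORTED
    ]


# Schema of the reproduction above: "p" is the partition column, "k" is binary.
repro_schema = {"p": "long", "k": "binary"}
print(stats_fields(repro_schema, ["p"]))  # [] (the struct would be empty)

# With one extra non-binary, non-partition column the struct is non-empty.
print(stats_fields({**repro_schema, "v": "long"}, ["p"]))  # ['v']
```

If this reading is right, adding any non-binary, non-partition column to the reproduction should make the checkpoint succeed; I have not verified that.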