What happened:
_internal.DeltaError: Generic DeltaTable error: Missing partition column: failed to parse
when using a pyarrow DictionaryArray as the partition column for write_deltalake.
What you expected to happen:
Successful write.
How to reproduce it:
import pyarrow as pa
from deltalake import write_deltalake
# pyarrow.lib.DictionaryArray
array = pa.array(["a", "b", "c"], type=pa.dictionary(pa.int8(), pa.string()))
data = {
    "foo": [1, 2, 3],
    "bar": [1, 2, 3],
    "baz": array,
    # "baz": ["a", "b", "c"],  # using this instead works
}
table = pa.table(data)
# write to partitioned delta table
write_deltalake("my_delta_table", table, partition_by="baz")
# _internal.DeltaError: Generic DeltaTable error: Missing partition column: failed to parse
More details:
Traceback (most recent call last):
  File "/home/j/repos/dlt/mre.py", line 16, in <module>
    write_deltalake("my_delta_table", table, partition_by="baz")
  File "/home/j/.cache/pypoetry/virtualenvs/dlt-2tG_aB2A-py3.9/lib/python3.9/site-packages/deltalake/writer.py", line 323, in write_deltalake
    write_deltalake_rust(
_internal.DeltaError: Generic DeltaTable error: Missing partition column: failed to parse
Environment
Delta-rs version: 0.21.0
Binding: python
Environment: local, WSL2, Ubuntu 24.04.1 LTS