OS: macOS 10.15.7 and Debian GNU/Linux 12 (bookworm)
Other: DBR 12.2 LTS, Spark 3.3.2, Python 3.11
Bug
What happened:
1. A table with struct column 'gps_extended_signal' was created via Spark.
2. Data was merged successfully using the Python DeltaTable .merge() with a pyarrow RecordBatch.
3. The struct column was modified via Spark, adding 3 extra fields.
4. The Python code was updated to include the struct's new fields in the pyarrow RecordBatch.
5. Subsequent attempts to merge new or existing data fail with:
  File "/usr/local/lib/python3.11/site-packages/deltalake/table.py", line 1800, in execute
    metrics = self.table._table.merge_execute(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_internal.DeltaError: Generic DeltaTable error: External error: Arrow error: Invalid argument error: arguments need to have the same data type
What you expected to happen:
Data merged successfully after modifying table struct
How to reproduce it:
Alter table:
ALTER TABLE dwh.frame_test ADD COLUMNS (
    gps_extended_signal.position_std_up DOUBLE AFTER undulation,
    gps_extended_signal.position_std_north DOUBLE AFTER undulation,
    gps_extended_signal.position_std_east DOUBLE AFTER undulation
)
Merge code (after modifying the gps_extended_signal struct to add the position_std_* fields):
Environment
Delta-rs version: deltalake==0.17.4
Binding: python
More details:
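The Arrow error says that two arguments did not share a data type. That is consistent with (though it does not prove) a difference between the layout of the struct in the table and the layout of the struct in the RecordBatch, for example in field order or field types. Below is a minimal diagnostic sketch for checking this, assuming each struct is written out by hand as an ordered list of (name, type) pairs. The helper name `diff_struct_fields`, the type names, and the field orders shown are illustrative assumptions and are not taken from the failing code or from the deltalake API.

```python
from typing import List, Tuple

# Hypothetical representation: a struct is an ordered list of (field_name, type_name).
Struct = List[Tuple[str, str]]


def diff_struct_fields(table: Struct, source: Struct) -> List[str]:
    """Return human-readable differences between two struct layouts."""
    problems: List[str] = []
    table_types = dict(table)
    source_types = dict(source)

    # Fields present on only one side.
    for name, _ in table:
        if name not in source_types:
            problems.append(f"missing in source: {name}")
    for name, _ in source:
        if name not in table_types:
            problems.append(f"missing in table: {name}")

    # Fields present on both sides but declared with different types.
    for name, type_name in table:
        if name in source_types and source_types[name] != type_name:
            problems.append(
                f"type mismatch for {name}: "
                f"table={type_name}, source={source_types[name]}"
            )

    # Shared fields that appear in a different order.
    table_order = [n for n, _ in table if n in source_types]
    source_order = [n for n, _ in source if n in table_types]
    if table_order != source_order:
        problems.append(
            f"field order differs: table={table_order}, source={source_order}"
        )
    return problems


# Illustrative subset of gps_extended_signal; the real struct has more fields,
# and both orders below are assumptions made only for this example.
table_struct: Struct = [
    ("undulation", "double"),
    ("position_std_east", "double"),
    ("position_std_north", "double"),
    ("position_std_up", "double"),
]
source_struct: Struct = [
    ("undulation", "double"),
    ("position_std_up", "double"),
    ("position_std_north", "double"),
    ("position_std_east", "double"),
]

for line in diff_struct_fields(table_struct, source_struct):
    print(line)
```

If the helper reports nothing for the real table and RecordBatch layouts, the mismatch is probably somewhere other than the struct definition itself.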