Open ximonsson opened 1 month ago
This looks like a bug in our change data feed handling. Prior to 0.19.0 the Rust crate did not support CDF on the merge operation, which will be triggered by the enableChangeDataFeed table property.
😞
It's dying here, but this function is straightforward apart from the logic 😿
If the partition column is the last column this issue doesn't seem to pop up. With some debug printing it looks like the schemas of preimage and postimage have the partition column in a different position here: https://github.com/delta-io/delta-rs/blob/2498837ff6a2c3525058f1a9fd1301ba50fecbba/crates/core/src/operations/cdc.rs#L49
For example, adding
println!("{}", preimage.schema());
println!("{}", postimage.schema());
before this line and running the above Python script prints
fields:[t.name, t.bday, t.weapons, t.magic, _change_type], metadata:{}
fields:[magic, name, bday, weapons, _change_type], metadata:{}
before the error.
Not sure why the union would even go through with ints and strings getting mixed, but not if there are timestamps or arrays in the mix 🤔 Ofc I could just be completely off base and this column moving isn't even related to the issue.
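To make the suspicion above concrete, here is a minimal sketch in plain Python (not delta-rs or DataFusion code) of what a position-based union does when two inputs carry the same columns in a different order. The field orders are taken from the debug output above with the `t.` qualifier dropped; the row values and the helper names `union_by_position` and `reorder` are made up for illustration. It only shows how values can end up under the wrong column name, not why the real merge fails.

```python
# Illustration only: plain Python rows, not the delta-rs implementation.
# Field orders mirror the debug output; all row values are hypothetical.
PRE_FIELDS = ["name", "bday", "weapons", "magic", "_change_type"]
POST_FIELDS = ["magic", "name", "bday", "weapons", "_change_type"]

pre_row = ("gandalf", "2019-01-01", ["staff"], 10, "update_preimage")
post_row = (12, "gandalf", "2019-01-01", ["staff"], "update_postimage")


def union_by_position(left_fields, left_rows, right_rows):
    """Stack rows and label every column with the left-hand field names."""
    return [dict(zip(left_fields, row)) for row in left_rows + right_rows]


def reorder(fields, row, target_fields):
    """Rearrange one row so that its values follow target_fields."""
    by_name = dict(zip(fields, row))
    return tuple(by_name[f] for f in target_fields)


# Naive union: the postimage row is read with the preimage column order.
mixed = union_by_position(PRE_FIELDS, [pre_row], [post_row])

# Aligned union: the postimage row is reordered by name first.
aligned = union_by_position(
    PRE_FIELDS, [pre_row], [reorder(POST_FIELDS, post_row, PRE_FIELDS)]
)

print(mixed[1]["name"])    # the integer from "magic" lands under "name"
print(aligned[1]["name"])  # the string value stays under "name"
```

In the naive case an integer sits under a string column, which may be coerced silently, while a string under an array column has no sensible cast. That would be consistent with the observation that the union only fails once timestamps or arrays are involved, but this is a guess, not a confirmed diagnosis.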
Environment
Delta-rs version: 0.19.1
Binding: python
Environment:
Bug
What happened:
This error seems to have been introduced in 0.19. An error is thrown when merging into a partitioned table that has the delta.enableChangeDataFeed configuration set to true, if any of the columns has a non-primitive type, e.g. timestamp or array.
What you expected to happen:
Would be nice if there was no error I guess.
How to reproduce it:
This produces the following error:
Removing either partition_by in the first write, the delta.enableChangeDataFeed configuration, or the columns bday and weapons fixes it.
More details:
I tried using version 0.18 and there it works.
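Since the thread notes that the error doesn't seem to appear when the partition column is the last column, one possible stopgap until a fix lands is to reorder the columns before the first write. The helper below is a hypothetical sketch, not part of deltalake. It assumes `magic` is the partition column, which the debug output suggests but the report does not state.

```python
def partition_columns_last(columns, partition_by):
    """Return the column names with the partition columns moved to the end.

    The relative order of the remaining columns is preserved.
    """
    missing = [c for c in partition_by if c not in columns]
    if missing:
        raise ValueError(f"unknown partition columns: {missing}")
    data_columns = [c for c in columns if c not in partition_by]
    return data_columns + list(partition_by)


# Assumed column order and partition column, for illustration only.
order = partition_columns_last(["magic", "name", "bday", "weapons"], ["magic"])
print(order)  # ['name', 'bday', 'weapons', 'magic']
```

With a pyarrow table, the resulting list could be passed to `table.select(order)` before writing. Whether this actually avoids the error on 0.19.1 is untested here.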