Closed liamphmurphy closed 3 months ago
Does this only manifest with the schema evolution? Or are you able to see errors with append or merge writes as well?
It happens with any operation when there is concurrency and the state gets updated at the end.
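To make that comment concrete, here is a toy model of how such a mismatch could arise. It is not delta-rs code: `ToyLog`, `ToyWriter`, and `VersionMismatch` are hypothetical names, and the assumed mechanism (a commit that lands on a later version than the writer's snapshot expected, followed by a strict "local version + 1" check when the local state is updated) is only an interpretation of the comment above.

```python
class VersionMismatch(Exception):
    """Raised when the committed version is not the writer's local version + 1."""


class ToyLog:
    """Shared commit log; a commit lands on the next free version."""

    def __init__(self):
        self.latest = 0  # version 0 stands for the freshly created table

    def commit(self, attempted_version):
        # If another writer already took the attempted version,
        # the commit is retried and lands on the next free one.
        landed = max(attempted_version, self.latest + 1)
        self.latest = landed
        return landed


class ToyWriter:
    def __init__(self, log):
        self.log = log
        self.local_version = log.latest  # snapshot taken when the writer starts

    def write(self):
        landed = self.log.commit(self.local_version + 1)
        # The local state is only updated at the end, and this check
        # assumes nobody else committed in between (assumption of the model).
        if landed != self.local_version + 1:
            raise VersionMismatch(
                f"local version {self.local_version}, committed version {landed}"
            )
        self.local_version = landed
        return landed


if __name__ == "__main__":
    log = ToyLog()
    a, b = ToyWriter(log), ToyWriter(log)
    a.write()
    try:
        b.write()
    except VersionMismatch as exc:
        print("second writer failed:", exc)
```

In this model the second writer's data is committed to the log even though the call raises, which would match seeing some successful writes and some errors on a fresh table.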
Environment
Delta-rs version: python v0.16
Binding: ^^
Environment:
Bug
What happened:
To test the rust engine, we cleared out any existing Delta tables in our nonprod environment and switched from pyarrow to the rust engine with schema merging, with this write_deltalake call:
Despite it being a brand new Delta table, and after some successful writes, the lambdas eventually started erroring with "Generic DeltaTable error: Version mismatch". I believe the error is coming from here: https://github.com/delta-io/delta-rs/blob/3e6a4d61923602d189f559636b3e3e3f61b6a924/crates/core/src/table/state.rs#L192
What you expected to happen:
Especially since we are testing with a fresh table, I'd expect all writes to work (and not just some) even with the new schema merge flag set.
How to reproduce it:
I was not able to reproduce this locally with a randomly generated dataset, so my guess is it's something more to do with the DynamoDB locking on S3. If you have thoughts on how I could test this better, please let me know.
Note that we have roughly 10 concurrent lambdas that could potentially write to the table. However, before this change we had 50 writing with pyarrow and all was well.
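One possible way to probe the concurrency hypothesis outside of Lambda is a small harness that runs several writers in parallel and records every failure. The sketch below is only a starting point: `run_concurrent_writers` and `delta_write` are hypothetical helpers, the table path is a placeholder, and the `write_deltalake` arguments shown are an assumption about the setup rather than the call from this report. A local path may also behave differently from S3 with DynamoDB locking, so pointing it at a nonprod S3 table would be closer to the real environment.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrent_writers(write_fn, n_writers=10, writes_per_writer=20):
    """Call write_fn(writer_id, seq) from n_writers threads.

    Returns (number of successful writes, list of recorded failures).
    """

    def worker(writer_id):
        ok, errors = 0, []
        for seq in range(writes_per_writer):
            try:
                write_fn(writer_id, seq)
                ok += 1
            except Exception as exc:  # keep going so every failure is recorded
                errors.append((writer_id, seq, repr(exc)))
        return ok, errors

    with ThreadPoolExecutor(max_workers=n_writers) as pool:
        results = list(pool.map(worker, range(n_writers)))

    successes = sum(ok for ok, _ in results)
    errors = [err for _, errs in results for err in errs]
    return successes, errors


def delta_write(writer_id, seq, table_uri="/tmp/version-mismatch-repro"):
    """Hypothetical writer: append one row using the rust engine with schema merging."""
    import pyarrow as pa
    from deltalake import write_deltalake

    batch = pa.table({"writer": [writer_id], "seq": [seq]})
    # Assumed arguments; adjust to match the real write_deltalake call.
    write_deltalake(table_uri, batch, mode="append", engine="rust", schema_mode="merge")


if __name__ == "__main__":
    successes, errors = run_concurrent_writers(delta_write)
    print(successes, "writes succeeded;", len(errors), "failed")
    for err in errors[:5]:
        print(err)
```

Because the harness takes the write function as a parameter, the same loop can be reused to compare the pyarrow and rust engines under identical concurrency.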