Closed MrPowers closed 6 months ago
I exposed the predicate parameter for the Rust engine writer, but it currently does nothing because the functionality is not built in Rust yet.
take
I'll give this a try
WriteBuilder uses predicate: Option<String> but has no implementation for it yet, whereas DeleteBuilder uses predicate: Option<Expression>. I suggest harmonising by changing WriteBuilder to use predicate: Option<Expression>. Though this is a breaking change, predicate handling is not implemented in WriteBuilder, so changing the type should not cause issues.
It would be great to do this using logical expressions rather than the physical ones - much like @Blajda recently updated for merge. The good thing there is we get some type coercion for free, which has been a hassle with expressions.
In Python we will likely have to accept strings and do the parsing.
@roeap I think we can start allowing arrow expressions as input, which we can serialize as substrait and then deserialize with Datafusion-substrait
This would be a great goal, but I would say let's be consistent in that and make a deliberate API choice.
I.e., not have Substrait supported in one method but not the other...
Good news is substrait plans are of course logical plans :)
I'll try that @roeap. As for
"It would be great to do this using logical expressions rather than the physical ones - much like @Blajda recently updated for merge."
is this David's PR the one you are referring to? https://github.com/delta-io/delta-rs/pull/1969
@roeap we should be able to add this to merge, update, delete and write and then just add the conversion inside the pyo3 binding, so it's a Python only feature.
@r3stl355 it's #1720, which had been up for a while before it got merged.
@ion-elgreco - sure, to get started, and as you said, right now this could just be internal. Substrait is a nice feature for Rust as well, of course as an alternative path, since we are looking to integrate into DataFusion's internal planning.
Description
PySpark has a cool replaceWhere function that lets you overwrite existing data in a Delta table that matches a predicate with new data. Here's an example of the replaceWhere functionality:
What do folks think about adding replaceWhere functionality to Python deltalake?
It's possible that the Rust predicate argument in write_deltalake already exposes this functionality.
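For readers unfamiliar with the idea, the replaceWhere semantics described above can be sketched in plain Python, without Spark or deltalake. This is only an illustration under stated assumptions: the helper name `replace_where`, the list-of-dicts "table", and the callable predicate are all hypothetical and are not part of either library's API. The sketch also rejects new rows that fall outside the predicate, which is a design choice made here to keep the replaced slice consistent.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def replace_where(
    existing: List[Row],
    new_rows: List[Row],
    predicate: Callable[[Row], bool],
) -> List[Row]:
    """Hypothetical helper: drop rows matching `predicate`, then append `new_rows`."""
    # Assumption of this sketch: new data must itself satisfy the predicate,
    # otherwise the write would touch rows outside the slice being replaced.
    outside = [row for row in new_rows if not predicate(row)]
    if outside:
        raise ValueError(f"new rows do not match the predicate: {outside}")

    # Keep every existing row that the predicate does NOT select.
    kept = [row for row in existing if not predicate(row)]

    # The selected slice is replaced wholesale by the new rows.
    return kept + list(new_rows)


table = [
    {"country": "US", "id": 1},
    {"country": "DE", "id": 2},
    {"country": "US", "id": 3},
]

# Replace all rows where country == "US" with a single new row.
updated = replace_where(
    table,
    [{"country": "US", "id": 10}],
    lambda row: row["country"] == "US",
)
print(updated)
# [{'country': 'DE', 'id': 2}, {'country': 'US', 'id': 10}]
```

In a real table the predicate would be a string or expression evaluated by the engine rather than a Python callable, which is what the discussion about string parsing and logical expressions is concerned with.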