Open noahfrn opened 2 years ago
CC @lwhite1 @davisusanibar
An alternative might be to accept Substrait plans instead, binding to Acero as a whole instead of just Dataset. This would give you the full power of the query engine and avoid having to create bindings to individual C++ components. But of course this is more complex and Substrait is a bit of a moving target.
Another alternative is to track the discussion about a text format for expressions. See https://lists.apache.org/thread/7vch27t3gfz1hmv7d8w69n50gfc1nswf and apache/arrow#14287.
Thanks for your help @lidavidm - that expression parsing PR looks like exactly what I wanted!
There is a discussion here about passing Substrait expressions to the Dataset Project and Filter methods: https://github.com/apache/arrow/issues/33985#issuecomment-1435319522
If this gets implemented in C++, can it be exposed to Java through the JNI bindings?
I think this is just about having a convenient user-facing API to the existing Dataset functionality, a Substrait API to Acero in Java would be a separate project
Ok, thanks. To clarify: my question is not about Substrait plans, it's about Substrait expressions which we now have a way to represent independent of plans. I opened a separate issue to request this feature: https://github.com/apache/arrow/issues/34252
I think without a convenient API to build Substrait expressions in Java, it'd still not quite meet the goals here right?
Agreed @lidavidm - Something like https://github.com/substrait-io/substrait-java/issues/128 would provide the necessary functionality here.
https://github.com/apache/arrow/issues/34252 is now complete, and the Arrow Java Datasets API now supports pushdown projection and filtering using Substrait expressions.
More details here: https://github.com/apache/arrow/blob/main/docs/source/java/dataset.rst#projection-produce-new-columns-and-filters
However: this capability is not really ready for practical applications yet, because we do not yet have any user-friendly tools to create Substrait expressions in Java. I hope we can achieve that in https://github.com/substrait-io/substrait-java/issues/128.
Push-down filtering (Java) questions
Hey all,
I'm exploring adding push-down filtering to the Java Datasets API, as it currently only supports push-down projection. Just had a couple of questions before I start this work:
Thanks!
Component(s)
C++, Java