apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

[EPIC] Add support for Substrait #32

Open andygrove opened 2 years ago

andygrove commented 2 years ago

[EDIT: Updated this on 2/25/23]

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The Substrait standard is gaining adoption, and I would like to add support for it to Ballista. There are three different areas where we could potentially support Substrait:

Original description:

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Ballista (and DataFusion) uses a proprietary protobuf-based format for serializing query plans. This ties Ballista tightly to DataFusion and makes it difficult to use other query engines or compute kernels.

Describe the solution you'd like

There is now an emerging standard for query plan serialization at https://substrait.io/, which is also protobuf-based. It would be good to move towards this over time.

Describe alternatives you've considered

None

Additional context

None

andygrove commented 1 year ago

Substrait support is now in DataFusion, so I plan to work on this soon.