apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

Add support for Jupyter notebooks in Ballista #42

Open andygrove opened 3 years ago

andygrove commented 3 years ago

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Jupyter notebooks are a popular tool for interacting with data science and "Big Data" systems and it would be nice if we could add support for this in Ballista.

Describe the solution you'd like

It probably makes sense to support writing Python code in the notebook, so this would depend on having Python bindings available for DataFusion & Ballista (see https://pypi.org/project/datafusion/).

Describe alternatives you've considered

None

Additional context

None

drauschenbach commented 3 years ago

I've investigated Python and JavaScript notebooks for Ballista, and think that Rust notebooks would be the low-hanging fruit and natural starting point:

https://github.com/google/evcxr/blob/main/evcxr_jupyter/README.md

EvCxR supports one-liner crate imports, and that's how the Ballista Client could be integrated, once apache/arrow-datafusion#509 and apache/arrow-datafusion#234 are addressed.

Since EvCxR already exists, maybe this issue should be renamed to focus more specifically on a Python-specific Jupyter kernel.

jorgecarleitao commented 3 years ago

An alternative is to add support for Ballista to the Python bindings. I am not sure about the benefit of Rust itself in the notebook; in notebooks, people usually favor interpreted languages, which do not have the compilation overhead.

Would it be enough to define an interface common to the DataFusion and Ballista contexts, e.g. via a trait, and have the Python bindings hold a Box<dyn Context> so they can use one or the other?
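
To make the trait idea concrete, here is a minimal sketch of what such a shared interface could look like. Everything in it is hypothetical: the `Context` trait, the `describe` method, the `LocalContext` and `RemoteContext` structs, and `make_context` are illustrative stand-ins, not existing DataFusion or Ballista APIs, and the scheduler address is only a placeholder. Real contexts would execute queries rather than return strings.

```rust
// Hypothetical sketch: none of these names are real DataFusion or Ballista APIs.

/// Common interface the Python binding would program against.
trait Context {
    /// Describe where a query would run (a stand-in for real execution).
    fn describe(&self, sql: &str) -> String;
}

/// Stand-in for an in-process DataFusion context.
struct LocalContext;

/// Stand-in for a Ballista client context that talks to a scheduler.
struct RemoteContext {
    scheduler: String,
}

impl Context for LocalContext {
    fn describe(&self, sql: &str) -> String {
        format!("local: {}", sql)
    }
}

impl Context for RemoteContext {
    fn describe(&self, sql: &str) -> String {
        format!("remote({}): {}", self.scheduler, sql)
    }
}

/// Pick a backend at runtime; the caller only ever sees `Box<dyn Context>`.
fn make_context(scheduler: Option<&str>) -> Box<dyn Context> {
    match scheduler {
        Some(addr) => Box::new(RemoteContext {
            scheduler: addr.to_string(),
        }),
        None => Box::new(LocalContext),
    }
}

fn main() {
    // The address below is a placeholder, not a required value.
    let contexts = [make_context(None), make_context(Some("localhost:50050"))];
    for ctx in &contexts {
        println!("{}", ctx.describe("SELECT 1"));
    }
}
```

With this shape, the binding layer would store a single `Box<dyn Context>` and never need to know which backend is behind it; the cost is dynamic dispatch on each call, which is likely negligible next to query execution.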

xudong963 commented 2 years ago

FYI, Databend already supports Jupyter notebooks: https://databend.rs/doc/integrations/gui-tool/jupyter#what-is-jupyter-notebook :)

nl5887 commented 2 years ago

Related to #58