Closed. andygrove closed this issue 2 years ago.
@Jimexist Does it make sense to add Ballista support here, or should we have a separate ballista-python repo that somehow re-uses parts of datafusion-python?
In my mind it makes sense to make a separate ballista-python, as I view datafusion-python as its own standalone system, just as Ballista is. However, I acknowledge there may be significant overlap. That said, there has been a lot of work on Ballista lately, which will likely continue, so it could be a good time to decouple them for the purpose of Python bindings.
Thanks for the input ... I will close this issue, start a new repo, and copy much of the code over from this repo for now.
@andygrove this is my take on the ballista-python crate, any suggestions? https://github.com/nl5887/ballista-python
Hi @nl5887, thanks for working on this! I think you could PR this directly into the arrow-ballista repo, into a top-level Python folder. I would love to try this out, but I am not a Python expert so I may need some help. I followed the instructions in "How to develop" and it looks like everything installed, but I am not sure how to import the project in the Python REPL.
python
Python 3.9.5 (default, Jun 4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ballista
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'ballista'
>>> import datafusion
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andy/git/personal/ballista-python/datafusion/__init__.py", line 29, in <module>
    from ._internal import (
ModuleNotFoundError: No module named 'datafusion._internal'
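For anyone hitting the same errors, a quick check along these lines may help narrow things down. It is only a diagnostic sketch: the helper name `is_importable` is made up for illustration, and the module names are simply the ones from the session above. It uses the standard library's `importlib.util.find_spec` to report which modules the active interpreter can locate. If `datafusion._internal` shows up as missing, one likely explanation (an assumption, not something confirmed in this thread) is that the compiled extension was never built or installed into the current environment.

```python
import importlib.util


def is_importable(name: str) -> bool:
    """Return True if the interpreter can locate module `name`.

    `is_importable` is a hypothetical helper for this sketch only.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages for dotted names, so a missing
        # or broken parent (e.g. a package whose __init__ fails to import
        # its native module) is reported here as "not importable".
        return False


# Module names taken from the REPL session above.
for name in ("ballista", "datafusion", "datafusion._internal"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

Running it from outside the repository checkout as well as from inside it can also show whether the `datafusion` being picked up is the local source directory rather than an installed package.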
It could be that I still had some leftovers from the old Python datafusion project. I'll rename everything to ballista, make sure both the datafusion and ballista modules can co-exist, and do some additional cleanup. Once it's ready I will open a PR against the arrow-ballista repo. Thanks!
@andygrove I just pushed the latest code to my repo (https://github.com/nl5887/ballista-python). This should work for you.
@nl5887 I just tried it and it works beautifully :heart:
I had also missed a step in my previous attempt ... I can't wait to see this in Ballista! Thanks so much for working on this.
I would like to be able to execute queries against Ballista from Python.
I think this is just a case of adding a new PyBallistaContext class. This should be an optional feature, disabled by default.