datafusion-contrib / datafusion-python

Python binding for DataFusion
https://arrow.apache.org/datafusion/python/index.html
Apache License 2.0

Implement PyArrow Dataset TableProvider #59

Open kylebrooks-8451 opened 2 years ago

kylebrooks-8451 commented 2 years ago

This implements a PyArrow Dataset TableProvider, which lets PyArrow Datasets be registered and queried as tables in DataFusion.

Fixes #10

wjones127 commented 2 years ago

Also, FYI: we are moving this repo, so the maintainers might ask you to re-open this PR at the new location soon. https://github.com/apache/arrow-datafusion-python/pull/5

wjones127 commented 2 years ago

cc @andygrove

andygrove commented 2 years ago

Thanks @kylebrooks-8451, this is looking very cool. As @wjones127 mentioned, we are just in the process of moving development to https://github.com/apache/arrow-datafusion-python, so would you mind opening the PR there?

kylebrooks-8451 commented 2 years ago

Thanks @andygrove! This PR has been moved to apache/arrow-datafusion-python#9, so we can close this one if you wish. I have not been able to verify that it works with DataFusion 10.0.0 because the build fails with this error:

error: failed to select a version for the requirement `parquet = "^18.0.0"`
candidate versions found which didn't match: 15.0.0, 14.0.0, 13.0.0, ...
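
For context on the error above: Cargo treats a requirement like `^18.0.0` as "at least 18.0.0 and below the next major version", so none of the listed candidates can satisfy it. The sketch below illustrates that rule in plain Python. It is not Cargo's resolver; `caret_matches` and `parse_version` are hypothetical helpers that only handle full `X.Y.Z` versions.

```python
def parse_version(text):
    """Parse 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def caret_matches(requirement, candidate):
    """Return True if candidate satisfies a caret requirement such as '^18.0.0'.

    A caret requirement allows any version from the stated one up to, but not
    including, the next bump of its leftmost non-zero component.
    """
    low = parse_version(requirement.lstrip("^"))
    version = parse_version(candidate)
    if low[0] > 0:
        high = (low[0] + 1, 0, 0)
    elif low[1] > 0:
        high = (0, low[1] + 1, 0)
    else:
        high = (0, 0, low[2] + 1)
    return low <= version < high


if __name__ == "__main__":
    # The three candidates named in the error message.
    for candidate in ["15.0.0", "14.0.0", "13.0.0"]:
        print(candidate, caret_matches("^18.0.0", candidate))
```

Every candidate in the error is below 18.0.0, which is why the requirement cannot be met from the versions Cargo found.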