datafusion-contrib / datafusion-python

Python binding for DataFusion
https://arrow.apache.org/datafusion/python/index.html
Apache License 2.0

Should we move this library back to Apache Arrow governance? #58

Open andygrove opened 2 years ago

andygrove commented 2 years ago

datafusion-python was donated to the Apache Arrow project in April 2021 and was added to the arrow-datafusion repository [1].

datafusion-python was removed from the repository in January 2022 [2] and added to a new repository in the datafusion-contrib organization.

I would like to propose bringing the Python bindings back under Apache governance. This will require going through the IP clearance process again, unfortunately.

I propose that we move the code to its own repository, perhaps apache/arrow-datafusion-python?

Let's use this issue and the mailing list to discuss this.

[1] https://github.com/apache/arrow-datafusion/pull/69

[2] https://github.com/apache/arrow-datafusion/pull/1518

alamb commented 2 years ago

I am curious about the rationale for bringing it back under Apache governance (I am not opposed, but I wonder what the benefits are). Is it related to finding more assistance to maintain the code? Or is it cumbersome to keep up with non-trivial changes in DataFusion?

andygrove commented 2 years ago

One reason is that I would like to help maintain the package. The proposal is not to bring it back into arrow-datafusion but into its own repo, arrow-datafusion-python. I don't foresee the move having any impact on the DataFusion maintainers' workload.

alamb commented 2 years ago

> One reason is that I would like to help maintain the package.

I see -- as I recall, your employment situation requires ASF-governed projects, correct?

BTW the move makes sense to me

matthewmturner commented 2 years ago

@andygrove The only concern I have with this is that if, in the future, you were unable to contribute as much to the Python bindings, the maintenance burden would fall to other Arrow contributors, who may have less motivation to maintain them, which could slow down progress.

andygrove commented 2 years ago

> I see -- as I recall, your employment situation requires ASF-governed projects, correct?

It's not quite that simple. There is a process to go through before committing to an open source project, and that has already been done for ASF projects, so the path is much simpler. In other cases, where the governance of the project is less clear, I have had to add company copyrights to the code being submitted.

andygrove commented 2 years ago

> @andygrove The only concern I have with this is that if, in the future, you were unable to contribute as much to the Python bindings, the maintenance burden would fall to other Arrow contributors, who may have less motivation to maintain them, which could slow down progress.

Yes, that is a good point. I suppose there is always the option to move it back out again but I would prefer to see us work towards having more committers on the project.