alamb closed this issue 4 months ago
See #10281 for example
Also posted to the mailing list: https://lists.apache.org/thread/199ymolos20sr9vvz5ctv6j2nnrgrbo2
Submitted the following report:
## Description:
The mission of Apache DataFusion is the creation and maintenance of software
related to an extensible query engine.
## Project Status:
Current project status: New + Ongoing (high activity)
Issues for the board: None
## Membership Data:
Apache DataFusion was founded 2024-04-16 (2 months ago)
There are currently 32 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.
Community changes, past quarter:
- Ruihang Xia was added to the PMC on 2024-06-13
- Mehmet Ozan Kabak was added to the PMC on 2024-06-13
- Mustafa Akur was added to the PMC on 2024-05-09
- Oleks V. was added to the PMC on 2024-05-09
## Project Activity:
The project continues to be quite active with many PRs and issues opened and
closed per day.
We have mostly completed the tasks related to becoming a new top-level project,
including an ASF press release[0] announcing the new top-level project and
documenting more thoroughly the process of inviting new committers and PMC
members[1].
We also began discussing bringing the SQL parser (sqlparser-rs) under the
DataFusion ASF governance process[2].
There are also several regional meetups planned: in San Francisco in June and
in China in July.
[0]: https://news.apache.org/foundation/entry/apache-software-foundation-announces-new-top-level-project-apache-datafusion
[1]: https://github.com/apache/datafusion/pull/10778
[2]: https://github.com/sqlparser-rs/sqlparser-rs/issues/1294
### DataFusion core
https://github.com/apache/datafusion
We made our first successful release as a new project, version 38.0.0.
In addition to the work related to moving to a top-level project, the
community continues to work on making logical planning faster, making function
packages (i.e. UDFs) modular and easier to mix and match, “de-parsing” logical
plan expressions back to SQL, and improving type coercion.
Recently there has been renewed interest in reading Parquet files and creating
secondary indexes.
### Sub project: DataFusion Python
https://github.com/apache/datafusion-python
The DataFusion Python subproject has become more active since the last board
report, with work from several contributors. Version 37 was released,
and version 38 is in the process of being released.
### Sub project: DataFusion Comet
https://github.com/apache/datafusion-comet
The Comet subproject has held face-to-face sync meetings, which are recorded[3].
[3]: https://lists.apache.org/thread/9kqxkpwxf4oxonfboyfh8j6ko7r3fb3z
The Comet subproject is very active and is receiving significant contributions
from new contributors. There is some initial documentation published at
https://datafusion.apache.org/comet/.
### Sub project: DataFusion Ballista
https://github.com/apache/datafusion-ballista
https://github.com/apache/datafusion-ballista-python
The Ballista subproject is not currently actively maintained.
### Recent Releases
* PYTHON-38.0.1 was released on 2024-05-30.
* PYTHON-37.1.0 was released on 2024-05-13.
* 38.0.0 was released on 2024-05-10.
## Community Health:
We have added several new committers and PMC members (see above) in the last
month, and we expect to continue to do so regularly. While it would always be
nice to have more bandwidth to devote to PMC activities, we are currently
doing well.
While most communication still happens through GitHub, the mailing lists are
now fully active, as reflected in their metrics:
* dev@datafusion.apache.org had a big increase in traffic in the past quarter
(71 emails compared to 0)
* github@datafusion.apache.org had a big increase in traffic in the past
quarter (7685 emails compared to 0)
Is your feature request related to a problem or challenge?
Per https://whimsy.apache.org/roster/committee/datafusion the DataFusion ASF board report schedule is March, June, September, December.
Describe the solution you'd like
I would like to draft a board report for the ASF board meeting, ideally with community help.
The meetings are typically in the second or third week of the month.
Describe alternatives you've considered
I plan to do this in the same style that worked well in Arrow (see an example from @andygrove here: https://lists.apache.org/thread/7w4mgy98qomc6drvj2fo81gvhq6p0boc) -- make a Google Doc (or issue) that people can add relevant content to, and then the chair (me for the time being) submits it to the board.