Open martindurant opened 2 years ago
Thanks for posting this @martindurant !
I like the idea of starting with a project roadmap, and then having that flow down into organized projects as boards on GH. Giving better visibility into the direction of dask, and the timing of new features, would (I think) improve adoption. It also wouldn't hurt to provide better clarity on how new users (or organizations) can make proposals for changes and/or new capabilities -- this would likely fall under the heading of Improvement Proposals.
I think the Spark project improvement proposals (SPIP) process works well. It's a great way to give a contributor a signal if a big individual undertaking is worth the effort.
I'd like to make a detailed dask-iceberg proposal and get feedback from the Dask community before doing a ton of development work. Perhaps I can make a "Dask project improvement proposal" via an issue in dask/dask to trial that process? Thoughts @martindurant?
Note there was an earlier attempt at a repo for large project design docs (https://github.com/dask/design-docs), which may overlap a bit with this.
One action item from the June 2022 Dask Developers meeting was to form a subgroup tasked with identifying proposals to improve product/project management for the Dask organization, where appropriate. This group is comprised of leaders of technical teams within organizations committed to the development of Dask, who can also commit resources to larger efforts. This team met on June 7, 2022 to discuss areas of interest, align on intended outcomes, and identify next steps.
Attendees:
Some current areas of interest that were discussed included:
I/O
dask-awkward
dask/dask
dask-kubernetes
including heterogeneous cluster environments
HighLevelGraphs
dask/distributed
There are instances where multiple organizations have a shared interest in improving Dask's performance. These shared interests represent potential points for collaboration, which could be improved by more rigorous project/product management practices. In cases where efforts are small, or resources for the work reside within a single organization, the added complexity likely does not provide incremental value.
The group discussed ways to leverage learnings from the Python community, including product roadmaps like those of Xarray and NumPy. The consensus was that these roadmaps are formatted at a suitable level of abstraction, and creating such a roadmap for Dask is something that should be evaluated.
There is agreement that the added rigor of design documents and project boards would improve visibility into technical deliverables, timelines, and commitments, but acknowledgement that the rigor comes with flexibility tradeoffs. The group discussed a range of options, from simple design documents with a required technical review by stakeholders that could be archived as part of larger project boards, to formal Enhancement Proposals similar to NEPs.
There is consensus that projects involving resources spanning multiple organizations would benefit from a shared project board, assuming a resource has responsibility to maintain the board. Assuming it were properly maintained, there is an interest in testing this approach using either dask-awkward or HighLevelGraphs.
There are a number of concurrent discussions underway on this topic. The initial proposal would be to:
The intent would be to enable an ongoing conversation about priorities for all teams engaged in Dask development.
I did create one project, as promised: https://github.com/orgs/dask/projects/3. It has not been noticed so far.
Actually did look at that project (after it was posted in the Plasma PR) 🙂
@hayesgb I think it would also help to have a design doc (if work spans multiple repos) or high-level issue (if work is contained to a single repo) to capture the work.
There have been some conversations going on around how to improve communication among the dev team, so that we can prioritize work and actually plan for and execute larger efforts. The idea is to reduce friction, and ideally we get consensus on a few action items to move dask forward.
Some suggestions of things that might help. This list will be expanded as thoughts come in.
Some examples for discussion follow. How might the ideas above have helped or yet help these situations? These all have long conversations in issues that could be linked.
@dask/maintenance, all are encouraged to have a say. Time is also set aside at the end of this week's community call.