dask / governance

The governance process and model for Dask
Creative Commons Zero v1.0 Universal

Resolving technical conflicts? #5

Closed jcrist closed 5 years ago

jcrist commented 5 years ago

How do we resolve technical conflicts? Ideally the governance model should never come into play, but we should write a model that prepares for large differences of opinion in technical direction for the project. How do we resolve these conflicts when developers disagree long enough that action needs to be taken?

pitrou commented 5 years ago

You might be interested in the survey that was done here in preparation for the Python governance vote: https://www.python.org/dev/peps/pep-8002/

In short, a frequent pattern is for projects to evolve away from a BDFL model to a more distributed model, which scales better and is more resilient. The latter often includes a supervision / overseeing body with varying decisional power depending on the project (often called "steering committee" though I like Astropy's "CoCo" aka coordination committee).

mrocklin commented 5 years ago

I think that today our de facto model (non-ideal) is that we chat for a while and usually come to a good compromise; otherwise I generally make unilateral decisions, except when something is particularly important, at which point I try to build consensus (such as in developing the membership policy).

I'd like to see us move to a more balanced model. I also think that this might make things sluggish if we don't require some additional effort from the steering committee.

My experience has been that getting responses from the current set of owners takes about a week or two, even for simple issues like adding new members. I would be very happy to see operational decision-making diffuse away from myself, but I do think that this will require a greater engagement from others.

Also, I'd like to see whoever is on that steering committee take on a greater role of engaging with the community, generally through handling issues as they arise. This is, I think, the best way today to get a good sense of priorities for the project's userbase, which I think should be the final constituency.

jcrist commented 5 years ago

I think that day-to-day things will continue to operate informally, even in the face of disagreement. I'm mostly looking for a process here in case some decision becomes heated and the community is split. I think a steering committee approach would be good here, preferably with a rotating set of members over time.

Also, I'd like to see whoever is on that steering committee take on a greater role of engaging with the community, generally through handling issues as they arise. This is, I think, the best way today to get a good sense of priorities for the project's userbase, which I think should be the final constituency.

Agreed.

ogrisel commented 5 years ago

Rotation is nice but requires a significant number of people who are involved in the project over the long term. So I think rotation should not be mandatory if no new volunteer candidates show up.

pitrou commented 5 years ago

Judging by the contribution dynamics in Dask, it seems starting with a 3- to 5-person steering committee should be doable.

I agree that imposing rotation requirements is risky in such a small community.

guillaumeeb commented 5 years ago

👍 on a leadership team for difficult decisions or in case of a community split.

Rotation should be encouraged (and forced if some member of the leadership team does not have enough time to devote to the project), but not mandatory.

mrocklin commented 5 years ago

I'm against BDFL in principle, but I also want to add the constraint that people who have central authority also participate consistently in central maintenance. I think that constraint, if adopted, reduces our candidate pool considerably today, and so we should continue to embrace a more anarchic model (like the SciPy/PyData ecosystem).

To be clear, many people do a ton of work on Dask, but they usually focus on a subproject (like dask.array or dask-kubernetes) or their involvement is intermittent and they disappear for months at a time.

Today I think that I review the vast majority of PRs and handle the vast majority of issues on the core dask projects. I'll estimate this at 75% ± 10% over the last few months. I think that because of this I tend to make most technical decisions, and act as BDFL today.

I would very much like to move away from this situation, but I think that we need to move away from both concentrated-decision-making and concentrated-maintenance at the same time. We've been doing this already using a federated/anarchy model. As people maintain sub-projects they gain full control over those projects and I step back as quickly as I can. For example...

So on the periphery of Dask there are many people who are both dedicated to the project and, I think, have decent control over parts of it. They make controversial technical decisions already as individuals or as groups, without interference from a central authority. These decisions include difficult topics like dropping Python 2, code styling, CI, and so on. Cross-project integration (like cluster deployment) seems to happen peer-to-peer.

So my current stance on moving away from BDFL is a combination of "yes please" and "central authority requires broad maintenance, which is hard to motivate". In the absence of multiple central maintainers I think that we want either a large diffuse group (majority voting), or a BDFL. I also think that we probably want to reduce the power of any central authority, and act more as an ecosystem like SciPy/PyData, with many small autonomous groups.

I also think that increasing the number of central maintainers would be useful. I suspect that the best chance of this short term would be for Anaconda to fill my previous maintenance spot there, probably with someone like Jim, who has touched a broad swath of the project in the past.

mrocklin commented 5 years ago

I'm totally open to push-back on my last comment by the way. That's the way I feel currently, but there are totally valid points in other directions, and I would be fine with them winning out.

guillaumeeb commented 5 years ago

It looks like a pragmatic point of view. If there are not enough people with a broad enough view of the Dask ecosystem, it's hard to set up.

But as @jcrist said, in my opinion we are just looking for a Steering committee / Leadership team for

a process here in case some decision becomes heated and the community is split

so in my opinion it shouldn't require broad maintenance necessarily.

pzwang commented 5 years ago

My personal preference is minimalism. I'm a big fan of network intelligence and swarm behavior. At this point, the Dask project/ecosystem is a community of participants that are aligned on shared technical goals. If we can maintain civil order and build a high-trust collaboration, then those values have some inertia to them, and we don't need to pre-optimize and build out a bunch of heavyweight governance.

However, a responsibility goes along with that, which is that we must all feel empowered and comfortable to raise a discussion when we feel things are not working out. To help ensure that remains the case, we need to set a community norm of transparency and interacting in good faith. I think that can go a long way, and as far as I can tell, that is the current standard of behavior within the Dask project.

So, my inclination is to punt on this question, and pick it up again if and when we come to an actual point of contention.

[FWIW, one struggle that some OSS communities have is that they are philosophically antithetical towards the concept of companies employing people who work on projects. Apache, for instance, insists that "Committers are expected to participate in Apache projects as individuals, and not as representatives of any employers.". While this may have been useful for those projects and communities at one point, I think this is short-sighted and unrealistic. The onus does rest with devs to know when to explicitly state whether they are wearing their "personal hat" or "corporate hat", but this is part of "engaging in good faith".]

mrocklin commented 5 years ago

So, my inclination is to punt on this question, and pick it up again if and when we come to an actual point of contention.

We do need some sort of governance structure for NumFOCUS sponsorship. I also think that it's probably better to have something on the books before an issue arises.

which is that we must all feel empowered and comfortable to raise a discussion when we feel things are not working out.

I entirely agree that anyone should feel empowered and comfortable to raise a discussion. I agree that this is important and that we can always improve here. I'm personally interested in how we make decisions during/after that discussion.

mrocklin commented 5 years ago

Some Dask folks met in Austin earlier this week. Here is a summary of the discussion.

I'm treating this issue now as a general conversation around governance, superseding #2. There is more conversation here, and it's pretty relevant.

For the NumFOCUS signatories, whose job it is mostly to distribute funds that might arise, we really just care about trustworthiness, and somewhat about diversity of domain. Some subset of the current owners is maybe a good idea.

For general project leadership we might consider three separate groups:

  1. Technical leadership group
  2. A group of special users that represent different domains (for example pangeo, or consulting companies)
  3. A group of industry partners who meaningfully support the project, either by employing developers to work on the project, or by providing funds directly (for example Anaconda, NVIDIA)

In the end, project decisions would come down to the first technical project leadership group, but the user and industry special interest groups would be consulted and their opinions heard before making any large decisions. The goal here is to encourage participation from a larger group of people, while keeping decision making within a group that is both trusted by the community, and willing to make difficult decisions.

Also, we discussed that this group will likely make many non-technical decisions. Some examples follow:

  1. A corporation arrives with money and asks for branding. For example, maybe Azure wants to fund a high-powered binder deployment for us, but wants a "Deployment powered by Azure" on the bottom of the page. Does Dask accept this or not?
  2. An "official" Dask talk gets accepted to a conference, but a social issue arises with that conference. Does the Dask organization continue attending that conference, or do we pull the talk?
  3. Someone contributes code to Dask after being paid by a company, but they weren't transparent about their employment or the reason for the contribution.
  4. ...

guillaumeeb commented 5 years ago

For general project leadership we might consider three separate groups

How do we form and manage these groups?

mrocklin commented 5 years ago

How do we form and manage these groups?

Carefully :)

I might interpret this question in two ways

How do we as a group assume authority to make these decisions?

When we switched to the set of eight owners in https://github.com/dask/dask/issues/3223 we got consensus from all previous owners and most people who had been active in the project. Currently in this process I think that we should get consensus from that group, others like yourself that maintain the various dask-foo packages, and representatives of corporations that are considering employing maintainers. Consensus among that group may not be possible; if so, I'm inclined to weight community maintainers over others.

So far consensus seems to be reasonable, but we'll see.

Logistically, how do we choose these groups?

For the leadership group I think that we should probably have another group meeting among maintainers over video to discuss things.

For the corporate partner group I suggest that we require that they employ about 1 full time employee who maintains OSS Dask projects. My guess is that Anaconda and NVIDIA are both close to this level, but probably not quite there when it comes to maintenance work. (maybe Anaconda is? not sure)

For the user group representatives I think that we want people who actively provide support to downstream user groups. So someone like @rabernat or @jhamman, who doesn't do a ton of work on Dask itself but ends up filtering up Xarray or Pangeo user issues, is useful. I think that the criteria here would be a certain amount of activity in other community discussion forums.

Membership in both of the latter groups would probably be decided by the leadership group.

pzwang commented 5 years ago

Thanks for writing this up, Matt. I think this accurately captures what we talked about.

mrocklin commented 5 years ago

I'm inclined to write up a governance document to make some of these discussions more concrete. I plan to start with something loosely inspired by the Pandas model (core team + BDFL for exception handling) and add in the groups above.

jakirkham commented 5 years ago

I’d be interested in hearing what caused you to shift your viewpoint from the one you articulated earlier in this thread (namely being against a BDFL model), assuming I’ve read that correctly and haven’t missed something. It seemed there were also a few other people in this thread who preferred a core group that votes. This isn’t to say there may not be a good reason to use a BDFL model, but it would be good to articulate it at least.

mrocklin commented 5 years ago

Good question. Mostly it's what was expressed here: https://github.com/dask/governance/issues/5#issuecomment-450676767

And in particular in this comment:

I'm against BDFL in principle, but I also want to add the constraint that people who have central authority also participate consistently in central maintenance.

In February when I supported a core team, folks were supportive of taking on more maintenance. However I'm still handling the vast majority of issues/PRs, and when I go dark for a few days it's very common that issues/PRs go entirely unanswered.

So while everyone is awesome about maintaining various Dask subprojects from time to time, I don't get the sense that any other individual or institution feels full responsibility to make sure that the lights stay on. Until that happens I'd like for there to be a core team that makes most decisions, but where I have override privileges (which I suspect I will never use). If maintenance behaviors change I'd be all in favor of changing governance structures.

One of my main goals for this year is to construct a situation where I can step away for periods of time and still keep the lights on. So I'm actively invested in changing the situation above.

mrocklin commented 5 years ago

Draft added in #11

I'm totally happy to modify that heavily. I just wanted to have something concrete to work on.

I based this off of the Jupyter document (which is fairly similar to Pandas).

mrocklin commented 5 years ago

This was added in the governance document. Closing.