dask / community

For general discussion and community planning. Discussion issues welcome.

Criteria for an institution to be considered a supporter of Dask #44

Open jcrist opened 4 years ago

jcrist commented 4 years ago

This is related to #32, but smaller in scope.

We currently have a group of contributors who meet weekly to discuss maintenance of Dask. To be a part of this group, we ask that you have 8+ hours a week to spend working on general maintenance and "core" tasks - things that are generally useful, rather than your own specific use cases. This team is currently composed of people from several different companies, with this "dask core work" being part of their job description (i.e. their employer is on board with this arrangement, and is effectively funding dask development).

A list of companies supporting dask is displayed at the bottom of our main page (https://dask.org/). This list includes some institutions that have funded Dask (currently, as well as previously), as well as a few (but not all) of the companies currently employing the core Dask team.

A question then arises - how does an institution get added to this list, specifically with regards to funding a developer to help handle core maintenance tasks?

I don't believe we've codified this in our governance anywhere. Colloquially, we require:

Assuming we like the above requirements, the missing components are:

This is a tricky question. We want to reward good actors early, to encourage good behavior. But we don't want a bad actor to be able to game the system - funding a dev for just long enough to get status and then stop. We also want a smooth process for removing institutions if the requirements are broken, while also acknowledging that gaps in support will likely occur because people take vacations/have off weeks.

jcrist commented 4 years ago

I think we should err on the side of giving recognition early - we want to encourage good behavior, and I believe there are more good actors than bad actors.

Throwing out a proposal just to get discussion going:


Note that this is a bit self-serving. I've been working on Dask for a few years now, but have recently changed employers (from Anaconda to Prefect). We'd like to be acknowledged for helping fund development, since Dask maintenance will still be part of my job description.

jcrist commented 4 years ago

cc-ing all those currently participating in the Tuesday meetings (apologies if I missed anyone):

@mrocklin, @TomAugspurger, @martindurant, @jjhelmus, @jrbourbeau, @quasiben, @jsignell, @gforsyth

gforsyth commented 4 years ago
  • A vote by members of the core dev team (not actually sure if this is a good idea, but it's an option)

This could be handled at the monthly meetings in the same way as new additions to the dask org on github, more of an "any objections" voice vote

jsignell commented 4 years ago

I think that proposal sounds reasonable. I was wondering if there could be a way to acknowledge historic contributions without making this too much of a burden. Would it make sense to list companies that contributed by year, or is that over-doing it? The idea would be that the change of the year would trigger an evaluation of which companies are still contributing and whether any have lapsed.

Alternatively, we could add a standing agenda item to the monthly meetings where we check that the existing supporting institutions haven't lapsed and see if there are any new ones to add. Maybe the first meeting that an institution is mentioned, it can be put in a staging area, and then if it is in good standing the next month it can be added to the list.

mrocklin commented 4 years ago

We'd like to be acknowledged for helping fund development, since Dask maintenance will still be part of my job description.

I love that companies want this recognition now :)

Some fail cases to think about:

  1. Company donates the time of someone who isn't actually that productive, or a sequence of junior folks in order to have us train them
  2. Individual shows up to the meeting regularly, but doesn't quite get around to doing work. This is actually pretty normal today. Many of us say "yeah, I didn't get around to anything this week, sorry". This is totally ok, but maybe it becomes less ok if it's habitual and is something that we're learning that companies value.

jcrist commented 4 years ago

Commenting again to re-ping all those cc'd above. We'll be having our monthly meeting this Thursday. It would be good to get thoughts down here beforehand (or come prepared with some thoughts) so we can discuss (and hopefully resolve) this then.

mrocklin commented 4 years ago

Following on to what I said above, I think that we might ask a company to submit a brief form detailing the contributions that their engineers have provided that are beyond their own objectives. This would help defend against the practice of sending in junior devs, and create some accountability on their end to make sure that their people do solid work.

For new people who have come on maybe we start tracking contributions a couple months after they start, just to account for on-ramp cost.

Ideally this form isn't onerous, something like one page detailing problems resolved, a demonstration that they're beyond the company's specific objectives, and a demonstrated commitment to keep contributing. This would then be reviewed by org owners.

jcrist commented 4 years ago

For new people who have come on maybe we start tracking contributions a couple months after they start, just to account for on-ramp cost.

I think this is too gate-keepery (in terms of waiting months rather than weeks). We want to ensure that devs are being effective, but we also want to reward good behavior. I'd rather err on the side of giving too much credit than giving too little. Giving credit costs us nothing, and can help incentivize and reward good contributors early (and make them feel on the same "level" as the rest of the community).

martindurant commented 4 years ago

I believe it goes without saying, but it may be worth writing down that the employee(s) in question must already have qualified for at least dask org membership by the normal process. Is there a document for how to get invited to core dev meetings as an individual?

mrocklin commented 4 years ago

Putting on my corporation hat, a couple months feels like a very short time. If a company is turned off by this then I think they're probably not thinking about maintenance activities at the right time scale.

Let's say that Amazon wants their logo on the Dask page. They send us a junior dev for a couple of months, and senior devs spend time helping that dev work through some problems. A few weeks later Amazon says "look, our dev did things, can we have a logo presence now?"

My answer is "no, while it's true that your dev did things, they really only did those things because we were helping them. You're welcome for bringing your dev up to speed. Now that that dev has a bit more experience, let's see how effective they are at solving problems without hand-holding for a few months."

In my experience corporations think about assigning devs to problems on multi-month timescales.

jcrist commented 4 years ago

I think if a dev:

then we should be happy to note that company as contributing to Dask development quickly.

Perhaps commit rights and participation in the weekly dev cycle are the proper gates here? Commit rights indicate some level of community/org trust, and we already have a process of asking the existing community beforehand before giving them out.

mrocklin commented 4 years ago

Yeah, I have no problem with waiving the lead-in time for long-term engineers. I'm mostly concerned with making sure that we're valuing contributions, rather than dev-hours.

quasiben commented 4 years ago

Some of what is laid out here seems similar to Jupyter's Institutional Partners. To be an Institutional Partner:

Institutional Partners are organizations that support the project by employing Jupyter Steering Council members.

And Jupyter has also laid out what the Steering Council is and how to add and remove members. One thing I appreciated in how to add members were the following lines:

... a comprehensive view of their contributions. This will include but is not limited to code, code review, infrastructure work, mailing list and chat participation, community help/building, education and outreach, design work, etc.

Some of these ideas are expressed in Dask's membership doc though not to the point of what it means to be an institutional sponsor of Dask.

One last thing to note about Jupyter's ecosystem is that they have three separate sections for logos:

jcrist commented 4 years ago

This was discussed a bit during the monthly meeting this morning. Summary:

mrocklin commented 4 years ago

feels a bit weird to be reporting on myself

I understand that this might feel weird. I do think it's totally normal though. People report what they do all the time. People often do this to their managers, for example. This is often the purpose of a daily report. Providing some level of accountability shifts people's thinking a bit towards achievement rather than participation.

A report should be easy to compile. You can use the GitHub web UI to collect a set of PRs that you resolved as part of maintenance activity, and a set of issues on which you were the primary reviewer. That, together with a bit of prose documenting the rough areas of specialization, should take no more than an hour or so to put together.
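As a rough illustration of how such a report could be gathered, here is a minimal sketch that builds GitHub search queries for one contributor over a date window. The function names, the `example-dev` username, and the choice of qualifiers are assumptions made for illustration, not an agreed Dask process; what counts as "maintenance activity" still needs a human reading of the results.

```python
from datetime import date
from urllib.parse import urlencode


def maintenance_queries(user: str, org: str, start: date, end: date) -> dict:
    """Build GitHub search queries for one contributor's activity.

    The qualifiers (org:, is:, author:, reviewed-by:, commenter:, merged:,
    updated:) are standard GitHub search syntax; which of them best
    represents "maintenance" work is an assumption of this sketch.
    """
    window = f"{start.isoformat()}..{end.isoformat()}"
    return {
        # PRs the contributor authored that were merged in the window
        "merged_prs": f"org:{org} is:pr is:merged author:{user} merged:{window}",
        # PRs by other people that the contributor reviewed
        "reviewed_prs": f"org:{org} is:pr reviewed-by:{user} -author:{user} updated:{window}",
        # Issues the contributor commented on (a rough proxy for triage)
        "issues_commented": f"org:{org} is:issue commenter:{user} updated:{window}",
    }


def pulls_url(query: str) -> str:
    """Turn a query into a link for the GitHub web UI pull request search."""
    return "https://github.com/pulls?" + urlencode({"q": query})


if __name__ == "__main__":
    # "example-dev" is a hypothetical username.
    queries = maintenance_queries("example-dev", "dask", date(2020, 1, 1), date(2020, 3, 31))
    for name, query in queries.items():
        print(f"{name}: {query}")
    print(pulls_url(queries["merged_prs"]))
```

Pasting the printed queries into the GitHub search box gives lists that can be copied into a report alongside a short prose summary.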

A few others (at least Martin and myself) noted that the number of cases where this has come up are few currently, can probably be handled without much process - a simple vote by existing "core" members may be sufficient.

I agree that this is the right technical mechanism. However I still think that we'll need to figure out how we as a group make this decision.

mrocklin commented 4 years ago

Also, reporting on that meeting: @mmccarty mentioned at the end that, from a corporate perspective, it would be good to have a set of expectations so that they know what they're signing on for.

jsignell commented 4 years ago

I like thinking about the balance between participation and achievement (thanks for that @mrocklin) and it seems clear that a good maintainer needs both. An example of imbalanced behavior is someone who shows up to meetings but doesn't get much done or someone who doesn't come to any meetings or engage on issues, then opens a massive PR.

The challenge (maybe) is that it is easier to see participation and harder to see achievement. I like the idea of forming a set of expectations that could both show companies what they are signing up for, and also be used as a template for reporting.

For instance, it could be:

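All of these (participation):

  • attend Tuesday meetings
  • czar once a month
  • contribute at least 8 hours per week of developer time
  • triage issues

Several of these (achievement):

  • review PRs
  • contribute code
  • write docs ...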

mrocklin commented 4 years ago

Personally I see triaging issues as achievement. Personally I don't actually care about participation at all.

A good example here is sharks, who was a contributor several years ago. I think that due to language barriers he didn't participate socially, but was awesome at fixing every dask dataframe that he could find. I would absolutely give him and his employer any desired acknowledgement, even though he didn't track hours or attend meetings.

On Fri, May 8, 2020 at 7:41 AM Julia Signell notifications@github.com wrote:

I like thinking about the balance between participation and achievement (thanks for that @mrocklin https://github.com/mrocklin) and it seems clear that a good maintainer needs both. An example of imbalanced behavior is someone who shows up to meetings but doesn't get much done or someone who doesn't come to any meetings or engage on issues, then opens a massive PR.

The challenge (maybe) is that it is easier to see participation and harder to see achievement. I like the idea of forming a set of expectations that could both show companies what they are signing up for, and also be used as a template for reporting.

For instance, it could be:

All of these (participation):

  • attend Tuesday meetings
  • czar once a month
  • contribute at least 8 hours per week of developer time
  • triage issues

Several of these (achievement):

  • review PRs
  • contribute code
  • write docs ...


mrocklin commented 4 years ago

Sorry, autocomplete. sharks -> sinhrks
