We want to be the front door for people who are having problems with Dask. A major way we will do this is by making people feel like their problems are being heard and addressed.
These are some notes on how to respond in various fora. I intend this to be a living document, so discussions about and updates to this issue are encouraged.
GitHub issues
There are a few buckets that GitHub issues fall into, and the way we respond should be different for each:
Issues opened by Dask maintainers about specific bug fixes or feature development - for the most part no action is required from the Dask community team.
Bug reports (or seeming bug reports) - Even if you don't know the answer to a bug report, there are a few things that are extremely valuable to do:
Say hi, acknowledge people's issue.
If they have not provided a minimal reproducible example, or the version numbers for the relevant packages, encourage them to do so (cf. this discussion).
If they have provided an example, try to reproduce it yourself. If more than one person can reproduce the issue, that's good to know!
If you have an idea of how to fix the issue, go for it! If you don't have an idea, you might want to tag in someone with expertise in a particular region of the code base (see "Who should I ask for help" for more).
Usage questions - we want to encourage people to open usage questions in the Dask Discourse. If somebody opens a question that strikes you as a usage question rather than a bug report or feature request, consider redirecting them with something like the following copy:
Hi @user, thanks for opening this issue! It seems like this is more of a usage question than a bug report or feature request. We encourage people with such questions to ask them at the Dask Discourse. Would you mind opening a discussion topic about this over there? If you do so, feel free to close this issue and we will continue the conversation on Discourse.
After 1-2 days, you can close the issue and post a link to any follow-up discussion.
Discourse
This has the widest range of topics that are in scope. Indeed, a large part of the rationale for starting the Discourse is to capture discussions that were not appropriate for GitHub issues or Stack Overflow (see this issue for more discussion).
We should be welcoming and try to solve people's problems, have useful discussions, etc. It's also a good place to make announcements, job postings, etc. See the Orientation topic for more information.
Stack Overflow
Stack Overflow has some particular rules for asking and answering questions. We should try our best to conform to them.
Sometimes a Dask-related question asked on Stack Overflow is closed for being a poor question (e.g., it may be too opinion-based). We don't really need to be in the business of trying to get such questions closed, but if that happens, we may want to redirect the asker to Discourse. A comment like the following could be helpful:
Hi! Even though this question may not be appropriate for Stack Overflow, it could be a good topic for discussion in the Dask Discourse. I encourage you to re-ask your question there.
Twitter
In addition to being a good platform for marketing/evangelism efforts, users will also post questions on Twitter. Pavithra Eswaramoorthy (@pavithraes) currently fields these questions.
Who should I ask for help?
Some useful names/GitHub handles. This is not exhaustive, and in many cases other people will have useful insights as well.
General Dask Maintenance - James Bourbeau (@jrbourbeau), Julia Signell (@jsignell)
High Level Graphs - Rick Zamora (@rjzamora)
Serialization - Mads Kristensen (@madsbk)
Comms - Jim Crist-Harif (@jcrist)
Distributed Scheduler - Florian Jetter (@fjetter)
DataFrame IO - Rick Zamora (@rjzamora)
Dask Ordering - Erik Welch (@eriknw)
Zarr - Martin Durant (@martindurant)
Fastparquet - Martin Durant (@martindurant)
PyArrow - Joris Van den Bossche (@jorisvandenbossche)
Dask Dashboard - Naty Clementi (@ncclementi)
Deployment - Jacob Tomlinson (@jacobtomlinson), Guillaume Eynard-Bontemps (@guillaumeeb)
Documentation - Jacob Tomlinson (@jacobtomlinson), Julia Signell (@jsignell)
JupyterLab extension - Ian Rose (@ian-r-rose)
Xarray - Deepak Cherian (@dcherian)
Prioritization
The firehose of issues that comes with helping to maintain a popular OSS project is difficult to handle! My current thinking on how to prioritize issues/topics/questions:
GitHub - highest priority, try to make sure people get some kind of response within 24 hours.
Discourse - second highest priority, especially as we are trying to establish it as a friendly and useful place to have discussion. Try to make sure people get some kind of response within a few days.
Stack Overflow - lowest priority, there are some other active people there who answer questions, and it should be okay to let things sit for a week or so.
The above is subject to change/discussion.
Refining people's questions
Often people's initial attempt at asking a question is not very helpful. They may be new to Dask, new to PyData, or new to programming in general. They might be frustrated, or they might not speak English as a first language. We should try to encourage people (in a friendly way) to make their questions more actionable. In general, this means a few things:
Provide a minimal reproducible example for the issue they are describing. Ideally it should include all the relevant imports, omit extraneous details, and be easy to copy and paste to verify the issue. There is an art to constructing these, and many newcomers have not learned it. For more on this, see this blog post.
If they are reporting a bug, provide version numbers for the relevant packages (often this means dask/distributed, but it might also mean python, xarray, bokeh, etc.).
If there is an error, provide the traceback for the error. It's nice to include this in a GitHub-flavored Markdown code block with python-traceback syntax highlighting. If the traceback is long, it is nice to wrap it in a <details></details> HTML tag.
If relevant, provide a screenshot (largely for dashboarding and graph visualization).
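When asking someone for version numbers and a traceback, it can help to hand them something they can paste and run. The following is a minimal sketch using only the Python standard library. The helper names collect_versions and format_traceback, the default package list, and the "Traceback" summary label are illustrative assumptions, not part of Dask or any existing tool.

```python
import sys
import traceback
from importlib import metadata


def collect_versions(packages=("dask", "distributed", "xarray", "bokeh")):
    """Return a dict mapping package names to installed version strings.

    The default package list is only an example; pass whatever is
    relevant to the report.
    """
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages explicitly instead of failing.
            versions[name] = "not installed"
    return versions


def format_traceback(exc, fence_lang="python-traceback"):
    """Wrap an exception's traceback in a collapsible Markdown block."""
    tb_text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    ).rstrip()
    fence = "`" * 3  # built at runtime to avoid nesting literal fences
    return (
        "<details>\n<summary>Traceback</summary>\n\n"
        f"{fence}{fence_lang}\n{tb_text}\n{fence}\n\n</details>"
    )


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
    try:
        1 / 0
    except ZeroDivisionError as exc:
        print(format_traceback(exc))
```

The printed output can be pasted directly into a GitHub issue or Discourse topic. Packages that are not installed are reported as such rather than raising an error, which keeps the snippet usable in partial environments.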