mrocklin opened this issue 6 years ago (status: Open)
I have no idea how we'd work dask meaningfully into a packaging tutorial, but if you have ideas on that, I'd be open to discussion.
I agree that this may not make sense for many of the projects here, including packaging.
If you'd like to open a discussion then I recommend raising a separate issue on the issue tracker of this repository (just to avoid spamming the others here).
If you're fairly confident that there isn't much opportunity here then please feel free to ignore this entirely.
Thanks!
cc @jasongrout.
In the JupyterLab tutorial we can show what the dask lab extension does (and that it is an extension).
I'm not sure how much time I'll have to work that into our tutorial. I think the most interesting application might be parallelizing on a single machine, but I'm not sure how well that works with your setup. What's the status of broadcasting data for training a random forest in parallel?
Folks, for the sake of everyone cc'ed on this issue please do not respond to this issue.
Please respond in new issues.
@amueller I've responded to your comment in https://github.com/dask/scipy-tutorials-2018/issues/7
Thank you @jasongrout for responding to @Carreau in #4
@mrocklin - you might even lock this thread, and put a note to that effect in the top-level description.
tl;dr: do you want to collaborate on a small scalability section using Dask within your SciPy tutorial?
Hello everyone,
I'm excited about the SciPy tutorial lineup this year. Dask devs plan to set up some infrastructure to give students in our tutorial access to a modest cluster in the cloud so that they can do some scalable analysis. These distributed systems have been popular in our tutorials in the past. It will likely be similar to the public pangeo deployment currently used within earth sciences (JupyterHub + Dask on Kubernetes on Google). We were also planning to extend this infrastructure to a couple other groups (scikit-learn, pandas, ...) for small dask sections at the end of their tutorial but, after seeing the lineup this year, thought it might be best to reach out to others to see if a broader collaboration might be more interesting.
So, concretely, do you want to collaborate on a small section in your tutorial that shows how to scale your domain and libraries using Dask? This would require the following from you:
Dask and JupyterHub developers have some availability to help tutorial leaders develop these materials and manage infrastructure so that students have access to distributed resources during tutorials.
What Next?
If you're interested in exploring this topic then please raise a new issue within this repository with the title of your tutorial and some thoughts and questions (there are some leading questions within the issue template to help start conversation). I imagine that most people haven't started writing or updating materials yet, so I would expect early conversation to be pretty exploratory. Perhaps we can explore applications together that might be both interesting and accessible to beginning students.
General questions are also welcome here, though please note that many people are cc'ed on this issue, and so raising new issues within this repository might be best to avoid all-to-all e-mail chatter.
Who
To avoid excluding anyone, I've cc'ed the first author listed on every tutorial. I expect that this will make more sense in some cases (introduction to numpy) than in others (introduction to Julia), but I would love to be surprised :) To those for whom this is not a good fit, I sincerely apologize for the unnecessary e-mail. You may wish to unsubscribe from this issue.
Also cc @yuvipanda, @choldgraf, and @willingc from JupyterHub
Thank you all for your time, -matt