mrocklin closed this issue 5 years ago
@jhamman can probably amend or append to this...
In my mind, these are the 5 best things that we get from Dask, here at NCAR:
I probably need some help developing this list, as this list is short and very self-centered. In fact, some of these things could actually already have solutions...which I would be happy to hear about.
mpirun or mpiexec. While Dask is fantastic for interactive jobs with Jupyter Lab/Notebooks, a great deal of our workflows at NCAR are still (and probably will be for the foreseeable future) chained batch-job workflows. To fit into those workflows (without extra modifications), a single batch job must be submitted that launches the Scheduler, Workers, and the Client (and application, obviously) together.... @jhamman, is there anything you would add?
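To make the batch-launching point concrete, here is a minimal sketch of how the processes started by one mpirun/mpiexec call could be split into Scheduler, Client, and Workers by MPI rank. The helper names `role_for_rank` and `plan_job`, and the convention itself (rank 0 as Scheduler, rank 1 as Client, the rest as Workers), are assumptions for illustration only. This is not the dask-mpi API, and a real launcher would start the actual Dask processes instead of returning labels.

```python
# Hypothetical sketch: assigns a role to each MPI rank in a single batch job.
# The rank-to-role convention below is an assumption, not a library API.

def role_for_rank(rank: int, size: int) -> str:
    """Return the role a process with this rank would take in the job."""
    if size < 3:
        # One scheduler, one client, and at least one worker are required.
        raise ValueError("need at least 3 ranks: scheduler, client, and one worker")
    if not 0 <= rank < size:
        raise ValueError(f"rank {rank} is outside the range 0..{size - 1}")
    if rank == 0:
        return "scheduler"
    if rank == 1:
        return "client"
    return "worker"


def plan_job(size: int) -> list[str]:
    """List the roles of all ranks in a job launched with `size` processes."""
    return [role_for_rank(rank, size) for rank in range(size)]


if __name__ == "__main__":
    # A 4-process job: one scheduler, one client, two workers.
    print(plan_job(4))
```

With a layout like this, the same script can be submitted as one batch job and each process decides what to run from its own rank, which is what lets it slot into a chained batch workflow without extra steps.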
Thanks for getting this started @kmpaul !
Oh, and I just was reminded of another "Thing That Could Be Better"...
My contribution, just some quick thoughts:
Also 👍 to Scheduler Profiling from @kmpaul, but 👎 on Batch Launching, which is already covered by Dask IMO.
And I'm also very interested in contributing to the blog post.
There may also be ideas to take from @jhamman's post that you already relayed on dask-blog: https://blog.dask.org/2018/10/08/Dask-Jobqueue.
@guillaumeeb I'd be interested to hear how you solve the Batch Launching problem. And, if you feel it is a solved problem, I'm obviously happy to take it off the list (and learn something myself!).
Maybe I'm misunderstanding your point, but isn't dask-mpi there precisely for batch launching Dask applications? Happy to discuss this on gitter so we don't pollute this issue.
Sure. Let's move to gitter.
@kmpaul just convinced me that dask-mpi is not yet sufficient to implement batch launching correctly, so in the end 👍 to improved batch launching too!
I'm happy to start writing up a skeleton draft of this if people are interested.
Alternatively, it would still be good to get thoughts from others. I wonder if now that AGU is over folks like @rabernat or @jhamman have time. I'm also interested in thoughts from non-earth-scientists like @lesteve and @ogrisel if they have time to list some general thoughts.
ping also @willirath and @apatlpo.
Apologies for the delayed response. @guillaumeeb's list is actually pretty spot on. Very interested in the heterogeneous resource launching and improved batch launching.
(In addition to virtually all of the above)
Easy to teach: When training people in using Dask, it's very easy to expose them to exactly the fraction of the API that is necessary for the task at hand.
Heterogeneous clusters: Both making them easier to launch and having a simple way of associating built-in methods of Dask arrays with resources would be great.
I've added a quick draft here: https://github.com/dask/dask-blog/pull/6
Help filling things in there would be welcome.
Hi All,
I'd like a group of us to write a blog post about using Dask on supercomputers, including why we like it today, and highlighting improvements that could be made in the near future to improve usability. My goal for this post is to show it around to various HPC groups, and to show it to my employer to motivate work in this area. I think that now is a good time for this community to have some impact by sharing its recent experience.
cc'ing some notable users today @guillaumeeb @jhamman @kmpaul @lesteve @dharhas @josephhardinee @jakirkham
To start the conversation, if we were to structure the post as five reasons we use Dask on HPC and five things that could be better, what would those five things be? I think it'd be good to get a five-item list from a few people cc'ed above, then maybe we talk about those lists and I (or anyone else if interested) compose an initial draft that we can then all iterate on?