dask / dask-blog

Dask development blog
https://blog.dask.org/

Blogpost idea: choosing good chunk sizes in Dask (turn this tweetorial into a blogpost) #116

Closed GenevieveBuckley closed 2 years ago

GenevieveBuckley commented 2 years ago

Ian mentioned this tweet to me today. I originally wrote it because I'd just given a tutorial, and lots of people were confused about how to choose good chunk sizes in Dask. Apparently it was helpful to a lot of people (the Twitter analytics support this: the numbers are much higher than typical).

For better discoverability and permanence, it might be good to have these tips in blogpost format (Twitter is a bit ephemeral, and searches don't often unearth content there).

jacobtomlinson commented 2 years ago

This would make a great post. I would refer folks to it often.

GenevieveBuckley commented 2 years ago

...you know what would ALSO be a good blogpost? How to choose good cluster settings. E.g., how your SLURM/PBS/whatever batch submission settings relate to the settings you need to put in your dask-jobqueue cluster object.

To be honest I'm still a bit confused by this, and it is something other people ask me too.

If either @jacobtomlinson or @ian-r-rose would like to help make this, that would be very useful to refer people to (hint, hint) :smile:

guillaumeeb commented 2 years ago

Hi all, I saw this issue, and I agree that both ideas would make great articles. Those are questions we see a lot as HPC admins/experts.

I can try to help with the second one, on batch submission settings! Everyone is confused about it.

GenevieveBuckley commented 2 years ago

That would be very appreciated, @guillaumeeb. I'll make a separate issue for that idea, so we can discuss content, etc.