carpentries-lab / python-aos-lesson

Python for Atmosphere and Ocean Scientists
https://carpentries-lab.github.io/python-aos-lesson/

Add content on Dask task graph and debugging #41

Open DamienIrving opened 3 years ago

DamienIrving commented 3 years ago

At my 2021 Dask Summit presentation on teaching Dask to atmosphere and ocean scientists, it was suggested that the lesson could cover the Dask task graph, along with debugging and best practices for finding pain points.

It was suggested that this PyData talk might be useful: https://www.youtube.com/watch?v=JoK8V2eWFPE
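As a starting point for the task graph content, here is a minimal sketch of how a learner could inspect a graph without any plotting dependencies. The array sizes and variable names are made up for illustration and are not taken from the lesson.

```python
import dask.array as da

# A small chunked array: 4 x 4 values split into four 2 x 2 chunks.
x = da.ones((4, 4), chunks=(2, 2))

# Operations on Dask arrays are lazy: this line only builds a task graph.
total = (x + 1).sum()

# The low-level graph is a mapping from task keys to tasks.
graph = dict(total.__dask_graph__())
n_tasks = len(graph)
print(f"{n_tasks} tasks in the graph")

# Drawing the graph needs the optional graphviz dependency, so it is
# left commented out here:
# total.visualize(filename="graph.svg")

# Nothing is computed until compute() is called.
result = total.compute()
print(result)
```

Counting tasks like this is a cheap way to spot a graph that has grown too large before trying to render it.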

DamienIrving commented 2 years ago

On the debugging side of things, it would be worth adding the progress bar to the lesson:

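# Show a progress bar for every subsequent compute() call.
# This works with the local (single-machine) schedulers only.
import dask.diagnostics
dask.diagnostics.ProgressBar().register()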

To do this, we'd need to explain the difference between the local (single-machine) scheduler, which is the default, and the distributed scheduler, because the profiling tools differ for each. I think this distinction is well worth explaining.
https://docs.dask.org/en/stable/diagnostics-local.html
https://docs.dask.org/en/stable/scheduling.html
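A rough sketch of how the two cases might be contrasted in the lesson. The array is an arbitrary example, and `run_distributed` is a hypothetical helper name. The distributed part assumes the optional `distributed` package is installed, so it is wrapped in a function rather than run at import time.

```python
import dask.array as da
from dask.diagnostics import ProgressBar

# An arbitrary example computation.
x = da.random.random((2000, 2000), chunks=(500, 500))
mean = x.mean()

# Local (single-machine) scheduler: dask.diagnostics tools apply.
# Used as a context manager, the progress bar covers only this block.
with ProgressBar():
    local_result = mean.compute(scheduler="threads")
print(local_result)


def run_distributed(collection):
    """Compute on an in-process distributed cluster (hypothetical helper).

    With the distributed scheduler, diagnostics come from the dashboard
    rather than from dask.diagnostics.
    """
    from dask.distributed import Client  # needs the 'distributed' package

    with Client(processes=False) as client:
        print("Dashboard:", client.dashboard_link)
        # While the client is active it is the default scheduler.
        return collection.compute()
```

Calling `run_distributed(mean)` should give a value close to `local_result`, with the dashboard link printed while the cluster is alive.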

This script also shows how to use the resource profiler: https://github.com/climate-resilient-enterprise/workflows/blob/master/cmdline_programs/return_period.py
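For reference, a minimal sketch of the resource profiler on the local scheduler. This is not taken from the linked script. The array and the `profile_mean` name are illustrative, and `ResourceProfiler` assumes the optional `psutil` package is available.

```python
import dask.array as da
from dask.diagnostics import Profiler, ResourceProfiler


def profile_mean():
    """Profile a small example computation (hypothetical helper)."""
    x = da.random.random((2000, 2000), chunks=(500, 500))
    # Profiler records per-task timings; ResourceProfiler samples
    # memory and CPU use every dt seconds in a background process.
    with Profiler() as prof, ResourceProfiler(dt=0.25) as rprof:
        result = x.mean().compute()
    return result, prof.results, rprof.results


if __name__ == "__main__":
    result, task_records, resource_records = profile_mean()
    print(len(task_records), "tasks were run")
    # A very fast computation may finish before any sample is taken,
    # so this loop can legitimately print nothing.
    for sample in resource_records[:5]:
        print(sample.time, sample.mem, sample.cpu)
```

If `bokeh` is installed, `dask.diagnostics.visualize([prof, rprof])` can plot the profilers together, which is probably the more useful view for learners.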