coecms / xmhw

Xarray version of Marine Heatwaves code by Eric Olivier
https://xmhw.readthedocs.io/en/latest/
Apache License 2.0

Dask compute number of workers #34

Closed sryan288 closed 2 years ago

sryan288 commented 3 years ago

Hi Paola,

I came across an issue where dask tries to use multiple threads but the system doesn't allow it (likely because of user restrictions on the server). It might be helpful to add an option to specify the number of workers, which is then passed to the dask.compute() call. At least that solved the issue in my case.
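For reference, a minimal sketch of the workaround described above, assuming the default local threaded scheduler. The `square` function and the task list are hypothetical placeholders and not part of xmhw; `num_workers` is the keyword the local dask schedulers accept to cap the pool size.

```python
import dask
from dask import delayed


@delayed
def square(x):
    # Hypothetical stand-in for the real per-chunk work
    return x * x


tasks = [square(i) for i in range(4)]

# Cap the local thread pool at one worker for this call only
results = dask.compute(*tasks, scheduler="threads", num_workers=1)
print(results)
```

The keyword only affects this one `compute()` call, so every call site inside a library would need to forward it.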

Cheers, Svenja

paolap commented 3 years ago

Hi Svenja,

You should be able to do that outside of the actual code, when you configure a scheduler for dask.

This page covers the distributed scheduler: https://docs.dask.org/en/latest/setup/single-distributed.html I don't know if it can be done with the default scheduler; the distributed scheduler seems to be the preferred one. I'm not so keen to add dask configuration to the functions themselves. There isn't an easy way to do so, as it depends on how you run dask itself. But I will try to clarify a bit more how to use or exclude dask in the demo notebook; I just never got time to get back to it. And while we're on the topic, one of my colleagues prepared a training on running jobs in parallel, in case it is of interest to any of you: https://coecms-training.github.io/parallel.
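As a rough illustration of configuring dask outside the functions, the sketch below uses `dask.config.set` with the local (non-distributed) scheduler, which does appear to honour a `num_workers` setting. `run_analysis` is a hypothetical stand-in for a library function that calls `dask.compute()` internally; it is not an xmhw function.

```python
import dask
from dask import delayed


def run_analysis(values):
    """Hypothetical library function that calls dask.compute() internally."""
    tasks = [delayed(abs)(v) for v in values]
    return dask.compute(*tasks)


# Limit the local threaded scheduler without touching run_analysis itself
with dask.config.set(scheduler="threads", num_workers=1):
    limited = run_analysis([-1, 2, -3])

# Or switch dask's parallelism off entirely with the synchronous scheduler
with dask.config.set(scheduler="synchronous"):
    serial = run_analysis([-1, 2, -3])

print(limited, serial)
```

The settings apply to every `compute()` call made inside the `with` block, so the library code stays unchanged. With the distributed scheduler, the equivalent limits would instead be set when the cluster or client is created.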

sryan288 commented 3 years ago

Oh yes, of course it makes much more sense to do that outside!

And thank you for the training link, that looks super helpful ;-)