Open malininae opened 7 months ago
It looks like that recipe is using a preprocessor function that is not yet lazy (see https://github.com/ESMValGroup/ESMValCore/issues/674 and https://github.com/SciTools/iris/pull/5795), therefore your best bet is the default threaded scheduler (this can be done by removing the file ~/.esmvaltool/dask.yml). If you run out of memory, try reducing the number of workers (num_workers) in your Dask configuration. This can e.g. be done by creating a file ~/.config/dask/dask.yml with the content num_workers: 4. See the Dask docs for more information.
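As a minimal sketch of the suggestion above, the snippet below writes a Dask configuration file containing only the num_workers setting. The helper name write_dask_config is hypothetical and not part of ESMValCore or Dask; the directory is passed in explicitly so that nothing is written to your home directory unless you ask for it.

```python
from pathlib import Path


def write_dask_config(config_dir: Path, num_workers: int = 4) -> Path:
    """Write a minimal dask.yml that caps the number of workers.

    Hypothetical helper for illustration only; it simply creates the
    one-line YAML file described in the comment above.
    """
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file = config_dir / "dask.yml"
    config_file.write_text(f"num_workers: {num_workers}\n", encoding="utf-8")
    return config_file


# Example call (assumes you want the user-level Dask config location):
# write_dask_config(Path.home() / ".config" / "dask", num_workers=4)
```

Dask reads YAML files from ~/.config/dask when it is imported, so the setting should apply to later runs that use the default threaded scheduler.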
Improving how users can configure Dask is indeed something we would like to do, see https://github.com/ESMValGroup/ESMValCore/issues/2040 for previous discussion on the topic. However, the plan is to tackle this as part of a larger overhaul of the configuration, as users often find the current way of configuring things, where settings are spread out over multiple files, confusing. See https://github.com/ESMValGroup/ESMValCore/issues/2371 for the plan.
I'd like to encourage the option of having a configurable dask configuration.
Ideally it would be easy to select a specific dask configuration (or no configuration) for individual recipes, just as you can choose a specific config_user file.
At the moment, if one of my recipes does not play nice with dask, it's too much effort to interrupt my workflow to move/rename the ~/.esmvalcore/dask.yml file. For example, I'd have to run all recipes in groups according to their required dask configuration. It's faster to run everything in parallel without dask! This is a shame given the effort to implement dask within ESMValTool.
While I appreciate that there is an effort to make this happen (as part of https://github.com/ESMValGroup/ESMValCore/issues/2371, as I understand it), I think there could be an immediate benefit to resolving this issue.
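Until a per-recipe option exists, the move/rename workaround described above could be scripted roughly as follows. This is only a sketch: dask_config_disabled is a hypothetical helper, not an ESMValTool feature, and the path to dask.yml is whatever your installation actually uses.

```python
import contextlib
from pathlib import Path


@contextlib.contextmanager
def dask_config_disabled(config_file: Path):
    """Temporarily rename config_file so it is not picked up, then restore it.

    Hypothetical helper for illustration; does nothing if the file is absent.
    """
    backup = config_file.with_name(config_file.name + ".disabled")
    moved = config_file.exists()
    if moved:
        config_file.rename(backup)
    try:
        yield
    finally:
        # Restore the original file even if the wrapped run fails.
        if moved:
            backup.rename(config_file)


# Example use (recipe name is a placeholder):
# import subprocess
# with dask_config_disabled(Path("/path/to/dask.yml")):
#     subprocess.run(["esmvaltool", "run", "recipe_without_dask.yml"], check=True)
```

Note that this does not help when recipes run in parallel, since every run started while the file is renamed would also see no Dask configuration. That limitation is exactly why a configurable path per recipe would be preferable.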
Hi @k-a-webb, I fully agree with you, making dask configurable per recipe would be a very convenient feature!
There is currently a pull request in review that needs to be merged before this issue can be tackled (https://github.com/ESMValGroup/ESMValCore/pull/2448). Would you be willing to have a look at it from a user's perspective? We already had a round of technical reviews, so we are mainly interested in the user-friendliness of the proposed approach.
That would be very helpful and speed up the process. Thank you! 👍
Hi, apologies for the delayed response. I spent a day muddling through trying to install and test the new configuration, but ran into a few hiccups. I then went on vacation, and have not had time to make progress upon my return. I hope that my inability to do a dedicated test of the pull request does not cause a delay in its progress. If I find the time to document my issues, or make further progress on the test, I will be in touch.
Thanks @k-a-webb for looking into this and also sorry for the late answer (I just returned from vacation). Is there anything we can help with regarding installation? To test this, you would need an installation from source of ESMValTool and ESMValCore.
Our plan was to merge this by the end of September, but if you still want to test this and need more time, that's also no problem at all. Thanks!!
@Karen-A-Garcia and I are operationalizing ESMValTool and running a few recipes sequentially. Although our dask expertise is quite limited, we've got a problem: this recipe doesn't seem to work with any dask setup we've tried, while the other recipes do work. We'll try to figure out what the deal is, but I can see the value of adding the optional configurable dask.yml path to the config_user.yml, sort of like config_developer.yml is set up. Any objections? If someone could volunteer themselves, great, if not I can do it in a dream land called 'after April 15th'.