Open bnlawrence opened 6 months ago
Hi Bryan,
A bit of digging suggests that this is a bug (https://github.com/pydata/xarray/issues/1464 has the details). However, the writing is locked anyway (a netCDF4-python restriction), so there shouldn't be any benefit in this case from running on 12 workers.
If you remove the `dask.config.set(...)` line, I suspect that it will work.
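For context, a minimal sketch of what such a line does (the scheduler name and worker count here are assumptions, not the original script):

```python
import dask
import dask.array as da

# Assumed shape of the suspect line: it pins dask to a fixed pool of
# workers for every compute. Removing it restores dask's default
# scheduler choice.
dask.config.set(scheduler="threads", num_workers=12)  # <- the line to remove

# Any subsequent compute now runs under that configuration.
x = da.ones((6, 6), chunks=(3, 3))
total = float(x.sum().compute())
print(total)
```

Since the netCDF write is serialised anyway, pinning 12 workers buys nothing here and can trigger the bug above.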
I shall make the fix, though, so that your original code doesn't fail.
Looking into how xarray deals with this (which I haven't wholly understood yet), it's probably not the 5 minute fix I dreamt of, but I'll keep at it ...
(Sorry, I was hoping that I would get benefit from the workers on the read, since the pp bit is slow)
OK - we can read PP/FF files in parallel, so if you did `(ff[0] + 2).array` the reads would be parallelised over Dask chunks. Writing, however, is limited to one Dask chunk at a time, and a Dask chunk equates to one 2-d UM field, so there is no benefit from parallelism in the writing case :(
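The read side can be illustrated with plain dask (an analogue only, not cf-python itself; the array and chunking here are made up):

```python
import dask.array as da

# A chunked lazy array stands in for a PP/FF file opened lazily:
# four 2x3 chunks, so the graph below has one add-task per chunk.
x = da.zeros((4, 6), chunks=(2, 3))

# Like (ff[0] + 2).array in cf-python, realising the result in memory
# evaluates one task per chunk, and those tasks can run in parallel.
result = (x + 2).compute()
print(result.shape)
```

Writing is the opposite story: the netCDF4-python lock means chunks are written one at a time, whatever the worker count.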
Attempt to use cf-python to read PP and write some netCDF. Code is:
Platform is jasmin sci6, data is N1280 pp output.
Error log here