Open ernimd opened 2 weeks ago
This also appears with open_mfdataset, but it seems to happen only with a threading scheduler.
In any case, the reason is that netcdf4 has not been thread-safe since netcdf4=1.6.1 (I think). This means the problem has unfortunately been around for a while and is known (see e.g. #7079), but nobody has had the time / skills / persistence to figure out what exactly causes it; race conditions are just that tricky to debug.
What happened?
I have a use case of computing a function in parallel with dask, and inside the function I am opening/reading (no writing) the same .nc dataset in all those parallel calls to do computation based on it. I am getting a bunch of different errors (see the log), or sometimes none, while running it. Most likely it's not designed to be used this way, or it's an actual bug in either the IO implementation or dask itself. Because the failure is non-deterministic, you should run the MVE in a loop until it fails:
while python crash.py; do :; done
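For anyone without a POSIX shell, the same retry loop can be sketched in Python. `run_until_failure` is a hypothetical helper, and the `max_runs` cap is an assumption added here so the loop terminates; the shell version above runs indefinitely until the script fails.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=1000):
    """Run cmd repeatedly; return the 1-based attempt that failed, or None."""
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        # A non-zero return code covers Python exceptions as well as
        # crashes from signals (reported as negative codes on POSIX).
        if result.returncode != 0:
            return attempt
    return None


# Usage, assuming the reproducer is saved as crash.py:
#   failed_on = run_until_failure([sys.executable, "crash.py"])
#   print("first failing run:", failed_on)
```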
What did you expect to happen?
I would expect it to fail gracefully, maybe warning about dangerous multi-thread reads.
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
No response
Environment