I sometimes have multiple processes, or multiple machines, accessing the same storage cluster that hosts the `ir_datasets` cache directory. If several of them decide to download the same dataset at the same time, they all start writing to the same file and eventually crash.
It would be nice to have a locking mechanism that lets only one process write to a given file and makes the other processes wait until it has finished.
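
To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is not how `ir_datasets` works today. The names `exclusive_lock` and `fetch_once`, the `.lock` and `.tmp` suffixes, and the `download` callback are all made up for illustration. It assumes the shared filesystem honours exclusive file creation (`O_CREAT | O_EXCL`), which may not hold on every network filesystem.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def exclusive_lock(target_path, poll_seconds=0.5, timeout=None):
    """Hold a hypothetical `<target_path>.lock` file while the body runs."""
    lock_path = f"{target_path}.lock"
    start = time.monotonic()
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can succeed in creating the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if timeout is not None and time.monotonic() - start > timeout:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_seconds)  # someone else holds the lock: wait
    try:
        os.close(fd)
        yield
    finally:
        os.remove(lock_path)  # release the lock for the waiting processes


def fetch_once(target_path, download):
    """Call `download(tmp_path)` only if `target_path` is still missing.

    Returns True if this process did the download, False otherwise.
    """
    if os.path.exists(target_path):
        return False
    with exclusive_lock(target_path):
        # Re-check: another process may have finished while we were waiting.
        if os.path.exists(target_path):
            return False
        tmp_path = f"{target_path}.tmp"
        download(tmp_path)
        # Rename into place so readers never see a half-written file.
        os.replace(tmp_path, target_path)
        return True
```

One known weakness of this sketch: if the process holding the lock is killed, the `.lock` file stays behind and the other processes wait forever unless a `timeout` is set. A real implementation would need some way to detect and clean up stale locks.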