Error processing large datasets (long simulations), which should generally be manageable

The sub-column generator becomes "stuck" and continuously produces the following error:

    Traceback (most recent call last):
      File "/home/meteo/ixs34/.conda/envs/emc2/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
        self.run()
      File "/home/meteo/ixs34/.conda/envs/emc2/lib/python3.7/multiprocessing/process.py", line 99, in run
        self._target(*self._args, **self._kwargs)
      File "/home/meteo/ixs34/.conda/envs/emc2/lib/python3.7/multiprocessing/pool.py", line 110, in worker
        task = get()
      File "/home/meteo/ixs34/.conda/envs/emc2/lib/python3.7/multiprocessing/queues.py", line 354, in get
        return _ForkingPickler.loads(res)
    _pickle.UnpicklingError: invalid load key, '\xff'.

In the case I tried to process with EMC2, the time dimension of the model output file had 17,500 samples (I requested 10 subcolumns). A quick calculation shows that the output should generally be manageable (~84 MB per double-precision field with subcolumn x level x time dimensions). I suspect that an internal option to run the parallel processing in chunks might do the trick.
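To illustrate the chunking idea, here is a minimal sketch of what I have in mind. It is not EMC2 code: `field_size_mb`, `time_chunks`, `simulate_time_step`, and `process_in_chunks` are hypothetical names made up for this example, and the value of 60 vertical levels is an assumption (it is not stated above; it simply reproduces the ~84 MB figure). I have not verified that chunking actually avoids the `UnpicklingError`.

```python
from multiprocessing import Pool


def field_size_mb(n_subcolumns, n_levels, n_times, itemsize=8):
    """Size in MB (10**6 bytes) of one dense subcolumn x level x time field."""
    return n_subcolumns * n_levels * n_times * itemsize / 1e6


def time_chunks(n_times, chunk_size):
    """Yield (start, stop) index pairs that together cover range(n_times)."""
    for start in range(0, n_times, chunk_size):
        yield start, min(start + chunk_size, n_times)


def simulate_time_step(t):
    """Hypothetical placeholder for the real per-time-step work."""
    return t * t


def process_in_chunks(n_times, worker, chunk_size, map_fn=map):
    """Apply `worker` to every time index, handing map_fn one chunk at a time.

    Only one chunk of tasks is submitted per call to map_fn, so the amount
    of data sent through the worker queues at once is bounded by chunk_size.
    """
    results = []
    for start, stop in time_chunks(n_times, chunk_size):
        results.extend(map_fn(worker, range(start, stop)))
    return results


if __name__ == "__main__":
    # 60 levels is an assumed value, not taken from the model output file.
    print(field_size_mb(10, 60, 17500))  # 84.0

    with Pool(processes=2) as pool:
        out = process_in_chunks(17500, simulate_time_step, 1000,
                                map_fn=pool.map)
    print(len(out))  # 17500
```

Passing `map_fn=pool.map` runs each chunk in parallel, while the default built-in `map` runs the same loop serially, which may help to check whether the failure is specific to the multiprocessing path.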

columncolab / EMC2

Error processing large datasets (long simulations), which should generally be manageable #32