hpcflow / hpcflow-new


Excessive metadata array reads in `Workflow.write_commands` #667

Open aplowman opened 5 months ago

aplowman commented 5 months ago

When checking if we need to add a loop termination command to the commands of an action, we call Workflow.get_iteration_final_run_IDs, which in turn calls Workflow.get_loop_map. This then calls Workflow.get_EARs_from_IDs (which reads the runs metadata array) on all run IDs from that submission, which could be many thousands of runs for large workflows.
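Roughly, the call chain fans out like this (a schematic sketch only: the method names come from the description above, but every body here is a placeholder to show the fan-out, not hpcflow's real logic):

```python
# Schematic only: method names are from the issue; bodies are placeholders.

class WorkflowSketch:
    def write_commands(self, sub_idx: int) -> None:
        # Only needs the final run ID of each loop iteration, but...
        self.get_iteration_final_run_IDs(sub_idx)

    def get_iteration_final_run_IDs(self, sub_idx: int) -> list[int]:
        return [ids[-1] for ids in self.get_loop_map(sub_idx).values()]

    def get_loop_map(self, sub_idx: int) -> dict[str, list[int]]:
        all_IDs = list(range(10_000))  # every run ID in the submission
        # ...this triggers a metadata-array read for every one of them:
        self.get_EARs_from_IDs(all_IDs)
        return {"loop_0": all_IDs}

    def get_EARs_from_IDs(self, run_IDs: list[int]) -> list[dict]:
        # In hpcflow this reads the (one-chunk-per-run) Zarr runs array.
        return [{"id": i} for i in run_IDs]
```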

In principle, this shouldn't be a problem because Zarr supports multiprocess reading. In practice, something seems to go wrong under high-concurrency scenarios (e.g. using a large job array when the cluster has very good availability): we get sporadic RuntimeErrors from numcodecs during chunk decompression of this metadata array. These errors are guarded against using the reretry package, but the retries introduce a potentially lengthy delay to execution for tasks that should otherwise be quick, especially in large workflows.
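For context, the guard in question looks roughly like this (a minimal sketch using the reretry decorator; the function name, array path, and retry parameters are illustrative assumptions, not hpcflow's actual code):

```python
import zarr
from reretry import retry

# Minimal sketch, not hpcflow's actual code: retry a chunk read that may fail
# with a numcodecs RuntimeError under heavy concurrent access.
@retry(exceptions=RuntimeError, tries=5, delay=1, backoff=2)
def read_run_metadata(store_path: str, run_ID: int):
    runs = zarr.open(store_path, mode="r")  # e.g. the runs metadata array
    return runs[run_ID]  # chunk decompression happens here and can raise
```

With exponential backoff like this, even a few failed reads can add a noticeable wait per task, which is where the delay for otherwise-quick tasks comes from.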

Additionally, reading the whole array is slow on Lustre file systems in general, because the array must be chunked with one chunk (i.e. one file) per run to allow multi-process writing during execution. So we ideally want to avoid reading most or all of the array anyway.
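To illustrate the chunking constraint (a minimal sketch; the path, shape, and dtype are assumptions, not hpcflow's actual schema):

```python
import zarr

# Minimal sketch: one chunk per run means each run lives in its own file, so
# concurrent processes can write distinct runs without contending for a chunk.
n_runs = 10_000
runs = zarr.open(
    "workflow/metadata/runs",  # illustrative path
    mode="a",
    shape=(n_runs,),
    chunks=(1,),               # one chunk (one file on disk) per run
    dtype="i8",
)

one_run = runs[42]   # fast: touches a single chunk/file
all_runs = runs[:]   # slow on Lustre: opens and decompresses n_runs files
```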

Two steps to solve:

aplowman commented 5 months ago

First step fixed in https://github.com/hpcflow/hpcflow-new/pull/668.