I opened this issue to track the problem that the IOR hard phase takes many hours when running on spinning disks.
On Lustre systems backed by spinning disks, such as those at DKRZ and Archer 2, IOR hard takes unbearably long.
Consider a test run with the following setting:
[debug]
stonewall-time = 1
On 10 nodes with 8 procs each, for example, ior-hard-write.txt then reports:
stonewalling pairs accessed min: 21 max: 3195 -- min data: 0.0 GiB mean data: 0.0 GiB time: 1.2s
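The overall runtime here was 16.5s, even though the stonewall was reached after 1.2s, presumably because the slower processes still have to catch up to the maximum of 3195 pairs.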
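To put a number on the imbalance, a minimal Python sketch could parse the stonewalling summary line and compute the ratio between the fastest and the slowest process. The function name `stonewall_imbalance` and the regular expression are illustrative assumptions written against the single line quoted above; they are not part of IOR or IO500, and other IOR versions may format the line differently.

```python
import re

# Pattern for the stonewalling summary line as quoted in this issue.
# Assumption: the line format is exactly the one shown above.
STONEWALL_RE = re.compile(
    r"stonewalling pairs accessed min: (\d+) max: (\d+) .*? time: ([\d.]+)s"
)


def stonewall_imbalance(line):
    """Return (min_pairs, max_pairs, ratio, stonewall_seconds) for one summary line."""
    m = STONEWALL_RE.search(line)
    if m is None:
        raise ValueError("no stonewalling summary found")
    lo, hi, secs = int(m.group(1)), int(m.group(2)), float(m.group(3))
    # Guard against a process that accessed no pairs at all.
    ratio = hi / lo if lo else float("inf")
    return lo, hi, ratio, secs


line = (
    "stonewalling pairs accessed min: 21 max: 3195 -- min data: 0.0 GiB "
    "mean data: 0.0 GiB time: 1.2s"
)
lo, hi, ratio, secs = stonewall_imbalance(line)
print(f"min={lo} max={hi} imbalance={ratio:.1f}x stonewall={secs}s")
```

For the line above, the fastest process accessed roughly 152 times as many pairs as the slowest one within the 1.2s stonewall window.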
On 5-minute runs, the imbalance can stretch the runtime even further, so that the hard phase takes many hours (and is often killed by 8-hour job deadlines).
The only suitable workaround is to reduce the segment count to a bearable number, e.g.,
[ior-hard]
segmentCount = 10000