IO500 / io500

IO500 Storage Benchmark source code
MIT License

Phase concurrent #56

Open JulianKunkel opened 2 years ago

JulianKunkel commented 2 years ago

For the sake of discussion, I added a phase where multiple benchmarks are executed concurrently. It requires at least 5 procs and executes:

- benchmark 0 - 20% of procs - parallel write (ior easy)
- benchmark 1 - 40% of procs - parallel rnd1MB read
- benchmark 2 - 40% of procs - md-workbench for concurrent usage

It does provoke errors in the md-workbench cleanup phase and extra output; these are not score-relevant, though. The score is computed as the geo-mean of the individual benchmark scores, weighted by the proc counts involved in each benchmark. It is only included in the extended mode run.

JulianKunkel commented 2 years ago

I believe a subset of the patch is useful, as it prepares the parallel execution, and anyone can use it to create a meaningful parallel benchmark from IO500. Here, by the way, is an example output with the score:

```
[concurrent]
exe-easy-write = ./ior --dataPacketType=timestamp -C -Q 1 -g -G 1836270349 -k -e -o ./datafiles/ior-easy/ior_file_easy -t 2m -b 9920000m -F -w -D 1 -a POSIX -O saveRankPerformanceDetailsCSV=./results/concurrent-ior-easy-write.csv
exe-rnd1MB-read = ./ior --dataPacketType=timestamp -Q=1 -g -G=-1368305808 -z --random-offset-seed=11 -e -o=./datafiles/ior-rnd1MB/file -O stoneWallingStatusFile=./results/ior-rnd1MB.stonewall -k -t=1048576 -b=1073741824 -s=10000000 -r -R -a POSIX -O saveRankPerformanceDetailsCSV=./results/concurrent-ior-rnd1MB-read.csv
exe-md-workbench = ./md-workbench --dataPacketType=timestamp --process-reports -a POSIX -o=./datafiles/mdworkbench -t=0.000000 -O=1 --run-info-file=./results/mdworkbench.status -D=10 -G=413508310 -P=5027 -I=5027 -R=1 -X -w=1 -o=./datafiles/mdworkbench --run-info-file=./results/mdworkbench.status -2
score-ior-easy-write = 0.688013
score-ior-rnd1MB-read = 7.019916
score-ior-md-workbench = 132.352509
score = 7.996812
```

Note that it was run with 5 procs (so 1, 2, and 2 procs per benchmark); hence, the overall score is calculated with weighting as follows: `((0.688013 * 1 * 7.019916 * 2 * 132.352509 * 2) / 5)^0.33333`
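The weighting can be reproduced with a short sketch (a hypothetical helper for illustration, not code from the patch): multiply each benchmark score by its proc count, take the product over all three benchmarks, divide by the total proc count, and take the cube root.

```python
def concurrent_score(scores, procs):
    """Combine the per-benchmark scores into one phase score:
    the product of (score * proc count) over all benchmarks,
    divided by the total proc count, taken to the power 1/3."""
    total = sum(procs)
    product = 1.0
    for score, nprocs in zip(scores, procs):
        product *= score * nprocs
    return (product / total) ** (1.0 / 3.0)

# The 5-proc example above: 1, 2, and 2 procs per benchmark.
print(concurrent_score([0.688013, 7.019916, 132.352509], [1, 2, 2]))
```

Plugging in the example values reproduces the reported score of roughly 7.9968.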

Increasing the proc count will first add procs to the last benchmark, then the second, then the first. With 8 procs, the split should be 1, 3, and 4 procs for the individual benchmarks.
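That assignment rule can be sketched as follows (a hypothetical helper, assuming the 20/40/40 split described above; the real rank assignment lives in the patch): round each share down and hand leftover procs to the last benchmark first.

```python
def split_procs(total_procs):
    """Split procs 20% / 40% / 40% across the three benchmarks.
    Leftover procs from rounding down go to the last benchmark
    first, then the second, then the first."""
    assert total_procs >= 5, "the phase requires at least 5 procs"
    counts = [int(total_procs * ratio) for ratio in (0.2, 0.4, 0.4)]
    benchmark = 2  # hand leftovers to the last benchmark first
    while sum(counts) < total_procs:
        counts[benchmark] += 1
        benchmark -= 1
    return counts

print(split_procs(5))  # -> [1, 2, 2]
print(split_procs(8))  # -> [1, 3, 4]
```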

adilger commented 2 years ago

Isn't md-workbench itself already a concurrent IO workload? I think with the built-in workloads of IOR and mdtest plus a hard stonewall timer it would be possible to generate arbitrary small/large/random read/write + create/stat/find/unlink workloads as needed.

It definitely has some interesting potential, both as a stress test and as a way of measuring the overall capabilities of the storage system for more "real world" production workloads where there are dozens of jobs doing uncoordinated IO.

"It requires at least 5 procs and executes: 20% procs - parallel write, 40% procs - parallel rnd1MB read, 40% procs - md-workbench"

This ratio should definitely be configurable, at least during testing, even if we eventually require a specific ratio for submission. There would likely also need to be some coordination between the workloads (e.g., it isn't possible to read from files that haven't been written yet). So there may need to be an unmeasured "warmup" time; or possibly this time is still included, but the read workload cannot start until some fraction of the runtime has elapsed (e.g., 25%) to allow the files to be created and written.

JulianKunkel commented 2 years ago

The purpose is to simulate a "used system" where some nodes run a parallel write, others a random read and users work interactively.

Note that no coordination between the workloads is necessary in this phase. It simply runs three benchmarks at the same time, and all of them use the artifacts created before. Indeed, using md-workbench removes the need to run even more benchmarks such as mdtest create, delete, etc. Only one metadata benchmark needs to be run now, and it synchronizes itself.

We will evaluate the influence of this on an isolated system. A configurable ratio for the benchmarks is a good extension, but it makes the rank assignment tricky. Maybe one could set how many out of every 10 procs run a certain benchmark.
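A minimal sketch of that "out of 10" idea (hypothetical, purely to illustrate the suggestion): given a per-10 pattern such as 2/4/4, each rank picks its benchmark from its position within a group of 10 ranks.

```python
def benchmark_for_rank(rank, per_ten=(2, 4, 4)):
    """Map an MPI rank to a benchmark index (0, 1, or 2), given how
    many procs out of every 10 should run each benchmark."""
    assert sum(per_ten) == 10, "pattern must cover exactly 10 procs"
    position = rank % 10
    for benchmark, count in enumerate(per_ten):
        if position < count:
            return benchmark
        position -= count

# With the default 2/4/4 pattern, ranks 0-1 write, 2-5 read,
# and 6-9 run md-workbench; the pattern repeats every 10 ranks.
print([benchmark_for_rank(r) for r in range(10)])
```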

Missing features: