Nice!
Just a few quick comments about testing job submission and scheduling:
```
time for i in {1..100}; do flux submit -N 16 -n 64 hostname; done

real	0m8.184s
```
Submitting jobs with `flux submit` back-to-back like this may not be fully exercising the system, because the time to submit a single job here is bounded by the time it takes to run `flux submit`, not necessarily the actual submission RPC. A better test of the impact of these changes would be a throughput test that measures how long it takes the scheduler to schedule all the submitted jobs. To do that, use the `--cc` option. For example, on a size=1 instance on quartz:
```
$ time for i in {1..100}; do flux submit -N 16 -n 64 hostname; done
real	0m34.698s

$ time flux submit --cc=1-100 -N16 -n 64 hostname
real	0m0.847s
```
You can see that submitting multiple jobs at once with `--cc` is a few orders of magnitude faster than running `flux submit` in a loop. This is not only because we don't have to pay the half-second penalty for starting Python for each job, but also because the submission RPCs are sent asynchronously.

If you add `--wait-event=alloc`, the command will wait until all jobs have been scheduled, so you can measure how long the scheduler takes to schedule all of them. You could also add `--setattr=exec.test.run_duration=10s` to bypass the execution system, so that the jobs stay in RUN state instead of getting an exception (since the fake resources don't really exist).
Using these together might give us a better idea of the full impact of these changes, i.e. I imagine things will look much better...
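Putting those options together, a combined throughput test would look roughly like the following. This is only a sketch assembled from the flags already mentioned above; the job shape and counts are illustrative:

```
$ time flux submit --cc=1-100 --wait-event=alloc \
      --setattr=exec.test.run_duration=10s -N16 -n 64 hostname
```

With `--wait-event=alloc`, the measured time covers submission plus scheduling of all 100 jobs, while the fake run duration keeps them in RUN state rather than failing on resources that don't really exist.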
> Just a few quick comments about testing job submission and scheduling:
@grondo I now realize you've explained this to me before, but I didn't think of it when I was running the initial test. Thanks for the additional details; I understand it better now!
I reran the tests above with the same setup using your suggestions, and the improvements from the change in this PR are uniform and better than I reported above. Note that I ran 1000 jobs with feasibility disabled because the tests ran so fast:
Without the change in this PR:

```
Feasibility enabled (100 jobs):
time flux submit --cc=1-100 -N16 -n 64 --requires="hosts:test[16001-16016]" hostname
real	0m33.823s

Feasibility disabled (1000 jobs):
time flux submit --cc=1-1000 -N16 -n 64 --requires="hosts:test[16001-16016]" hostname
real	0m1.382s

Feasibility enabled (100 jobs):
time flux submit --cc=1-100 -N16 -n 64 hostname
real	0m1.896s

Feasibility disabled (1000 jobs):
time flux submit --cc=1-1000 -N16 -n 64 hostname
real	0m0.395s
```
With the change in this PR:

```
Feasibility enabled (100 jobs):
time flux submit --cc=1-100 -N16 -n 64 --requires="hosts:test[16001-16016]" hostname
real	0m8.807s

Feasibility disabled (1000 jobs):
time flux submit --cc=1-1000 -N16 -n 64 --requires="hosts:test[16001-16016]" hostname
real	0m1.372s

Feasibility enabled (100 jobs):
time flux submit --cc=1-100 -N16 -n 64 hostname
real	0m1.873s

Feasibility disabled (1000 jobs):
time flux submit --cc=1-1000 -N16 -n 64 hostname
real	0m0.379s
```
That's a throughput improvement of almost 4x for 100 jobs with feasibility checking.
Thanks for the reviews! Setting MWP.
Merging #1162 (b3fbafe) into master (6e6576d) will increase coverage by 0.0%. The diff coverage is 100.0%.
Matching jobspecs with node constraints is currently implemented in such a way that the recursive `dom_*` call occurs before constraint checking. This PR moves the node constraint check into the prune function to prevent unnecessary recursive `dom_*` calls, speeding up feasibility checks.
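To illustrate the idea, here is a schematic sketch of the change. All names and types below are hypothetical stand-ins, not the actual fluxion traversal code:

```cpp
// Schematic sketch only -- hypothetical names, not the real flux-sched code.
#include <set>
#include <string>

struct vertex_t {
    std::string hostname;   // identity of a resource-graph node vertex
};

// Stand-in for a jobspec node constraint such as --requires="hosts:test[...]".
struct node_constraint_t {
    std::set<std::string> hosts;
    bool match (const vertex_t &v) const {
        return hosts.count (v.hostname) > 0;
    }
};

// prune() decides whether the traverser should skip a vertex entirely.
// Checking the node constraint here means an infeasible subtree is rejected
// up front, instead of after a recursive dom_* descent into it.
bool prune (const vertex_t &v, const node_constraint_t *constraint)
{
    // ... existing pruning-filter checks (aggregate counts, exclusivity) ...
    if (constraint && !constraint->match (v))
        return true;        // prune: no recursive dom_* call for this subtree
    return false;           // keep: recursion proceeds as before
}
```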
The following are comparative performance tests run on a laptop. The following test configuration is used to test without feasibility checking:

and the following are the contents of config.toml:
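For reference only, here is a hypothetical example of what such a configuration could look like (this is an assumption, not the configuration actually used for these tests): in flux-core, the job-ingest validator's plugin list controls whether the scheduler feasibility check runs at submission time, so a "feasibility disabled" setup might simply omit the feasibility plugin:

```toml
# Hypothetical illustration -- not the configuration used for these tests.
# Leaving "feasibility" out of the validator plugin list skips the scheduler
# feasibility check at job-ingest time.
[ingest.validator]
plugins = [ "jobspec" ]

# With feasibility checking enabled, the list would also include it:
# plugins = [ "jobspec", "feasibility" ]
```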
Test results without the change in this PR:
Feasibility enabled:
Feasibility disabled:
Feasibility enabled:
Feasibility disabled:
Test results WITH the change in this PR:
Feasibility enabled:
Feasibility disabled:
Feasibility enabled:
Feasibility disabled: