Clarify concurrency settings

ewu63 commented 4 years ago

I'm still a bit confused about exactly how the -n <num> command line option interacts with the N_PROCS variable defined in individual tests. Can you confirm that the following scenarios are correct?

If I specify -n 1 but there is a test that requires 2 procs, then it will still run and use 2 procs if they are available, or MPI will virtualize 2 procs if there is only one.
If I specify -n 2 when there are two tests, one with 2 procs and one with 3, then they will run one after the other. In this case, there is no difference between -n 2 and -n 1 since they will both run serially.

I also have some questions which may lead to suggestions for enhancement:

Is it possible to have N_PROCS set on an individual test basis? It's a little limiting to have to specify the same number of procs for all tests under a given class.
By default, testflo will run with -n <num> where <num> is set to the number of threads rather than procs. It would be nice to have a way to set it to procs, since we typically do not oversubscribe in the sense of running one process per thread. That way we will not have to hard-code -n <num>, so that testflo can utilize all the resources on different machines.

EDIT: I did some testing and it seems that -n <num> is the total number of processes that are spawned. So if I do -n 1 with N_PROCS=2, even if I have 2 physical cores available, only one process will be spawned and MPI will virtualize 2 procs on one. Is that the case?

naylor-b commented 4 years ago

The -n option just specifies the number of tests that will be run concurrently. It has no impact on the number of MPI processes that are allocated for an MPI test. That allocation is determined by the value of the N_PROCS class attribute of your TestCase class. Currently there is no way to specify N_PROCS for a specific test. By default, testflo sets the value of -n to multiprocessing.cpu_count(), which returns the number of virtual cores.
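
For illustration, a minimal sketch of a test class that requests two MPI processes through the N_PROCS class attribute described above. The class and method names are hypothetical, and the test bodies are placeholders; the point is only that N_PROCS is set once per class and so applies to every test method in it.

```python
import unittest


class ParallelSolveTest(unittest.TestCase):
    # Class-level attribute: every test method in this class
    # requests 2 MPI procs. It cannot be varied per method.
    N_PROCS = 2

    def test_first(self):
        # Placeholder body; a real test would exercise MPI code here.
        self.assertEqual(1 + 1, 2)

    def test_second(self):
        # Also runs with N_PROCS = 2, since the attribute is shared.
        self.assertTrue(True)
```

To give different tests different proc counts under this scheme, they would have to live in separate TestCase classes, each with its own N_PROCS.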

ewu63 commented 4 years ago

So to clarify, assume I have 4 threads available, and three tests each with N_PROCS=2:

with -n 1, the tests will run serially even though I have enough threads for two tests to run at the same time. In this case, I will be using just half of my procs right?
with -n 2, the first two tests will now run concurrently and use all the available procs
without the -n option, it defaults to -n 4. Now will MPI spawn additional threads to run all three tests concurrently, or will it run the first two, then the third one after one test finishes?

As for physical vs. logical cores, would you be willing to either change the default behaviour to set -n to the number of physical cores, or provide a flag to do so? A quick search shows that psutil.cpu_count(logical=False) will return the physical core count, but this would add on an additional dependency for testflo.
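
A rough sketch of what such a default could look like, assuming psutil is treated as an optional dependency. The function name and the prefer_physical parameter are hypothetical and are not part of testflo; only multiprocessing.cpu_count() and psutil.cpu_count(logical=False) are real calls.

```python
import multiprocessing


def default_num_test_procs(prefer_physical=False):
    """Pick a default value for -n (hypothetical helper, not testflo code)."""
    # Logical (virtual) core count, which is what testflo uses today.
    logical = multiprocessing.cpu_count()
    if not prefer_physical:
        return logical
    try:
        import psutil  # optional; fall back if it is not installed
    except ImportError:
        return logical
    # psutil may return None when the physical count cannot be determined.
    physical = psutil.cpu_count(logical=False)
    return physical or logical
```

With this shape, machines without psutil would keep the current behaviour, so the extra dependency would stay optional.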

naylor-b commented 4 years ago

If you use -n 1, it would run the 3 tests one at a time, and each test would use 2 procs. With -n 2, it would run 2 of the tests concurrently and the third would run as soon as one of the first two finished. All tests would again use 2 procs. In the default case, with -n 4, all 3 tests would run concurrently and each would again use 2 procs.

Each MPI test is run as a subprocess of the 'test runner' process (there are n test runners determined by the value of -n), and that MPI subprocess will allocate however many procs are specified by N_PROCS. If there were a straightforward way, without introducing external dependencies, to default to the number of physical procs I'd be fine with that, but I don't know of any.
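
The three scenarios can be checked with a small simulation. This is only a toy model of the behaviour described above, not testflo's scheduler: the simulate function is hypothetical, test durations are made up, and the test runner processes themselves are not counted.

```python
import heapq


def simulate(num_runners, tests):
    """Toy model: tests is a list of (duration, n_procs) pairs.

    Returns (peak concurrent tests, peak MPI procs in use).
    """
    running = []  # min-heap of (finish_time, n_procs)
    now = 0.0
    peak_tests = peak_procs = 0
    for duration, n_procs in tests:
        if len(running) == num_runners:
            # All runners are busy: wait for the earliest finish.
            now = running[0][0]
        # Drop every test that has finished by the current time.
        while running and running[0][0] <= now:
            heapq.heappop(running)
        heapq.heappush(running, (now + duration, n_procs))
        peak_tests = max(peak_tests, len(running))
        peak_procs = max(peak_procs, sum(p for _, p in running))
    return peak_tests, peak_procs


three_tests = [(1.0, 2)] * 3  # three tests, each with N_PROCS = 2
print(simulate(1, three_tests))  # (1, 2): one test at a time, 2 procs
print(simulate(2, three_tests))  # (2, 4): two at once, third waits
print(simulate(4, three_tests))  # (3, 6): all three at once, 6 procs
```

The last case shows why the default can oversubscribe a 4-thread machine: -n limits concurrent tests, not the total number of MPI procs.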

ewu63 commented 4 years ago

Thanks for the explanation, this makes a lot more sense now. I would just suggest that some of the documentation can be improved a bit. For example, testflo -h says

-n NUM_PROCS, --numprocs NUM_PROCS
   Number of processes to run. By default, this will use the number of CPUs available. To force serial execution, specify a value of 1.

This suggested that -n controlled the total number of processes available for all tests to share, rather than the number of concurrent tests. I think a lot of my confusion stems from misunderstanding this. Saying "Number of concurrent tests to run" or something similar would be helpful.

Understood about testflo not supporting different N_PROCS for each test. That would be a really nice feature but I can see complications there, since for example tests with different numbers of procs may call the same setUp().

Feel free to close this issue.

naylor-b commented 4 years ago

I put up a PR that changes the help text to:

  -n NUM_TEST_PROCS, --numprocs NUM_TEST_PROCS
                        Number of concurrent test processes to run. By default, this will use the number of virtual processors
                        available. To force tests to run consecutively, specify a value of 1.
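
As a sketch of how an option with this help text might be declared using argparse: this is an assumption for illustration, not testflo's actual source. The build_parser function, the prog name, and the dest name are all hypothetical.

```python
import argparse
import multiprocessing


def build_parser():
    """Hypothetical parser mirroring the revised -n help text."""
    parser = argparse.ArgumentParser(prog="testflo-sketch")
    parser.add_argument(
        "-n", "--numprocs",
        type=int,
        metavar="NUM_TEST_PROCS",
        dest="num_test_procs",
        # Default to the number of virtual (logical) processors.
        default=multiprocessing.cpu_count(),
        help="Number of concurrent test processes to run. By default, "
             "this will use the number of virtual processors available. "
             "To force tests to run consecutively, specify a value of 1.",
    )
    return parser


args = build_parser().parse_args(["-n", "1"])
print(args.num_test_procs)  # 1: tests run one after another
```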

OpenMDAO / testflo

Clarify concurrency settings #43