General questions to orient possible contributions

Helveg commented 3 years ago

Hi there!

Nice project! One of my projects is starting to explode in complexity, and now I've made it worse for myself by adding an MPI layer. I run into ALL kinds of issues synchronizing tests among MPI processes using Python's unittest module. I'm considering switching to pytest and your plugin. Some questions:

Does pytest run tests in a deterministic order? Any invisible threading or parallelism going on or anything that interferes with assumptions made when debugging MPI applications?
Is there any support or API to track which tests have been run in what order? This can be important when certain MPI processes get deadlocked because other MPI processes are running other tests or are stuck in extra Barrier calls from a different test function.
How does pytest deal with MPI? Graceful error handling?
What is your personal experience and setup to make sure that each test function is truly contained across MPI processes? (As mentioned earlier, imagine 1 process doing an extra barrier and all the other processes continuing and getting stuck elsewhere.)
Are there any timeout mechanisms to make sure tests get aborted if they run too long? (On a per test basis would be a plus)

I know it's a ton of questions! But I'm seriously considering switching and making hefty contributions to this package where MPI users across the world need them ;p I think I can contribute quite well, given my experience with other MPI tooling attempts: a pool implementation and an across-MPI locking package with some neat read/write/collective priority locking.

aragilar commented 3 years ago

The main purpose of pytest-mpi has been to assist with using MPI with h5py's tests (ensuring that tests that require MPI only run under MPI, those that do not work under MPI are not run under MPI etc.). It hasn't so far dealt with the more intricate parts of MPI (the most complex thing it's done is provide tempdir/tempfile fixtures that work with MPI).

As for your questions:

As far as I know, the ordering of tests is deterministic (there's a pytest plugin which makes them random—coupling this with tox which sets a base seed should make this MPI-safe, though I have not tested that).
I haven't encountered any invisible threading or parallelism; I think those are also left to other plugins (e.g. xdist).
Tracking tests could likely be done with a plugin (which could be part of pytest-mpi).
I've been running mpirun -n <n> python -m pytest <pytest-args>, so I'm effectively not using pytest's error handling, and deadlocks are an issue (improvements in this area would be useful). Flipping this so pytest calls MPI may mean that other pytest plugins which handle timeouts etc. could be used, but I've not experimented with how that sets up Python.
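
The test tracking mentioned above could be sketched as a small conftest.py hook. This is a hypothetical sketch, not part of pytest-mpi: pytest_runtest_logstart is a standard pytest hook, while the current_rank and record helpers and the reliance on launcher environment variables (OMPI_COMM_WORLD_RANK for Open MPI, PMI_RANK for MPICH-style launchers) are assumptions made for illustration.

```python
# conftest.py (hypothetical sketch, not part of pytest-mpi)
import os
import sys
import time

# Assumption: the launcher exports the rank in one of these variables
# (Open MPI sets OMPI_COMM_WORLD_RANK, MPICH's Hydra sets PMI_RANK).
_RANK_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK")


def current_rank(environ=os.environ):
    """Return this process's MPI rank, or 0 when no rank variable is set."""
    for var in _RANK_VARS:
        if var in environ:
            return int(environ[var])
    return 0


def record(nodeid, rank, stream):
    """Write one line per started test and flush it immediately."""
    stream.write(f"[rank {rank}] {time.strftime('%H:%M:%S')} START {nodeid}\n")
    stream.flush()


def pytest_runtest_logstart(nodeid, location):
    # Standard pytest hook, called when pytest starts running a test item.
    record(nodeid, current_rank(), sys.stderr)
```

Run under mpirun -n <n> python -m pytest <pytest-args>, each rank would print its own sequence of START lines, so the last line per rank hints at where a deadlocked process stopped.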

I've been meaning to move this to the pytest-dev organisation, so that it's more obvious how to contribute, but that's fallen off my todo list for now.

Helveg commented 3 years ago

Ok cool, thanks for your response! I'll start on a test-tracker and timeout feature for deadlocks for this plugin! Seems like a useful addition.

As for "Flipping this so pytest calls MPI":

I think that could be done if MPI_Comm_spawn is available: pytest could just spawn the n desired processes running a test and collect the results.
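
A minimal sketch of the "pytest calls MPI" direction, using a plain subprocess instead of an MPI spawn call (a swapped-in technique, chosen here only because it needs nothing beyond the standard library). The function names, the mpirun launcher name, and the default timeout are assumptions; the command line mirrors the mpirun -n <n> python -m pytest <pytest-args> invocation quoted earlier in the thread.

```python
import subprocess
import sys


def build_command(nprocs, pytest_args, launcher="mpirun"):
    """Build the launcher command line for one MPI test run."""
    return [launcher, "-n", str(nprocs), sys.executable, "-m", "pytest", *pytest_args]


def run_under_mpi(nprocs, pytest_args, timeout=300.0):
    """Run pytest under MPI; return the exit code, or None on timeout."""
    try:
        # subprocess.run kills the launcher process when the timeout expires.
        # Whether the individual ranks are cleaned up as well depends on the
        # MPI implementation, so treat this as a coarse safety net only.
        result = subprocess.run(build_command(nprocs, pytest_args), timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return result.returncode
```

For example, run_under_mpi(4, ["tests/test_io.py"], timeout=60) would give a deadlocked run at most a minute before the launcher is killed (the test path is a placeholder).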

ewu63 commented 3 years ago

FWIW, there is another MPI testing tool out there called testflo, which only supports unittest unfortunately. But it does spawn MPI processes itself (so each test is using the specified number of processors) and handles a lot of the errors nicely. It has timeouts and some other features too like memory profiling. We have been using it successfully for some time now, but we're also evaluating other options such as pytest-mpi which could provide other features in pytest that aren't available to us at the moment.

Some of the nice-to-have features that we'd be interested in are:

1. The ability for pytest-mpi to spawn the MPI processes (as done by testflo) instead of calling pytest via mpirun. As far as I can tell, this is the only way to ensure that each test is being run with its own specified number of processors.
2. Some sort of job scheduling ability instead of just spawning as many processes as needed, despite possibly oversubscribing. This may be outside the scope of these plugins though.
3. Easy parameterization of running the same test with different numbers of processors.

The latter two are not available with testflo yet, so we're just keeping an eye out for any developments elsewhere that could prompt a switch.

aragilar commented 3 years ago

@nwu63 Cool, didn't know about testflo. I don't have much time for major development on pytest-mpi, but if you want to add support for calling tests under MPI, I'm happy to look at PRs. I suspect point 2 on your list would be quite complex, as you need to have something which runs the tests in parallel but also tracks the number of MPI processes.

I suspect if you got point 1 working, then you could reuse pytest's parametrisation framework to get 3 for free.
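
The parametrisation idea could look roughly like the following conftest.py hook. pytest_generate_tests and metafunc.parametrize are standard pytest API; the nprocs argument name and the list of process counts are hypothetical, and actually launching each test on that many processes is the unsolved point 1, which this sketch does not address.

```python
# conftest.py (hypothetical sketch, not part of pytest-mpi)

# Assumption: the process counts every MPI-aware test should be tried with.
PROCESS_COUNTS = (1, 2, 4)


def pytest_generate_tests(metafunc):
    # Standard pytest hook: parametrize any test that asks for `nprocs`.
    if "nprocs" in metafunc.fixturenames:
        metafunc.parametrize("nprocs", PROCESS_COUNTS)
```

A test written as def test_scatter(nprocs): ... would then be collected once per entry in PROCESS_COUNTS, while tests that do not request nprocs are left alone.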

ewu63 commented 3 years ago

Yeah I agree with everything you said there @aragilar. Unfortunately I'm also quite busy these days, but I'll keep this repo in mind and perhaps contribute code in the future. Cheers.
