GeoIPS integration tests are slow because they require running GeoIPS end-to-end for each test. We will improve the integration test execution time by implementing a method for running the tests in parallel.
We will research methods of running GeoIPS jobs in parallel on a single system. This could be as simple as a Python script that calls GeoIPS multiple times via subprocess, multithreading, or multiprocessing. It could also take the form of shell scripts or employ other languages or off-the-shelf tools if appropriate.
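As a starting point for that research, here is a minimal sketch of the "Python script that calls GeoIPS multiple times via subprocess" option. Everything in it is an assumption for illustration: the `run_job` and `run_jobs` helpers are hypothetical, and the commands are placeholders standing in for the real integration test scripts. Threads are enough here because each worker only waits on a child process.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(name, command):
    """Run one command to completion; return (name, return code, combined output)."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so each job has one log
        text=True,
    )
    return name, proc.returncode, proc.stdout


def run_jobs(jobs, max_workers=2):
    """Run the {name: command} jobs, at most max_workers at a time.

    Results come back in submission order, not completion order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_job, name, cmd) for name, cmd in jobs.items()]
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Placeholder commands (hypothetical); a real runner would launch the
    # integration test scripts here instead.
    jobs = {
        "test_b": [sys.executable, "-c", "print('b done')"],
        "test_a": [sys.executable, "-c", "print('a done')"],
    }
    for name, code, output in run_jobs(jobs):
        print(name, code, output.strip())
```

This uses only the standard library, so it would add nothing to the dependencies. Whether it beats a shell-based or off-the-shelf approach is still an open question for this issue.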
Scope
This should be implemented entirely within the GeoIPS package. Any additional dependencies should be installable by including them in the dependencies defined in pyproject.toml. No assumptions should be made about the available hardware or availability of processing queues and other non-standard software.
Parallelization, for this issue, should be achieved by running multiple GeoIPS jobs in parallel on a single system. It should not be achieved by parallelizing the actual GeoIPS code or submitting GeoIPS jobs to a distributed or cluster processing queue.
Goal
Improve the speed of the integration tests by allowing them to run in parallel. This should result in a CLI option for the integration test scripts that specifies the number of parallel jobs to execute. Results from each job will need to be collected and reported to the top-level log file in a repeatable way, so that the tests appear in the same order regardless of the order in which they actually execute.
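One possible shape for the CLI option, sketched with `argparse`. The flag name `-j/--jobs`, the default of 1, and the helpers `positive_int`, `build_parser`, and `effective_jobs` are all assumptions for illustration, not an agreed interface. Defaulting to a single job keeps the current serial behavior and makes no assumption about the hardware.

```python
import argparse


def positive_int(value):
    """argparse type: accept only integers >= 1."""
    number = int(value)
    if number < 1:
        raise argparse.ArgumentTypeError("must be at least 1")
    return number


def build_parser():
    """Build a parser with a hypothetical -j/--jobs flag."""
    parser = argparse.ArgumentParser(description="Run integration tests (sketch).")
    parser.add_argument(
        "-j",
        "--jobs",
        type=positive_int,
        default=1,  # serial by default, so existing behavior is unchanged
        help="number of tests to run in parallel",
    )
    return parser


def effective_jobs(requested, n_tests):
    """Never start more workers than there are tests (and always at least one)."""
    return max(1, min(requested, n_tests))
```

With this sketch, `build_parser().parse_args(["--jobs", "4"]).jobs` would be 4, and omitting the flag would run one test at a time.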
When complete, what is new?
[ ] Integration tests can run in parallel
[ ] Integration test scripts provide a CLI flag to specify number of parallel jobs
[ ] Job results are all collected into a single top-level log file
[ ] Top-level log file reports test results in a repeatable order (alphabetized by test name?) rather than in the order of completion
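The last two items could be met by sorting the collected results before writing them out. The sketch below assumes, for illustration only, that each result is a `(test name, return code)` pair, that a nonzero return code means failure, and that alphabetical order by test name is the chosen ordering. The helper names and the log line format are hypothetical.

```python
from pathlib import Path


def format_summary(results):
    """Return summary lines sorted by test name, independent of completion order."""
    lines = []
    for name, returncode in sorted(results, key=lambda item: item[0]):
        # Assumption: a nonzero return code means the test failed.
        status = "PASS" if returncode == 0 else f"FAIL (exit code {returncode})"
        lines.append(f"{name}: {status}")
    return lines


def write_summary(results, log_path):
    """Write the sorted summary to a single top-level log file."""
    lines = format_summary(results)
    Path(log_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Two runs that finish their tests in different orders would then produce identical summaries, which keeps the top-level log easy to diff between runs.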