InsightSoftwareConsortium / ITK

Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.
https://itk.org
Apache License 2.0
1.47k stars 686 forks source link

COMP: tagged some slow tests to hopefully allow them to pass on cdash #5172

Closed seanm closed 2 weeks ago

blowekamp commented 2 weeks ago

For reference here is the run for the nightly Valgrind, that is publicly managed in the SimpleITK account: https://github.com/SimpleITK/SimpleITK/actions/runs/12944566489/job/36105670713 https://github.com/SimpleITK/SimpleITK/blob/master/.github/workflows/Nightly.yml#L111-L145

I believe these tests are already excluded by the above script as they are marked as "RUNS_LONG". For the valgrind I explicitly set the ITK number of threads to only 2, and use a many core system to try to run valgrind as efficiently as possible. The changes do look fine, but in general I have a concern about to many serial/resource lock tests decreasing the efficiency of a many cored system.

seanm commented 2 weeks ago

I had asked about RUNS_LONG during the hackathon yesterday, and understood that that doesn't actually make a test run alone by itself.

The difficulty is that some tests themselves are highly multithreaded and if run with other tests via set(CTEST_TEST_ARGS PARALLEL_LEVEL 10) then everything gets slow, and timeouts are hit.

I'm open to other ways to get these passing https://open.cdash.org/viewTest.php?onlyfailed&buildid=10165388 instead of timing out...

blowekamp commented 2 weeks ago

The difficulty is that some tests themselves are highly multithreaded and if run with other tests via set(CTEST_TEST_ARGS PARALLEL_LEVEL 10) then everything gets slow, and timeouts are hit.

Agreed. That is why I set ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS to 2 for the valgrind build.

I deal with this the same issues on clusters concurrently batch processing images. It is more efficient CPU utilization to process more tasks (tests) with fewer threads, than a few tasks(tests) with many threads. They are different approaches which require different approaches. Bad things happen when you launch 48 tasks which each try to get 48 threads.

As I said these test are already marked long, and this tunning will not effect the already tuned valgrind build.

hjmjohnson commented 2 weeks ago

FYI: My informal testing of ITK algorithms with MRI brain imaging (i.e. 40mb images of brains for registration and filtering operations). This indicated that multithreading for that data size improved significantly up to 4 threads, improved some up to 6 threads, and then only marginally improved to about 10 threads when performance often started to decline.

The correct number of threads is heavily data and algorithm dependant. The "rules of thumb" above have served me well for getting most of the value out of multithreading without doing extensive or definitive optimizations.

hjmjohnson commented 2 weeks ago

FYI: I have used test fixtures to address these types of issues in other projects:

https://crascit.com/2016/10/18/test-fixtures-with-cmake-ctest/

seanm commented 2 weeks ago

Agreed. That is why I set ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS to 2 for the valgrind build.

Oh that's a good idea. I can try that. Is ITK_DEFAULT_MAX_THREADS basically the same? One is a cmake setting, the other is an env var?

blowekamp commented 2 weeks ago

ITK_DEFAULT_MAX_THREADS

Oh that's a good idea. I can try that. Is ITK_DEFAULT_MAX_THREADS basically the same? One is a cmake setting, the other is an env var?

For the purpose of testing I think the effect will be the same, but with the cmake variable set and built in, the binaries will never be able to use more threads than that. ( Historically this was the size of an array which kept track of the threads as I recall)

dzenanz commented 2 weeks ago

Yes. EnvVar has lowest priority.

seanm commented 2 weeks ago

OK I set ITK_DEFAULT_MAX_THREADS=2 on Rogue19. I've left CTEST_TEST_ARGS PARALLEL_LEVEL 8. It is an 8 core Apple M1. We'll see cdash tomorrow...

seanm commented 2 weeks ago

Unexpectedly, Rogue19 has more test failures today: https://open.cdash.org/viewTest.php?onlyfailed&buildid=10167705 Example:

593: Test command: /Users/builder/external/ITK-AppleClang-dbg-Universal/bin/ITKCommon2TestDriver "--remove-env" "ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS" "itkMultiThreadingEnvironmentTest" "88"
593: Working Directory: /Users/builder/external/ITK-AppleClang-dbg-Universal/Modules/Core/Common/test
593: Environment variables:
593:  NSLOTS=88
593: Test timeout computed to be: 900
593: ERROR: Wrong number of maximum number of threads set from environment. 88 != 2
1/1 Test #593: itkMultiThreadingEnvTest88 .......***Failed    0.08 sec
seanm commented 1 week ago

The tests look wrong to me. They seem to assume that ITK_DEFAULT_MAX_THREADS can't be small.

WIP patch: https://github.com/InsightSoftwareConsortium/ITK/pull/5191