charmplusplus / charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
Apache License 2.0

Cut long-running tests & examples out of 'make test' #595

Open PhilMiller opened 9 years ago

PhilMiller commented 9 years ago

Original issue: https://charm.cs.illinois.edu/redmine/issues/595


Identify tests/examples that take substantial time to run in the current make test. Either find parameters that make them run faster, or remove them from make test if they don't exercise features not covered by other tests. If necessary, add a separate make perftest target that runs them with parameters that would generate meaningful benchmark numbers.

[Attached image: Screen Shot 2014-11-03 at 11 30 59 PM]

xiangni commented 5 years ago

Original date: 2014-11-04 05:15:49


Running all the tests takes only 80s. Here is the time breakdown (in seconds) for each directory: AMPI 8, Charm++ 37, Converse 34, FEM 0.2, Util 0.02.

However, there are tests in the test directory that are never run by the autobuild. Some of them even crash. Attached is a file listing them, along with the name of the person who either wrote each test or made the last significant modifications to it.

AMPI

    Test                     Status            Author / notes
    chkpt                    fails             Gengbin
    fallreduce               compile fails     Gengbin
    jacobi3d                 no test target    Esteban
    mpich-test               no test target
    speed                    works!            Gengbin

Charm++

    Test                     Status            Author / notes
    kNeighbor                no test target    Chao
    array4D                  crash             Abhinav
    pmetest                  compile fails     Sameer
    arrayPerf                works!            Sameer
    python                   compile fails     Filippo
    topology                 works!            Abhinav
    broadcast                works!            Phil
    io                       no test target    Phil
    reductiontesting         no test target    Rahul
    jacobi3d                 no test target    used for fttest
    commSpeed                works!            Terry
    jacobi3d-gauss           no test target    Yanhua
    penciltest               compile fails     Sameer
    commtest                 crash             Abhishek
    jacobi-sdag              no test target    used for fttest
    ping                     crash             Yanhua
    startuptest              compile fails     Eric
    hello-crosscorruption    hangs

Note: for the formatted version, please see the attached file.
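A per-directory breakdown like the one above could be gathered with a small timing wrapper. The following is only a sketch: the helper names `time_command` and `time_directories` are hypothetical, and it assumes each directory can be exercised by running one command (for example `make test`) from inside it, which the thread does not confirm for every directory.

```python
import subprocess
import time


def time_command(cmd, cwd=None):
    """Run cmd, discard its output, and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode


def time_directories(dirs, cmd):
    """Time the same command in each directory; returns {dir: (seconds, rc)}."""
    return {d: time_command(cmd, cwd=d) for d in dirs}


def format_report(timings):
    """Render one line per directory, slowest first, flagging non-zero exits."""
    lines = []
    for name, (seconds, rc) in sorted(
        timings.items(), key=lambda item: item[1][0], reverse=True
    ):
        status = "ok" if rc == 0 else "exit %d" % rc
        lines.append("%-12s %8.2fs  %s" % (name, seconds, status))
    return "\n".join(lines)
```

Calling `time_directories(["ampi", "charm++", "converse"], ["make", "test"])` from the tests directory would be one way to use it; those directory names are an assumption about the layout. A non-zero return code is reported rather than raised, so a crashing test does not stop the remaining timings.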

PhilMiller commented 5 years ago

Original date: 2014-11-04 20:21:16


Within tests/charm++ and tests/converse, there seem to be just a few things that make up the bulk of the roughly 30 seconds each takes, and that add little in the way of correctness testing. The most noticeable is tests/charm++/queue/msgtest, with pgm in the same directory also being long-ish. tests/charm++/communication_overhead is similar. It's worth testing that the runtime doesn't fall over with long queues or large messages, but we don't need to hammer on them for nearly so long. charm++/taskSpawn{,Recursive} take 8 and 4 seconds, respectively; their parameters could be reduced to shorten the runs.

xiangni commented 5 years ago

Original date: 2014-11-04 21:49:25


Total time to run examples is 46s: charm++: 13s, converse: 0.317s, ampi: 32s, armci: 1.1s.

stwhite91 commented 5 years ago

Original date: 2016-03-08 22:13:41


On my lab machine with a 'netlrts-linux-x86_64 --with-production' build, the total time to run 'tests' is 71s: charm++: 39s, converse: 22s, ampi: 9s, fem: 0.2s, util: 0.01s.

The total time to run 'examples' is 31s: charm++: 28s, converse: 0.1s, ampi: 2s, armci: 0.7s.

Are these runtimes acceptable or not? Also, the perftest/ directory doesn't look like it does anything at all ... should we remove it?

ericjbohm commented 5 years ago

Original date: 2017-04-13 21:04:11


Here is an individual test time breakdown for make test with a netlrts-linux-x86_64 smp --with-production build on intellect: https://docs.google.com/a/illinois.edu/spreadsheets/d/1xpCPCNWrrEIZmHGtAC_K1bkor2SOIDuxhp7kkbZq0x8/edit?usp=sharing

359s total time. However, the wall-clock time is somewhat larger, as only the execution time of each testrun is recorded.

Only 5 tests require more than 10s.

    Test                     Time (s)
    commbench/pgm            86.007
    machinetest/multiping    19.88
    megampi/pgm              16.885
    Cjacobi3D/jacobi         14.72
    kNeighbor/kNeighbor      10.774

So we have one major issue, in that commbench can take a long time. Then we have a minor issue, in that each test takes over a second, so having over ninety of them means there is at least another minute and a half.
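The arithmetic behind those two issues can be checked directly from the numbers quoted in this comment. This is a sketch only: the per-test times and the 359s total are taken from the comment, while the figure of exactly 90 tests at 1 second each is an assumed lower bound standing in for "over ninety" tests taking "over a second".

```python
# Per-test times in seconds, as quoted in the comment above.
slow_tests = {
    "commbench/pgm": 86.007,
    "machinetest/multiping": 19.88,
    "megampi/pgm": 16.885,
    "Cjacobi3D/jacobi": 14.72,
    "kNeighbor/kNeighbor": 10.774,
}
TOTAL_SECONDS = 359.0  # total recorded testrun time from the comment


def over_threshold(times, threshold):
    """Return (name, seconds) pairs above threshold, slowest first."""
    slow = [(name, t) for name, t in times.items() if t > threshold]
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# Time spent in the five tests that exceed 10 seconds.
slow_total = sum(slow_tests.values())

# Share of the total taken by the single slowest test.
slowest_name, slowest_time = over_threshold(slow_tests, 10.0)[0]
slowest_share = slowest_time / TOTAL_SECONDS

# Assumed floor: 90 tests at 1 second each (both are lower bounds).
ASSUMED_TEST_COUNT = 90
ASSUMED_SECONDS_PER_TEST = 1.0
floor_seconds = ASSUMED_TEST_COUNT * ASSUMED_SECONDS_PER_TEST

print("slow tests total: %.3f s" % slow_total)
print("%s share of total: %.1f%%" % (slowest_name, 100 * slowest_share))
print("per-test floor: %.0f s" % floor_seconds)
```

Under these assumptions, the five slow tests account for well under half of the recorded time, so the per-test floor matters roughly as much as any single test other than commbench.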

stwhite91 commented 5 years ago

Original date: 2017-04-13 21:15:01


We want megampi/pgm and Cjacobi3D/jacobi to run for more than a few seconds to test AMPI messaging/collectives/migration all in one go.

I think commbench and multiping can be cut down in runtime a bit...

ericjbohm commented 5 years ago

Original date: 2017-04-13 22:08:46


The two commbench tests which take the longest are pingpong and flood. Each of these has entirely hard-coded parameters. The iteration counts hard-coded into these tests appear to be massive overkill relative to the number required for reasonable accuracy.

Reducing them by an order of magnitude (or by a factor of 2 for relatively small counts, <1e2) cuts runtime down by a factor of 3.
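The reduction rule described here can be written down as a small function. This is an illustrative sketch, not the change that was made to commbench: the name `reduced_iterations` is hypothetical, and the use of integer division with a floor of one iteration is an assumption added so that the result is always a usable loop count.

```python
def reduced_iterations(count, small_limit=100):
    """Shrink a hard-coded iteration count.

    Counts below small_limit (1e2, per the comment) are halved;
    larger counts are cut by an order of magnitude. The result is
    never allowed to drop below 1 (an assumption of this sketch).
    """
    if count < small_limit:
        return max(1, count // 2)
    return max(1, count // 10)


# Example iteration counts; these values are made up for illustration.
for original in (50, 1000, 100000):
    print(original, "->", reduced_iterations(original))
```

Keeping the rule in one place would make it easy to restore the original counts for a benchmarking run, in the spirit of the separate perftest target proposed at the top of this issue.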

evan-charmworks commented 3 years ago

megacon, Converse pingpong and pingpong_multipairs, examples/charm++/load_balancing/kNeighbor, and benchmarks/charm++/kNeighbor are tests that take more than a minute on our MPI Linux SMP CI.