Proposal : BigMix - Run a combined STF load

Mesbah-Alam commented 3 years ago

The MiniMix load test runs a mixed load of bigdecimal(math), mauve, lang, nio, and concurrent loads.

@pshipton suggested that we can expand this "MiniMix" example and have a similar but larger "BigMix" load, which combines loads run by some of the other individual 5m loads (such as DAA, Math, etc.).

This new BigMix can then be run for a longer period of time (e.g. 10m, 20m, etc.). Moreover, it can replace the individual 5m variants, whose loads will have been incorporated into the BigMix.

Potential loads for BigMix may include the following (..and more):

Note: The "BigMix" may not include STF tests that need to run for specific features such as SharedClasses, Modularity, etc. It may only combine loads that are already running some form of "stress" loads.

Pros:

System test builds become faster.
We run a combined load for a long period of time which should exercise the JIT in a more "realistic" way (potential for new bugs).

Cons:

We might lose on coverage by not running each load separately for 5m. Need to investigate this further to know for sure.

@llxia @lumpfish @ShelleyLambert @pshipton - Requesting your comments on the above.

lumpfish commented 3 years ago

Some history.....

The reason 'MiniMix' exists (it started out life as 'Mix' and became 'MiniMix') is that there is a need to provide a general workload so that the JVM is busy while other testing is being performed - e.g. the java.lang.management API tests. 'Mix' turned out to be too unstable - there was often an issue which caused the workload to fail which effectively blocked the tests which just want a background workload.

The reason the workload tests from different functional areas exist (and were run on an iteration basis rather than a timed basis) was to make it easier for 'simple' issues (those not requiring a complex mixed workload) to be debugged. If any of those tests failed, any 'Mix' containing them would be bound to fail also.

If only 'Mix' is run there is the danger that no systemtests will pass for significant periods of time.

Please provide a capacity planning estimate of the time required to run the proposed new workloads (workloads x duration x modes) so that the impact on available test machine resources can be assessed.

pshipton commented 3 years ago

Assuming Mix starts in a stable or mostly stable state, or we can achieve that state, we would endeavor to keep it that way. Having a test that doesn't take too long means it could be run on every applicable change before the change is merged. As it stands the special.system testing takes ~5+ hours (with 5x parallelization).

We could also run the various workloads with a small number of iterations (is 1 or 2 sufficient?) to catch simple problems.

pshipton commented 3 years ago

The idea is to stop running 6 (or however many) things in (let's say) 10 modes, and run 1 thing in those 10 modes instead. The end result is less time taken to test. Another goal, as previously stated, is to be able to run the testing before every merge.
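
As a rough illustration of that arithmetic (a sketch, not a measurement): the counts of 6 loads and 10 modes are the "let's say" figures above, the 5m and 20m durations are taken from the proposal, and the function name is hypothetical. Setup, teardown, and parallelization are ignored.

```python
def total_minutes(loads: int, modes: int, minutes_per_run: int) -> int:
    """Total load run time when every load runs once in every mode."""
    return loads * modes * minutes_per_run


# Illustrative figures from this thread, not measured values.
individual = total_minutes(loads=6, modes=10, minutes_per_run=5)
big_mix = total_minutes(loads=1, modes=10, minutes_per_run=20)

print(f"individual 5m loads: {individual} min")
print(f"BigMix at 20m:       {big_mix} min")
print(f"difference:          {individual - big_mix} min")
```

Under these assumptions the individual loads total 300 minutes and a 20m BigMix totals 200 minutes. The saving shrinks as the BigMix duration grows, and disappears at 30m per mode.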

Mesbah-Alam commented 3 years ago

> there is a need to provide a general workload so that the JVM is busy while other testing is being performed - e.g. the java.lang.management API tests.

In my understanding, tests such as JLM and SharedClasses, which follow the model of running a "workload" to occupy the JVM and then running tests while that workload is in progress, will not be added to this "Mix". They will keep running as-is.

adoptium / aqa-tests

Proposal : BigMix - Run a combined STF load #2389