adoptium / openj9-systemtest

Long running J9 tests

Reduce cache to 1/4th its size so cache is 100% full at final stage #70

Closed Mesbah-Alam closed 5 years ago

Mesbah-Alam commented 5 years ago

Reduce cache to 1/4th its size so cache is 100% full at final stage

Signed-off-by: Mesbah_Alam@ca.ibm.com

Mesbah-Alam commented 5 years ago

Related : https://github.com/eclipse/openj9/issues/4139

Mesbah-Alam commented 5 years ago

Tested : https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/814/tapResults/

Mesbah-Alam commented 5 years ago

Background :

In all the newly failing cases, at the point where the cache is expected to be more than 50% full (relative to CACHESIZE_SOFTLIMIT_INTERMIEDIATE_INCREASED_MB=25mb), it ends up very close to that threshold (e.g. 48% full) but not actually above 50%.

The value of "CACHESIZE_SOFTLIMIT_INTERMIEDIATE_INCREASED_MB" was recently raised from 20mb to 25mb, which unfortunately turns out to be slightly too much on certain platforms. Note that it was raised from its original value (20mb) after the opposite problem was observed, i.e. the cache being 100% full when it is expected to be above 50% but below 100% (the original problem for which this issue was raised). So decreasing the "intermediate increased softlimit" below 25mb might lead us back to that original issue.

The use case for this test ends with a final step: after the cache content has gone above 50%, the test resets the softlimit to half of its size (e.g. 25mb/2) in order to create a situation where the cache is certain to be 100% full. It then tries to write to the cache, which generates an error confirming that the cache is 100% full.

A new thing to calibrate is the final step: instead of reducing the cache size to 1/2 of its current size, reduce it to an even smaller size, e.g. 1/4th. This still results in the desired state (the cache is 100% full after the size reduction), so the use case requirement is met. This way we can drop altogether the check that requires the cache to be above 50% and below 100% full at the middle stage.
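The arithmetic behind this proposal can be sketched as follows. This is only an illustration, not the test's actual code: the class and method names are hypothetical, and the model (percent full is the used size divided by the softlimit, capped at 100%) is an assumption. The 25mb softlimit and the 48% reading are the figures quoted above.

```java
// Illustrative arithmetic only; names and the percent-full model are assumptions,
// not taken from the openj9-systemtest sources.
class SoftLimitSketch {

    /** Percent full relative to the softlimit, capped at 100 (assumed model). */
    static double percentFull(double usedMb, double softLimitMb) {
        return Math.min(100.0, usedMb * 100.0 / softLimitMb);
    }

    public static void main(String[] args) {
        double softLimitMb = 25.0; // intermediate increased softlimit quoted in the thread
        double usedMb = 12.0;      // the failing case: 48% of 25mb

        // Middle stage: just short of the 50% threshold the old check demanded.
        System.out.println("at 25mb softlimit: " + percentFull(usedMb, softLimitMb) + "%");

        // Old final stage: halving the softlimit leaves a little free space.
        System.out.println("at 1/2 softlimit:  " + percentFull(usedMb, softLimitMb / 2) + "%");

        // Proposed final stage: quartering the softlimit puts the content well over the limit.
        System.out.println("at 1/4 softlimit:  " + percentFull(usedMb, softLimitMb / 4) + "%");
    }
}
```

Under this model, quartering the softlimit gives a full cache for any middle-stage fill of 25% or more, so a reading slightly under 50% no longer matters.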

(Note: it is likely that some changes made in the JIT (?) are causing fluctuations in the amount of cache content being generated, which is why we keep having to re-calibrate the test.)

Mesbah-Alam commented 5 years ago

Hi Peter, could you please review and merge this PR?

@pshipton

pshipton commented 5 years ago

Looks good now, although we should check the test is passing on all platforms before merging.

Mesbah-Alam commented 5 years ago

Test passing on:

Windows: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/820/
s390 linux: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/819/
ppc64le linux: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/818/
x86-64 linux: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/816/
osx (5x Grinder): https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/815/
openjdk_ppc64_aix: tested to pass in internal Jenkins