Open LongyuZhang opened 2 weeks ago
Tested creating Pingperf checkpoint images with the 0.46 release (grinder link):
docker-na.artifactory.swg-devops.com/sys-rt-docker-local/grinder/pingperf_17-openj9-ubi-9-linux_ppc-64-sw.os.rhel.8-hw.arch.ppc64le.p10:42970
docker-na.artifactory.swg-devops.com/sys-rt-docker-local/grinder/pingperf_17-openj9-ubi-8-linux_ppc-64-sw.os.rhel.8-hw.arch.ppc64le.p10:42969
then ran these images inside a podman container, running the following commands multiple times:
/opt/ol/wlp/bin/server start
/opt/ol/wlp/bin/server stop
/opt/ol/wlp/bin/server run defaultServer
Output is:
sh-4.4$ /opt/ol/wlp/bin/server start
Starting server defaultServer.
CWWKE0953W: This version of Open Liberty is an unsupported early release version.
Server defaultServer started with process ID 1026.
sh-4.4$ /opt/ol/wlp/bin/server stop
Stopping server defaultServer.
Server defaultServer stopped.
sh-4.4$ /opt/ol/wlp/bin/server run defaultServer
[AUDIT ] Launching defaultServer (Open Liberty 24.0.0.9-beta/wlp-1.0.92.cl240820240729-1903) on Eclipse OpenJ9 VM, version 17.0.12+7 (en_US)
[AUDIT ] CWWKT0016I: Web application available (default_host): http://363c9c36ea31:9080/pingperf/
[AUDIT ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.061 seconds.
[AUDIT ] CWWKZ0001I: Application pingperf started in 0.062 seconds.
[AUDIT ] CWWKF0012I: The server installed the following features: [cdi-3.0, concurrent-2.0, jndi-1.0, jsonp-2.0, restfulWS-3.0, restfulWSClient-3.0, servlet-5.0].
[AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.067 seconds.
^C[AUDIT ] CWWKE0085I: The server defaultServer is stopping because the JVM is exiting.
[AUDIT ] CWWKE1100I: Waiting for up to 30 seconds for the server to quiesce.
[AUDIT ] CWWKT0017I: Web application removed (default_host): http://363c9c36ea31:9080/pingperf/
[AUDIT ] CWWKZ0009I: The application pingperf has stopped successfully.
I was not able to reproduce the error. What extra tests do we need to run to trigger the SCC (shared classes cache) issue?
How many iterations did you run?
@tjwatson FYI
Around 10 iterations at first. I have since run start and stop 50 times, with the same result.
Do you have a link to the Dockerfiles that you are using for the test?
We first build the Semeru image using https://raw.githubusercontent.com/ibmruntimes/semeru-containers/ibm/17/jdk/ubi/ubi9/Dockerfile.open.releases.full, then build the Open Liberty image on top of it using https://github.com/OpenLiberty/ci.docker/blob/main/releases/latest/beta/Dockerfile.ubi.openjdk21. Finally, we build the Pingperf checkpoint image on top of that. Detailed steps are in https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/42970/consoleFull
@tjwatson Do you know what we are doing differently from Liberty testing?
Our automated testing does not use container images. Instead, it starts and stops various servers that all use the same shared classes cache. But we have various other reports of the scripts used to build an application image also failing, such as the configure.sh script, which starts and stops the server many times.
@tjwatson Could you point us to the automated test that identified this issue? We’re interested in exploring the possibility of incorporating it into our testing pipeline to catch such issues earlier.
@tjwatson could you provide us some more info? Thanks
Based on issue https://github.com/eclipse-openj9/openj9/issues/20012, Open Liberty uses the established shared classes cache for multiple servers, so we need to extend the Pingperf test to loop the server start/stop sequence several times inside the container to validate the built shared classes cache. FYI @tajila @llxia
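For reference, the repeated start/stop cycle discussed in this thread could be scripted roughly as below. This is only a sketch, not the actual Pingperf test change: the `cycle_server` function and the `SERVER_BIN` variable are hypothetical names introduced here, and the default path is the one used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Pingperf test): cycle a Liberty
# server through start/stop a given number of times.
# SERVER_BIN defaults to the path used earlier in this thread; override it
# when the server script lives elsewhere.
SERVER_BIN="${SERVER_BIN:-/opt/ol/wlp/bin/server}"

cycle_server() {
    iterations="$1"
    server_name="${2:-defaultServer}"
    i=1
    while [ "$i" -le "$iterations" ]; do
        # Stop at the first failure so the failing iteration is easy to spot.
        if ! "$SERVER_BIN" start "$server_name"; then
            echo "start failed on iteration $i" >&2
            return 1
        fi
        if ! "$SERVER_BIN" stop "$server_name"; then
            echo "stop failed on iteration $i" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $iterations start/stop cycles"
}
```

Inside the container this would be invoked as `cycle_server 50` (or `cycle_server 50 defaultServer`). It returns non-zero on the first failed start or stop, so a CI step can fail fast and report which iteration broke.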