llxia closed this issue 1 year ago
Trying to narrow down the issue in the Open Liberty repo. The test passes with commit 30caf5d0e1d71cc9efefc3bec250a6c72c084168 (Grinder link). The Grinder fails with commit 38c8145efc63adc3686466ef4a3322c20720330d, with a different error (Grinder link):
00:01:40.541 Successfully tagged localhost/ol-instanton-test-pingperf:latest
00:01:40.541 3ec5e2856fc1c033dc4052d1de5c6384d32a273acabb69671a389b0e07d9bacb
00:01:40.541 create restore image ol-instanton-test-pingperf-restore ...
00:01:54.907 Performing checkpoint --at=afterAppStart
00:01:55.324
00:03:21.531 CWWKE0954E: The specified (afterappstart) checkpoint phase is empty or unknown.
00:03:21.531
00:03:22.538 -----------------------------------
00:03:22.538 criu_pingPerf_testCreateRestoreImageAndPushToRegistry_0_FAILED
00:03:22.538 -----------------------------------
FYI @tajila
The PingPerf test is excluded temporarily on zlinux and alinux.
Also noticed exec /bin/sh: exec format error on plinux and zlinux:
00:08:13.437 [2/2] STEP 10/23: COPY fixes/ /opt/ol/fixes/
00:08:14.706 --> 7c4b2fa0f46
00:08:14.707 [2/2] STEP 11/23: RUN set -eux; ARCH="$(uname -m)"; case "${ARCH}" in aarch64|arm64) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_aarch64'; DUMB_INIT_SHA256=b7d648f97154a99c539b63c55979cd29f005f88430fb383007fe3458340b795e; ;; amd64|x86_64) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64'; DUMB_INIT_SHA256=e874b55f3279ca41415d290c512a7ba9d08f98041b28ae7c2acb19a545f1c4df; ;; ppc64el|ppc64le) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_ppc64le'; DUMB_INIT_SHA256=3d15e80e29f0f4fa1fc686b00613a2220bc37e83a35283d4b4cca1fbd0a5609f; ;; s390x) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_s390x'; DUMB_INIT_SHA256=47e4601b152fc6dcb1891e66c30ecc62a2939fd7ffd1515a7c30f281cfec53b7; ;; *) echo "Unsupported arch: ${ARCH}"; exit 1; ;; esac; curl -LfsSo /usr/bin/dumb-init ${DUMB_INIT_URL}; echo "${DUMB_INIT_SHA256} */usr/bin/dumb-init" | sha256sum -c -; chmod +x /usr/bin/dumb-init;
00:08:15.077 exec /bin/sh: exec format error
00:08:17.707 Error: building at STEP "RUN set -eux; ARCH="$(uname -m)"; case "${ARCH}" in aarch64|arm64) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_aarch64'; DUMB_INIT_SHA256=b7d648f97154a99c539b63c55979cd29f005f88430fb383007fe3458340b795e; ;; amd64|x86_64) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64'; DUMB_INIT_SHA256=e874b55f3279ca41415d290c512a7ba9d08f98041b28ae7c2acb19a545f1c4df; ;; ppc64el|ppc64le) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_ppc64le'; DUMB_INIT_SHA256=3d15e80e29f0f4fa1fc686b00613a2220bc37e83a35283d4b4cca1fbd0a5609f; ;; s390x) DUMB_INIT_URL='https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_s390x'; DUMB_INIT_SHA256=47e4601b152fc6dcb1891e66c30ecc62a2939fd7ffd1515a7c30f281cfec53b7; ;; *) echo "Unsupported arch: ${ARCH}"; exit 1; ;; esac; curl -LfsSo /usr/bin/dumb-init ${DUMB_INIT_URL}; echo "${DUMB_INIT_SHA256} */usr/bin/dumb-init" | sha256sum -c -; chmod +x /usr/bin/dumb-init;": while running runtime: exit status 1
00:08:17.708 -----------------------------------
00:08:17.708 criu_pingPerf_testCreateRestoreImageAndPushToRegistry_0_FAILED
The exec /bin/sh: exec format error happens because, when we build the Liberty image, we use the x86-64 Semeru image as a base rather than the aarch64/ppc64le/s390x images:
[2023-06-17T15:33:55.302Z] [2/2] STEP 1/23: FROM icr.io/appcafe/ibm-semeru-runtimes:open-17-ea-jdk-ubi-amd64
[2023-06-17T15:33:55.302Z] WARNING: image platform (linux/amd64) does not match the expected platform (linux/arm64)
Unfortunately, the Liberty Dockerfile at https://github.com/OpenLiberty/ci.docker/blob/main/releases/latest/beta/Dockerfile.ubi.openjdk17 references an architecture-specific base image:
FROM icr.io/appcafe/ibm-semeru-runtimes:open-17-ea-jdk-ubi-amd64
...
whereas the other Dockerfiles reference architecture-agnostic base images, e.g.:
FROM ibm-semeru-runtimes:open-17-jre-focal
...
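The mismatch described above can be checked before the build starts. The following is a minimal sketch, not part of the test scripts: the function names `normalize_arch` and `arch_matches` are hypothetical, and the architecture groupings are copied from the case statement in the dumb-init RUN step shown in the log. The image architecture could presumably be read with something like `podman image inspect --format '{{.Architecture}}' IMAGE`, but whether that fits the test machines is an assumption.

```shell
#!/bin/sh
# Map a `uname -m` style value to the architecture name used in platform
# strings such as linux/amd64 or linux/arm64.
# The groupings mirror the case statement in the Dockerfile RUN step.
normalize_arch() {
  case "$1" in
    aarch64|arm64)   echo arm64 ;;
    amd64|x86_64)    echo amd64 ;;
    ppc64el|ppc64le) echo ppc64le ;;
    s390x)           echo s390x ;;
    *)               echo unknown ;;
  esac
}

# Return 0 when the host and image architectures agree, non-zero otherwise.
# Usage: arch_matches <host-arch> <image-arch>
arch_matches() {
  host_arch="$(normalize_arch "$1")"
  image_arch="$(normalize_arch "$2")"
  [ "$host_arch" != "unknown" ] && [ "$host_arch" = "$image_arch" ]
}

# Example: an aarch64 host asked to build on top of the amd64 base image.
if arch_matches aarch64 amd64; then
  echo "platform match"
else
  echo "platform mismatch"
fi
```

In a real run the first argument would be `"$(uname -m)"`; a mismatch is the situation in which the exec format error above shows up.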
Re: exec /bin/sh: exec format error: the test seems to first build a Semeru image for the current platform and tags it as:
[2023-06-17T15:30:54.952Z] Successfully tagged localhost/local-ibm-semeru-runtimes:latest
[2023-06-17T15:30:55.393Z] 5f81856e35c2f07dde8b6df5734367f3583f5ce531d257cf50ce63011e926575
but when building the Liberty image we pull the amd64 image. I'm guessing we want to build the Liberty image on top of the local Semeru image, so we need to modify the FROM line in the Liberty image Dockerfile.
The ub22 thread Checkpoint failed error is the same problem. It looks different because the Liberty image somehow builds successfully, since the x86-64 binaries are able to run on aarch64:
[2023-06-17T15:28:21.602Z] ++ uname -m
[2023-06-17T15:28:21.602Z] + ARCH=x86_64
[2023-06-17T15:28:21.602Z] + case "${ARCH}" in
[2023-06-17T15:28:21.602Z] + DUMB_INIT_URL=https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64
[2023-06-17T15:28:21.602Z] + DUMB_INIT_SHA256=e874b55f3279ca41415d290c512a7ba9d08f98041b28ae7c2acb19a545f1c4df
[2023-06-17T15:28:21.602Z] + curl -LfsSo /usr/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64
I don't know how Docker on aarch64 works and how it can do the above, but once the image is built and we try to start the server it fails.
Possibly the aarch64 tests are running on a Mac? Google tells me x86-64 Docker images can run on Macs via some emulation magic, so maybe that explains how the ub22 test got much further?
Details about PingPerf tests:
The test uses the Dockerfile from the instanton branch, not the main branch: https://github.com/OpenLiberty/ci.docker/blob/instanton/releases/latest/beta/Dockerfile.ubi.openjdk17
The test replaces the FROM line to pick up the local Semeru image: https://github.com/adoptium/aqa-tests/blob/165701fe12cfd961db19d2e130e5ca6f20ae79bf/external/criu/pingPerf.sh#L97
I think the issue is due to a recent Liberty change, the switch to multi-stage builds:
https://github.com/OpenLiberty/ci.docker/commit/7aaf9c52f4aff99cf850d5fd37f83293afc773ed
The Liberty image Dockerfile now has two FROM lines:
https://github.com/OpenLiberty/ci.docker/blob/instanton/releases/latest/beta/Dockerfile.ubi.openjdk17#L1
https://github.com/OpenLiberty/ci.docker/blob/instanton/releases/latest/beta/Dockerfile.ubi.openjdk17#L27
Since we only replace the first FROM (icr.io/appcafe/ibm-semeru-runtimes:open-17-jdk-ubi), the second FROM will still pull icr.io/appcafe/ibm-semeru-runtimes:open-17-ea-jdk-ubi-amd64, which causes the test to fail.
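One way to cover both stages of a multi-stage Dockerfile is to rewrite every FROM line that names a Semeru image instead of only the first one. The sketch below is an illustration under assumptions and is not the code in pingPerf.sh: the function name `replace_semeru_from` is hypothetical, the sample Dockerfile is made up (including the `AS builder` stage name), and the replacement image is the local tag from the log.

```shell
#!/bin/sh
# Rewrite every FROM line whose image reference contains "ibm-semeru-runtimes",
# keeping any trailing "AS <stage>" clause intact. The result goes to stdout.
# Usage: replace_semeru_from <dockerfile> <new-image>
# Note: <new-image> must not contain '|' or '&', which are special to sed here.
replace_semeru_from() {
  sed -E "s|^(FROM[[:space:]]+)[^[:space:]]*ibm-semeru-runtimes[^[:space:]]*|\1$2|" "$1"
}

# Hypothetical two-stage Dockerfile resembling the situation described above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
FROM icr.io/appcafe/ibm-semeru-runtimes:open-17-jdk-ubi AS builder
RUN echo "stage one"
FROM icr.io/appcafe/ibm-semeru-runtimes:open-17-ea-jdk-ubi-amd64
RUN echo "stage two"
EOF

replace_semeru_from "$sample" localhost/local-ibm-semeru-runtimes:latest
rm -f "$sample"
```

Matching on the image name rather than on line position means a stage that is based on some other image is left alone.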
@ymanton do you know why Liberty uses two Semeru docker images?
At one point we were using the Semeru EA images for the Liberty beta images. But now, for https://github.com/OpenLiberty/ci.docker/pull/412, we want the Liberty UBI beta images based on icr.io/appcafe/ibm-semeru-runtimes:open-17-jdk-ubi, but it looks like there is an issue with the second FROM. It also needs to be updated to use icr.io/appcafe/ibm-semeru-runtimes:open-17-jdk-ubi.
Once the Liberty Dockerfile is finalized, we will update to use the releases version (not beta): https://github.com/OpenLiberty/ci.docker/blob/6b1c9dc9395ada7006da9fd0ebdb485602be98ea/releases/latest/full/Dockerfile.ubi.openjdk11
https://github.com/OpenLiberty/ci.docker/pull/412 has been updated to use the correct FROM (not to use the EA semeru image).
01:00:06.032 [ERROR ] CWWKF0001E: A feature definition could not be found for checkpoint-1.0
Do you still see this error? Previously we auto-configured this feature in our beta images, but we should no longer be doing that. Does the server.xml file you are using configure the checkpoint-1.0 Liberty feature? If so, you can stop doing that once we have the Liberty GA for InstantOn. We removed the need for this feature for Liberty InstantOn GA.
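A quick way to answer the server.xml question is to search the file for the feature entry. This is only a sketch: the function name `configures_checkpoint_feature` and the sample file are hypothetical, and it assumes the feature is declared in the usual `<feature>...</feature>` element form on a single line.

```shell
#!/bin/sh
# Return 0 if the given server.xml declares the checkpoint-1.0 feature.
# Usage: configures_checkpoint_feature <path-to-server.xml>
configures_checkpoint_feature() {
  grep -Eq '<feature>[[:space:]]*checkpoint-1\.0[[:space:]]*</feature>' "$1"
}

# Hypothetical server.xml used only to demonstrate the check.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<server>
    <featureManager>
        <feature>servlet-4.0</feature>
        <feature>checkpoint-1.0</feature>
    </featureManager>
</server>
EOF

if configures_checkpoint_feature "$sample"; then
  echo "server.xml configures checkpoint-1.0"
else
  echo "server.xml does not configure checkpoint-1.0"
fi
rm -f "$sample"
```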
Thanks @tjwatson. The PingPerf test passed with https://github.com/OpenLiberty/ci.docker/pull/412 (Grinder, Grinder).
Hi @tjwatson, I noticed that https://github.com/OpenLiberty/ci.docker/pull/412 is merged into the main branch. When will it be ported to the instanton branch? Or should we switch to using the main branch: https://github.com/OpenLiberty/ci.docker/blob/main/releases/latest/beta/Dockerfile.ubi.openjdk17?
Going forward, use the main branch. The instanton branch will be abandoned (maybe removed) at some point.
This issue is resolved. Thanks, everyone!
On ub22, thread Checkpoint failed: Internal build
On rhel9, exec container process /bin/sh: Exec format error: internal build
Noticed the same issue on JDK11 and JDK17.
Using dockerfile from OpenLiberty/ci.docker repo branch instanton with commit hash 3fbc2789ee736701f729febb747082ff9cbbd170
Using dockerfile from OpenLiberty/ci.docker repo branch instanton with commit hash ee87dfa7f7c7de01d12786aa71517fa8f4007883
Using dockerfile from OpenLiberty/ci.docker repo branch instanton with commit hash 3fbc2789ee736701f729febb747082ff9cbbd170, which is the same sha as the one that has issues above
https://github.com/OpenLiberty/ci.docker/compare/ee87dfa7f7c7de01d12786aa71517fa8f4007883...3fbc2789ee736701f729febb747082ff9cbbd170
We use releases/latest/beta/Dockerfile.ubi.openjdk17 from the https://github.com/OpenLiberty/ci.docker.git instanton branch