adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

Set up Ubuntu 24.04 systems to be able to support it for Temurin #3501

Open sxa opened 2 months ago

sxa commented 2 months ago

Ubuntu 24.04 will be out later this month. We should look at our existing inventory and make some of these available. This will involve:

Haroon-Khel commented 2 months ago

Ubuntu 2404 static docker container online in jenkins https://ci.adoptium.net/computer/test-docker-ubuntu2404-x64-1/

Will run the aqa test pipeline on it shortly

EDIT: https://ci.adoptium.net/job/AQA_Test_Pipeline/237/console

Haroon-Khel commented 2 months ago

Identifying older systems that can be replaced with Ubuntu 24.04 without restricting testing.

With regard to replaceable static docker containers, at the moment we have 3 ubuntu 2004 x64 nodes, 9 arm64 ubuntu 2004 nodes, 1 ubuntu 1804 node, 6 arm32 ubuntu 2004 nodes and 3 ubuntu 2204 ppc64le nodes. I can remove one from each and replace it with an ubuntu 2404 node of that architecture, and make a brand new ubuntu 2404 s390x node on dockerhost-marist-ubuntu2204-s390x-1

sxa commented 2 months ago

Identifying older systems that can be replaced with Ubuntu 24.04 without restricting testing.

With regard to replaceable static docker containers, at the moment we have 3 ubuntu 2004 x64 nodes, 9 arm64 ubuntu 2004 nodes, 1 ubuntu 1804 node, 6 arm32 ubuntu 2004 nodes and 3 ubuntu 2204 ppc64le nodes. I can remove one from each and replace it with an ubuntu 2404 node of that architecture, and make a brand new ubuntu 2404 s390x node on dockerhost-marist-ubuntu2204-s390x-1

SGTM - we can start with that and then probably look at migrating more of the 2004 ones up given how many we have. I hadn't spotted that we don't even have 22.04 on arm32.

Haroon-Khel commented 2 months ago

https://ci.adoptium.net/computer/test-docker-ubuntu2404-armv8-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/239/console

https://ci.adoptium.net/computer/test-docker-ubuntu2404-s390x-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/240/console

Getting unusual errors while building the ppc64le and arm32 images

 > [ 7/25] RUN mkdir -p /usr/lib/jvm/jdk17 && tar -xpzf /tmp/jdk17.tar.gz -C /usr/lib/jvm/jdk17 --strip-components=1:                                                                                               
0.295 tar: conf/security/policy/unlimited: Cannot change mode to rwxr-xr-x: Operation not permitted                                                                                                                 
0.295 tar: conf/security/policy/limited: Cannot change mode to rwxr-xr-x: Operation not permitted                                                                                                                   
0.295 tar: conf/security/policy: Cannot change mode to rwxr-xr-x: Operation not permitted                                                                                                                           
0.295 tar: conf/security: Cannot change mode to rwxr-xr-x: Operation not permitted                                                                                                                                  
0.295 tar: conf/sdp: Cannot change mode to rwxr-xr-x: Operation not permitted
0.296 tar: conf/management: Cannot change mode to rwxr-xr-x: Operation not permitted
0.296 tar: conf: Cannot change mode to rwxr-xr-x: Operation not permitted
0.305 tar: legal/java.base: Cannot change mode to rwxr-xr-x: Operation not permitted
1.052 tar: jmods: Cannot change mode to rwxr-xr-x: Operation not permitted

Haroon-Khel commented 2 months ago

Got the remaining nodes up and running by building them manually

https://ci.adoptium.net/computer/test-docker-ubuntu2404-ppc64le-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/242/console

https://ci.adoptium.net/computer/test-docker-ubuntu2404-armv7l-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/241/console

sxa commented 2 months ago

Unfortunately we don't have an easy way of provisioning a 24.04 system on arm32 or ppc64le at the moment to identify whether this is specific to running in a container.

sxa commented 2 months ago

It looks like we're getting some interesting new errors in the sanity.openjdk run. Also, in the extended run we're getting quite a lot of failures, some of which are related to dpkg and so are likely related to the new version (I'm quite surprised that the tests are doing things at that OS-specific level, but apparently they do!):

 [09:14:04.910] Running dpkg
 [09:14:06.077] Command [PID: 1034473]:
     dpkg -S /lib/x86_64-linux-gnu/libbrotlidec.so.1
 [09:14:06.077] Output:
     dpkg-query: no path found matching pattern /lib/x86_64-linux-gnu/libbrotlidec.so.1
 [09:14:06.077] Returned: 1

 [09:14:06.078] java.io.IOException: Command [dpkg, -S, /lib/x86_64-linux-gnu/libbrotlidec.so.1] exited with 1 code

Based on this I think it's worth getting a "real" ubuntu 24.04 provisioned somewhere and verify that nothing we're seeing is the result of using an earlier kernel version with a newer container.

Haroon-Khel commented 2 months ago

OSUOSL's openstack and azure don't seem to have an ubuntu2404 image (yet)

sxa commented 2 months ago

This is where having the ESXi server was useful :-)

sxa commented 2 months ago

Will run the aqa test pipeline on it shortly EDIT: https://ci.adoptium.net/job/AQA_Test_Pipeline/237/console

To keep the machines busy over the UK bank holiday weekend I've kicked that off again with all the other versions at https://ci.adoptium.net/job/AQA_Test_Pipeline/243/

sxa commented 2 months ago

Got the remaining nodes up and running by building them manually

https://ci.adoptium.net/job/AQA_Test_Pipeline/242/console (ppc64le) https://ci.adoptium.net/job/AQA_Test_Pipeline/241/console (arm32)

Noting that those seem to be failing after the tarball extraction hits the same problem you had with the docker image creation

Haroon-Khel commented 2 months ago

I've marked both machines offline so they don't affect the weekend nightlies

sxa commented 2 months ago

I've just run the Dockerfile for s390x and ppc64le in an emulated docker on an arm64 Ubuntu 24.04 host and they both worked without any permission issues 🤔 So the docker Ubuntu images on those platforms are not fundamentally broken.
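
The thread does not record the exact commands used for the emulated build, so the following is only a sketch of one common approach: registering QEMU user-mode emulation through the `tonistiigi/binfmt` image and then building with `docker build --platform`. The helper names `run` and `build_emulated`, the `DRY_RUN` switch and the `ubuntu2404-test` image tag are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: run, build_emulated and the image tag are hypothetical names;
# the thread does not say which commands were actually used.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_emulated() {
    arch="$1"        # target architecture, e.g. s390x or ppc64le
    dockerfile="$2"  # path to the Dockerfile under test
    # Register QEMU user-mode emulation for the target architecture.
    run docker run --privileged --rm tonistiigi/binfmt --install "$arch"
    # Build the image for the foreign architecture from the current directory.
    run docker build --platform "linux/$arch" -f "$dockerfile" -t "ubuntu2404-test:$arch" .
}
```

Running with `DRY_RUN=1` first shows the two docker commands for a given architecture without touching the host.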

sxa commented 1 month ago

I'm doing some re-runs based on the "clean" (green) sanity.openjdk jobs that I was doing as part of https://github.com/adoptium/temurin-build/issues/3685#issuecomment-2075133310 (the x64 -cfi-vh row) with JDK22+35:

sxa commented 1 month ago

Ubuntu 2404 static docker container online in jenkins https://ci.adoptium.net/computer/test-docker-ubuntu2404-x64-1/

Will run the aqa test pipeline on it shortly

EDIT: https://ci.adoptium.net/job/AQA_Test_Pipeline/237/console

https://ci.adoptium.net/computer/test-docker-ubuntu2404-armv8-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/239/console https://ci.adoptium.net/computer/test-docker-ubuntu2404-s390x-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/240/console

On the basis of the reruns in the previous comment running clean, I'm going to re-run these (including extra versions) with ADOPTOPENJDK_BRANCH=v1.0.1-release so that the test material matches the product under test (SDK_RESOURCE=releases). These are being run with all versions: 17, 8, 22, 21, 11. The original JDK17 runs linked above took 29 hours (x64), 14 hours (aarch64) and 17 hours (s390x), so the first 2-3 versions should be mostly complete by Monday morning, which will hopefully give enough of a result set to determine whether there are likely to be any problems.

NOTE: The first 28 of the 46 s390x failures are the same as the 14 from aarch64 but run twice (different variants?). The rest seem to be similar failures, but were not run on the aarch64 system. Looking in more detail, it looks like they were excluded under a PR primarily intended for Win32.
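
For anyone wanting to repeat that kind of overlap check between two machines, here is a minimal sketch. It assumes each machine's failures have been saved to a text file with one test name per line; the helper names `common_failures` and `only_in_first` are made up for illustration. Using `sort -u` also collapses tests that were listed twice because they ran under more than one variant.

```shell
#!/bin/sh
# Sketch only: common_failures and only_in_first are hypothetical helper names.
# Each input file is assumed to hold one failing test name per line.

# Print the tests that appear in both failure lists.
common_failures() {
    a=$(mktemp) && b=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    LC_ALL=C comm -12 "$a" "$b"
    rm -f "$a" "$b"
}

# Print the tests that appear only in the first failure list.
only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Failures reported by `common_failures` are the more likely candidates for an OS-level cause, while those from `only_in_first` may simply reflect different exclude lists or machine-specific problems.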

sxa commented 1 month ago

OSUOSL's openstack and azure don't seem to have an ubuntu2404 image (yet)

We should look at provisioning a 22.04 system at OSUOSL (Maybe for Aarch64 and ppc64le both at OSUOSL) and then upgrading it to 24.04 within the OS and see if that makes a difference

sxa commented 1 month ago

Deep dive into the tools/jpackage failures which are occurring on 24.04 but not on 22.04:

On both machines dpkg -S /lib/aarch64-linux-gnu/libmd.so.0 is failing, although on the 22.04 machine it gives a summary of [14:00:33.690] Required packages: [libc6, zlib1g] but on 24.04 it has [14:00:38.972] Required packages: []

This is not considered a failure, but later on when it tries to build its own appcategorytest package it fails on:

OK, further investigation: all but five of the dpkg commands run on the Ubuntu 22.04 machine also give a non-zero return code. The ones that succeed are:

jiekang commented 1 month ago

I've seen fixes to the jpackage tests that reference specific quirks with new versions of platforms like Fedora. Is this failure similar? Can we fix the test upstream?

sxa commented 1 month ago

I've seen fixes to the jpackage tests that reference specific quirks with new versions of platforms like Fedora. Is this failure similar? Can we fix the test upstream?

I would expect so, yes.

sxa commented 1 month ago

Something like dpkg -S <blah> || dpkg -S $(realpath <blah>) is possibly enough to work around this problem - realpath seems to be in the base Ubuntu container images so it is likely safe. Having said that, from a quick search this code may be within the jpackage tool itself, so it may not be as simple as adding a construct like that into the test case...
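
As a rough sketch of that fallback in shell form (the function name `find_owning_package` is hypothetical and is not part of jpackage or the OpenJDK tests):

```shell
#!/bin/sh
# Sketch only: find_owning_package is a hypothetical helper name, not part of
# jpackage or the OpenJDK test suite.
# Ask dpkg about the path as given first; if that fails, retry with the
# fully resolved path (all symlinks followed) from realpath.
find_owning_package() {
    path="$1"
    dpkg -S "$path" 2>/dev/null || dpkg -S "$(realpath "$path")"
}

# Example on a Debian/Ubuntu system:
#   find_owning_package /lib/x86_64-linux-gnu/libbrotlidec.so.1
```

The first lookup's error output is discarded so that only the fallback's result or error is shown; whether an equivalent change is practical inside jpackage's own Java code is a separate question.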

jiekang commented 1 month ago

Okay, well at the least we can report this in a JBS bug upstream.

Haroon-Khel commented 1 month ago

OSUOSL's openstack and azure don't seem to have an ubuntu2404 image (yet)

Azure and OSUOSL now have ubuntu2404 images, so we can have actual vm (non static docker node) ubuntu 2404 machines up

Haroon-Khel commented 1 month ago

I've set up https://ci.adoptium.net/computer/test-osuosl-ubuntu2404-aarch64-1/ on osuosl's openstack, our first 'real' ubuntu2404 machine https://ci.adoptium.net/job/AQA_Test_Pipeline/263/console

Haroon-Khel commented 1 month ago

Test failures on https://ci.adoptium.net/computer/test-osuosl-ubuntu2404-aarch64-1/

JDK21 test failures

sanity openjdk

jdk_lang

TEST: java/lang/ProcessBuilder/JspawnhelperWarnings.java
TEST: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0
TEST: java/lang/Thread/virtual/JfrEvents.java
TEST: java/lang/Thread/virtual/TracePinnedThreads.java
TEST: java/lang/Thread/virtual/ThreadAPI.java#default

jdk_security1

TEST: java/security/Security/ConfigFileTest.java

jdk_util

TEST: jdk/internal/util/ReferencedKeyTest.java

jdk_foreign

TEST: java/foreign/TestLargeSegmentCopy.java

extended openjdk

hotspot_serviceability_jvmti

TEST: serviceability/jvmti/vthread/GetThreadState/GetThreadStateTest.java#default
TEST: serviceability/jvmti/vthread/GetThreadState/GetThreadStateTest.java#no-vmcontinuations
TEST: serviceability/jvmti/GetOwnedMonitorInfo/GetOwnedMonitorInfoTest.java
TEST: serviceability/jvmti/vthread/VThreadEventTest/VThreadEventTest.java

jvm_compiler

TEST: compiler/loopopts/TestRemixAddressExpressionsWithIrreducibleLoop.java
TEST: compiler/types/TestSubTypeCheckWithBottomArray.java#stress
TEST: compiler/types/TestSubTypeCheckWithBottomArray.java#Xbatch
TEST: compiler/types/TestSubTypeCheckWithBottomArray.java#Xcomp

jdk_other

TEST: com/sun/jndi/ldap/LdapSSLHandshakeFailureTest.java

jdk_net

TEST: java/net/httpclient/whitebox/SSLFlowDelegateTestDriver.java
TEST: java/net/httpclient/HttpInputStreamAvailableTest.java

jdk_security3

TEST: javax/net/ssl/TLSv13/EngineOutOfSeqCCS.java
TEST: jdk/security/logging/RecursiveEventHelper.java
TEST: sun/security/ssl/SSLEngineImpl/TestBadDNForPeerCA.java
TEST: sun/security/ssl/SSLEngineImpl/TestBadDNForPeerCA12.java

jdk_tools

TEST: tools/jpackage/linux/AppAboutUrlTest.java#id0
TEST: tools/jpackage/linux/AppCategoryTest.java
TEST: tools/jpackage/linux/LinuxBundleNameTest.java
TEST: tools/jpackage/linux/LinuxResourceTest.java
TEST: tools/jpackage/linux/jdk/jpackage/tests/UsrTreeTest.java
TEST: tools/jpackage/linux/MaintainerTest.java
TEST: tools/jpackage/linux/PackageDepsTest.java
TEST: tools/jpackage/linux/ReleaseTest.java#id0
TEST: tools/jpackage/linux/ShortcutHintTest.java#id0
TEST: tools/jpackage/share/AddLShortcutTest.java
TEST: tools/jpackage/share/AddLauncherTest.java#id1
TEST: tools/jpackage/share/AppImagePackageTest.java
TEST: tools/jpackage/share/AppContentTest.java
TEST: tools/jpackage/share/EmptyFolderPackageTest.java
TEST: tools/jpackage/share/LicenseTest.java#id1
TEST: tools/jpackage/share/FileAssociationsTest.java#id0
TEST: tools/jpackage/share/PerUserCfgTest.java
TEST: tools/jpackage/share/ServiceTest.java
TEST: tools/launcher/RunpathTest.java
TEST: tools/launcher/Settings.java

jdk_jfr

TEST: jdk/jfr/jcmd/TestJcmdDump.java

jdk_jdi

TEST: com/sun/jdi/FinalizerTest.java

JDK 17 test failures

sanity openjdk

jdk_lang

TEST: java/lang/invoke/MethodHandleProxies/Driver.java

jdk_security1

TEST: java/security/Security/ConfigFileTest.java

jdk_util

TEST: java/util/Locale/LanguageSubtagRegistryTest.java
TEST: java/util/Locale/LSRDataTest.java

extended openjdk

jdk_other

TEST: com/sun/jndi/ldap/LdapSSLHandshakeFailureTest.java

jdk_net

TEST: java/net/httpclient/HttpInputStreamAvailableTest.java

jdk_nio

TEST: java/nio/file/spi/SetDefaultProvider.java

jdk_security3

TEST: javax/net/ssl/DTLS/DTLSWontNegotiateV10.java
TEST: sun/security/ssl/SSLContextImpl/SSLContextDefault.java
TEST: jdk/security/logging/RecursiveEventHelper.java

jdk_tools

TEST: tools/jlink/plugins/CDSPluginTest.java
TEST: tools/jpackage/linux/AppAboutUrlTest.java#id0
TEST: tools/jpackage/linux/AppCategoryTest.java
TEST: tools/jpackage/linux/LinuxBundleNameTest.java
TEST: tools/jpackage/linux/LinuxResourceTest.java
TEST: tools/jpackage/linux/LinuxWeirdOutputDirTest.java
TEST: tools/jpackage/linux/jdk/jpackage/tests/UsrTreeTest.java
TEST: tools/jpackage/linux/MaintainerTest.java
TEST: tools/jpackage/linux/PackageDepsTest.java
TEST: tools/jpackage/linux/ReleaseTest.java
TEST: tools/jpackage/linux/ShortcutHintTest.java#id0
TEST: tools/jpackage/share/AppImagePackageTest.java
TEST: tools/jpackage/share/AddLauncherTest.java#id1
TEST: tools/jpackage/share/EmptyFolderPackageTest.java
TEST: tools/jpackage/share/LicenseTest.java#id1
TEST: tools/jpackage/share/FileAssociationsTest.java#id0
TEST: tools/launcher/Settings.java

jdk_jfr

TEST: jdk/jfr/event/gc/detailed/TestGCCPUTimeEvent.java#Parallel
TEST: jdk/jfr/event/gc/detailed/TestGCCPUTimeEvent.java#G1
TEST: jdk/jfr/event/gc/detailed/TestGCCPUTimeEvent.java#Serial
TEST: jdk/jfr/event/metadata/TestLookForUntestedEvents.java
TEST: jdk/jfr/event/runtime/TestActiveSettingEvent.java

Haroon-Khel commented 1 month ago

Set up a new x64 ubuntu2404 machine in azure https://ci.adoptium.net/computer/test-azure-ubuntu2404-x64-1/

https://ci.adoptium.net/job/AQA_Test_Pipeline/266/console

Haroon-Khel commented 3 weeks ago

test-azure-ubuntu2404-x64-1 jdk21 test failures

extended openjdk

hotspot_serviceability_jvmti

TEST: serviceability/jvmti/vthread/GetThreadState/GetThreadStateTest.java#default
TEST: serviceability/jvmti/vthread/GetThreadState/GetThreadStateTest.java#no-vmcontinuations
TEST: serviceability/jvmti/GetOwnedMonitorInfo/GetOwnedMonitorInfoTest.java
TEST: serviceability/jvmti/vthread/VThreadEventTest/VThreadEventTest.java

jvm_compiler

TEST: compiler/loopopts/superword/TestLargeScaleAndStride.java#AlignVector
TEST: compiler/loopopts/TestRemixAddressExpressionsWithIrreducibleLoop.java
TEST: compiler/rangechecks/TestLargeScaleInLongRCOverflow.java
TEST: compiler/types/TestSubTypeCheckWithBottomArray.java#stress
TEST: compiler/types/TestSubTypeCheckWithBottomArray.java#Xbatch
TEST: compiler/types/TestSubTypeCheckWithBottomArray.java#Xcomp

jdk_other

TEST: com/sun/jndi/ldap/LdapSSLHandshakeFailureTest.java

jdk_net

TEST: java/net/httpclient/whitebox/SSLFlowDelegateTestDriver.java
TEST: java/net/httpclient/HttpInputStreamAvailableTest.java

jdk_security3

TEST: javax/net/ssl/TLSv13/EngineOutOfSeqCCS.java
TEST: sun/security/ssl/SSLEngineImpl/TestBadDNForPeerCA.java
TEST: sun/security/ssl/SSLEngineImpl/TestBadDNForPeerCA12.java
TEST: jdk/security/logging/RecursiveEventHelper.java

29 jdk_tools failures

jdk_jfr

TEST: jdk/jfr/jcmd/TestJcmdDump.java

jdk_jdi

TEST: com/sun/jdi/FinalizerTest.java

sanity openjdk

jdk_lang

TEST: java/lang/ProcessBuilder/JspawnhelperWarnings.java
TEST: java/lang/String/CompactString/NegativeSize.java
TEST: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0
TEST: java/lang/Thread/virtual/JfrEvents.java
TEST: java/lang/Thread/virtual/TracePinnedThreads.java
TEST: java/lang/Thread/virtual/ThreadAPI.java#default

jdk_security1

TEST: java/security/Security/ConfigFileTest.java

jdk_util

TEST: jdk/internal/util/ReferencedKeyTest.java

sxa commented 3 weeks ago

Ref the last comment it would be good to have a similar summary for the docker container run mentioned in https://github.com/adoptium/infrastructure/issues/3501#issuecomment-2090349452 before the jobs for those disappear :-)

Additional note, I've split out another issue - linked in the description - for covering the creation/update of dockerhost Ubuntu 24.04 systems so that this issue can be focused on getting Ubuntu 24.04 into a sufficiently good state that we can be confident that we can start supporting it as an official docker image.

Haroon-Khel commented 3 weeks ago

test-docker-ubuntu2404-x64-1 jdk 17 failures

sanity openjdk

jdk_lang
 java/lang/String/CompactString/NegativeSize.java.NegativeSize
 java/lang/invoke/MethodHandleProxies/Driver.java.Driver
jdk_security1
 java/security/Security/ConfigFileTest.java.ConfigFileTest
jdk_util
 java/util/Locale/LSRDataTest.java.LSRDataTest
 java/util/Locale/LanguageSubtagRegistryTest.java.LanguageSubtagRegistryTest

extended openjdk

jdk_net
 java/net/httpclient/HttpInputStreamAvailableTest.java.HttpInputStreamAvailableTest
jdk_nio
java/nio/file/spi/SetDefaultProvider.java.SetDefaultProvider
jdk_security3
 jdk/security/logging/RecursiveEventHelper.java.RecursiveEventHelper
 sun/security/ssl/SSLContextImpl/SSLContextDefault.java.SSLContextDefault
 javax/net/ssl/DTLS/DTLSWontNegotiateV10.java.DTLSWontNegotiateV10

jdk_tools 26 failures

jdk_jfr
 jdk/jfr/event/gc/detailed/TestGCCPUTimeEvent.java#G1.TestGCCPUTimeEvent_G1
 jdk/jfr/event/gc/detailed/TestGCCPUTimeEvent.java#Parallel.TestGCCPUTimeEvent_Parallel
 jdk/jfr/event/gc/detailed/TestGCCPUTimeEvent.java#Serial.TestGCCPUTimeEvent_Serial
 jdk/jfr/event/metadata/TestLookForUntestedEvents.java.TestLookForUntestedEvents
 jdk/jfr/event/runtime/TestActiveSettingEvent.java.TestActiveSettingEvent

To note, I suspect running some of these jobs with the v1.0.1-release branch would solve some of these test failures

Haroon-Khel commented 3 weeks ago

JDK17 aqa pipeline using v1.0.1-release branch

test-azure-ubuntu2404-x64-1

With the v1.0.1-release branch, we are only seeing failures in the extended openjdk tests

jdk_tools
Failed test cases: 
TEST: tools/jpackage/linux/AppCategoryTest.java
TEST: tools/jpackage/linux/AppAboutUrlTest.java#id0
TEST: tools/jpackage/linux/LinuxResourceTest.java
TEST: tools/jpackage/linux/LinuxBundleNameTest.java
TEST: tools/jpackage/linux/MaintainerTest.java
TEST: tools/jpackage/linux/PackageDepsTest.java
TEST: tools/jpackage/linux/jdk/jpackage/tests/UsrTreeTest.java
TEST: tools/jpackage/linux/ReleaseTest.java
TEST: tools/jpackage/linux/ShortcutHintTest.java#id0
TEST: tools/jpackage/share/jdk/jpackage/tests/BasicTest.java
TEST: tools/jpackage/share/jdk/jpackage/tests/VendorTest.java#id1
TEST: tools/jpackage/share/AppImagePackageTest.java
TEST: tools/jpackage/share/EmptyFolderPackageTest.java
TEST: tools/jpackage/share/AddLauncherTest.java#id1
TEST: tools/jpackage/share/InstallDirTest.java#id0
TEST: tools/jpackage/share/FileAssociationsTest.java#id0
TEST: tools/jpackage/share/IconTest.java
TEST: tools/jpackage/share/LicenseTest.java#id1
TEST: tools/jpackage/share/LicenseTest.java#id0
TEST: tools/jpackage/share/MultiLauncherTwoPhaseTest.java
TEST: tools/jpackage/share/SimplePackageTest.java
TEST: tools/jpackage/share/RuntimePackageTest.java#id0
TEST: tools/jpackage/share/MultiNameTwoPhaseTest.java
Test results: passed: 239; failed: 23

sxa commented 2 weeks ago

Raised issue referenced above for the jpackage failures which are definitely a result of a change on the Ubuntu side rendering the openjdk tests incompatible with 24.04 for now.

sxa commented 2 weeks ago

JDK17 aqa pipeline using v1.0.1-release branch

test-azure-ubuntu2404-x64-1 - https://ci.adoptium.net/job/AQA_Test_Pipeline/268/console test-docker-ubuntu2404-x64-1

Since the machine is now idle, can we get some other pipelines kicked off with other versions? Since the goal here is to have this as an official docker image for all JDK versions, we need to be confident that we know about any issues before we can declare it as good. (We have the issue for the jpackage failures, which are known, but we need to know if there is anything else that might show up.)

Haroon-Khel commented 2 weeks ago

JDK 8 11 21 using the v1.0.1-release branch

test-azure-ubuntu2404-x64-1

test-docker-ubuntu2404-x64-1

test-docker-ubuntu2404-armv7-1

test-docker-ubuntu2404-armv8-1

test-docker-ubuntu2404-s390x-1

test-osuosl-ubuntu2404-aarch64-1

sxa commented 2 weeks ago

Noting that we also have https://github.com/adoptium/infrastructure/issues/3598 to migrate all of the Ubuntu 23.10 systems (which will be out of support next month) on riscv64 to the LTS Ubuntu 24.04 release