adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

New Machine requirement UBI static docker containers #3586

Closed sxa closed 2 months ago

sxa commented 3 months ago

I need to request a new machine:

Please explain what this machine is needed for:

We ship and support ubi9-minimal docker containers with Temurin as official dockerhub images, but we currently do not perform any testing on UBI. There is one UBI8 image in jenkins, but it is hosted on the skytap dockerhost machine, which is offline because there are insufficient credits available to run it, so we are not testing on any UBI version just now. Since this is an official docker image that we ship, we are running quite a risk by not testing on it regularly.

The initial support for UBI8 images was created in https://github.com/adoptium/infrastructure/issues/2532 - this should be expanded to cover ubi9 (minimal version if possible, and if not ensure we understand why) and enable ubi8 again.

Related (other OSs)

sxa commented 3 months ago

FYI @Haroon-Khel this is the issue I mentioned earlier today

sxa commented 3 months ago

I did a little bit of work today to test on UBI in order to compare with the Amazon Linux results, so I've got these Dockerfiles that work (but may need a little more "tidying" before we make them live):

Dockerfiles.tar.gz

Haroon-Khel commented 3 months ago

x64 ubi9 container up https://ci.adoptium.net/computer/test-docker-ubi9-x64-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/277/console (jdk21 v1.0.1-release branch)

Haroon-Khel commented 3 months ago

s390x and arm64 images can't install libnss3.so

1.962 No match for argument: libnss3.so
1.989 Error: Unable to find a match: libnss3.so

And even though the x64 image couldn't find libnss3.so, it didn't cause an error during building.

Haroon-Khel commented 3 months ago

https://ci.adoptium.net/computer/test-docker-ubi9-armv8l-1/ without libnss3.so

https://ci.adoptium.net/job/AQA_Test_Pipeline/278/console (jdk21 v1.0.1-release branch)

Haroon-Khel commented 3 months ago

https://ci.adoptium.net/computer/test-docker-ubi9-s390x-1/ without libnss3.so

https://ci.adoptium.net/job/AQA_Test_Pipeline/279/console (jdk21 v1.0.1-release branch)

Haroon-Khel commented 3 months ago

The ppc64le ubi9 image is failing to install the last line of dependencies

 > [28/30] RUN dnf install -y git make gcc xorg-x11-server-Xvfb libXrender libXi libXtst fontconfig fakeroot procps-ng hostname diffutils:                                                                          
0.574 Updating Subscription Management repositories.                                                                                                                                                                
0.575 Unable to read consumer identity                                                                                                                                                                              
0.579 
0.579 This system is not registered with an entitlement server. You can use subscription-manager to register.
0.579 
10.85 CentOS Stream 9 - BaseOS                        393  B/s | 3.9 kB     00:10    
10.85 Errors during downloading metadata for repository 'baseos':
10.85   - Downloading successful, but checksum doesn't match. Calculated: 53823a7862af9565f586a1bca3e3e31582fd80f4429eda9a4bcb5bdf9369ecce281ef67aa6b356db4cb9653e210b487cdfdfc28718fe805a1f1ccf915d05e039(sha512)  Expected: 21b9e356f319100ed0373db6e66e1cfb275a907b82f0eb74ee18bcd25b95445a6c1a319dea65383bbc691375eab05838802b383a35fda74c1f49612c1c737fd1(sha512) 
10.86 Error: Failed to download metadata for repo 'baseos': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried

sxa commented 3 months ago

s390x and arm64 images can't install libnss3.so

1.962 No match for argument: libnss3.so
1.989 Error: Unable to find a match: libnss3.so

And even though the x64 image couldn't find libnss3.so, it didn't cause an error during building.

/usr/lib64/libnss3.so is supplied by the nss package, so that entry in the dockerfile probably shouldn't be there. I'm a bit surprised that the same UBI version handles this condition differently between architectures, but 🤷🏻
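
Since the x64 build carried on even though dnf could not resolve libnss3.so, it may help to scan saved build output for the "No match for argument" lines quoted above rather than relying on the build failing. This is a minimal sketch, assuming the Docker build log has been captured as text; the function name `unmatched_packages` is hypothetical and not part of any existing tooling in this repo.

```python
import re

# Matches lines such as "1.962 No match for argument: libnss3.so".
# The leading number is the Docker build step timer and is treated as optional.
NO_MATCH = re.compile(r"^\s*(?:\d+\.\d+\s+)?No match for argument:\s*(\S+)\s*$")


def unmatched_packages(log_text):
    """Return the names dnf could not resolve, in order of first appearance."""
    seen = []
    for line in log_text.splitlines():
        match = NO_MATCH.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample taken from the s390x/arm64 output quoted in this issue.
sample_log = """\
1.962 No match for argument: libnss3.so
1.989 Error: Unable to find a match: libnss3.so
"""

print(unmatched_packages(sample_log))
```

Any name this reports is a candidate for removal from the dockerfile or for replacement with the package that actually ships the file (nss in this case).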

Haroon-Khel commented 3 months ago

x64 ubi9 container up https://ci.adoptium.net/computer/test-docker-ubi9-x64-1/ https://ci.adoptium.net/job/AQA_Test_Pipeline/277/console (jdk21 v1.0.1-release branch)

All tests passed

https://ci.adoptium.net/computer/test-docker-ubi9-armv8l-1/ without libnss3.so

https://ci.adoptium.net/job/AQA_Test_Pipeline/278/console (jdk21 v1.0.1-release branch)

The only failure javax/imageio/plugins/png/ReadPngRGBImageWithTRNSChunk.java

https://ci.adoptium.net/computer/test-docker-ubi9-s390x-1/ without libnss3.so

https://ci.adoptium.net/job/AQA_Test_Pipeline/279/console (jdk21 v1.0.1-release branch)

Only failure jdk/jfr/jcmd/TestJcmdDump.java

Haroon-Khel commented 3 months ago

ppc64le ubi9 container is up https://ci.adoptium.net/computer/test-docker–ubi9-ppc64le-1/ . Getting this error when updating openssl-libs

Error: Transaction test error:
  file /usr/lib64/ossl-modules/fips.so from install of openssl-libs-1:3.2.2-1.el9.ppc64le conflicts with file from package openssl-fips-provider-3.0.7-2.el9.ppc64le

https://ci.adoptium.net/job/AQA_Test_Pipeline/288/console (jdk21 v1.0.1-release branch)
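
For triage, the file conflict reported above can be split into the conflicting path, the package being installed, and the package that already owns the file. This is a rough sketch that only assumes the message wording shown in this comment; the helper name `parse_conflict` is hypothetical.

```python
import re

# Pattern based solely on the wording of the error quoted in this issue.
CONFLICT = re.compile(
    r"file (?P<path>\S+) from install of (?P<incoming>\S+) "
    r"conflicts with file from package (?P<installed>\S+)"
)


def parse_conflict(line):
    """Return the path and the two package names, or None if the line does not match."""
    match = CONFLICT.search(line)
    return match.groupdict() if match else None


line = (
    "  file /usr/lib64/ossl-modules/fips.so from install of "
    "openssl-libs-1:3.2.2-1.el9.ppc64le conflicts with file from package "
    "openssl-fips-provider-3.0.7-2.el9.ppc64le"
)

info = parse_conflict(line)
print(info["path"], info["incoming"], info["installed"])
```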

Haroon-Khel commented 3 months ago

https://ci.adoptium.net/job/AQA_Test_Pipeline/288/console (jdk21 v1.0.1-release branch)

https://ci.adoptium.net/computer/test-docker–ubi9-ppc64le-1/ ubi9 ppc64le jdk21 failures v1.0.1-release branch

compiler/rtm/locking/TestRTMSpinLoopCount.java
jdk/jfr/jcmd/TestJcmdDump.java
com/sun/jdi/FinalizerTest.java

Haroon-Khel commented 3 months ago

https://ci.adoptium.net/computer/test-docker-ubi8-ppc64le-1/ is up https://ci.adoptium.net/job/AQA_Test_Pipeline/289/console jdk21 v1.0.1-release

update: failing tests:

sxa commented 3 months ago

https://ci.adoptium.net/job/AQA_Test_Pipeline/288/console (jdk21 v1.0.1-release branch)

ubi9 ppc64le jdk21 failures v1.0.1-release branch

Can we try and get JDK17 and 8 kicked off over the weekend too? I'm expecting it to show the same failures as Amazon Linux 2023 but it would be good to be able to see if that's true in practice :-)

sxa commented 3 months ago

https://ci.adoptium.net/job/AQA_Test_Pipeline/288/console (jdk21 v1.0.1-release branch) ubi9 ppc64le jdk21 failures v1.0.1-release branch

compiler/rtm/locking/TestRTMSpinLoopCount.java

Interesting - I don't think I've seen that one as a regular failure, so it likely needs some investigating - I think the other two are known / intermittent.

Haroon-Khel commented 3 months ago

https://ci.adoptium.net/computer/test-docker-ubi8-x64-1/ is up https://ci.adoptium.net/job/AQA_Test_Pipeline/290/console (jdk21 v1.0.1-release branch)

Update: extended openjdk jdk/jfr/jcmd/TestJcmdDump.java only failed test

Haroon-Khel commented 3 months ago

Can we try and get JDK17 and 8 kicked off over the weekend too? I'm expecting it to show the same failures as Amazon Linux 2023 but it would be good to be able to see if that's true in practice :-)

Yep, I just want to get a single jdk version test pipeline done before running the rest over the weekend

Haroon-Khel commented 3 months ago

https://ci.adoptium.net/computer/test-docker-ubi8-armv8-1/ is up https://ci.adoptium.net/job/AQA_Test_Pipeline/291/console jdk21 v1.0.1-release branch

Update: Only 1 failing test jdk/jfr/jcmd/TestJcmdDump.java

Haroon-Khel commented 3 months ago

AQA test pipelines for the UBI8 and UBI9 machines: JDK 8, 11 and 17, v1.0.1-release branch

test-docker-ubi9-x64-1

test-docker-ubi9-s390x-1

test-docker-ubi9-armv8l-1

test-docker-ubi9-ppc64le-1

test-docker-ubi8-x64-1

test-docker-ubi8-armv8-1

test-docker-ubi8-ppc64le-1

sxa commented 3 months ago

For consideration in terms of reporting: https://github.com/adoptium/infrastructure/issues/3600 This issue is a good example as it has lots of runs so it would be good to think about how best to summarise the information.

Haroon-Khel commented 3 months ago

JDK8, 11 and 17, v1.0.1-release branch: failures only

test-docker-ubi9-x64-1

JDK8 extended openjdk java/net/Inet6Address/B6206527.java java/net/ipv6tests/B6521014.java

java/nio/file/Files/probeContentType/Basic.java

jdk_security3

java/rmi/activation/rmidViaInheritedChannel/InheritedChannelNotServerSocket.java java/rmi/activation/rmidViaInheritedChannel/RmidViaInheritedChannel.java

JDK8 sanity openjdk sun/security/krb5/auto/rcache_usemd5.sh

JDK11 extended system ConcurrentLoadTest_5m (usually intermittent)

JDK11 extended openjdk jdk_security3 java/time/test/java/time/format/TestUTCParse.java

JDK17 sanity system TestJlmRemoteMemoryAuth (usually intermittent)

JDK17 extended openjdk java/beans/XMLEncoder/java_awt_CardLayout.java

test-docker-ubi9-s390x-1

JDK11 extended openjdk jdk_security3 java/time/test/java/time/format/TestUTCParse.java

JDK17 extended perf renaissance-movie-lens (usually intermittent)

test-docker-ubi9-armv8l-1

JDK8 extended openjdk java/net/Inet6Address/B6206527.java java/net/ipv6tests/B6521014.java

java/nio/file/Files/probeContentType/Basic.java

sun/security/tools/keytool/autotest.sh

JDK11 extended openjdk java/time/test/java/time/format/TestUTCParse.java

test-docker-ubi9-ppc64le-1

JDK8 sanity openjdk sun/security/krb5/auto/rcache_usemd5.sh

JDK8 extended openjdk java/nio/file/Files/probeContentType/Basic.java jdk_security3

JDK11 extended openjdk jdk_security3 java/time/test/java/time/format/TestUTCParse.java

JDK17 extended openjdk compiler/rtm/locking/TestRTMAbortThreshold.java

test-docker-ubi8-x64-1

JDK8 sanity openjdk sun/security/krb5/auto/rcache_usemd5.sh

JDK8 extended openjdk java/net/Inet6Address/B6206527.java java/net/ipv6tests/B6521014.java

sun/security/pkcs11/fips/TestTLS12.java

JDK11 extended openjdk java/time/test/java/time/format/TestUTCParse.java

JDK17 extended openjdk java/beans/PropertyEditor/TestFontClassJava.java

test-docker-ubi8-armv8-1

JDK8 extended openjdk java/net/Inet6Address/B6206527.java java/net/ipv6tests/B6521014.java

JDK11 extended openjdk java/time/test/java/time/format/TestUTCParse.java

test-docker-ubi8-ppc64le-1

JDK8 sanity openjdk sun/security/krb5/auto/rcache_usemd5.sh

JDK8 extended openjdk java/net/Inet6Address/B6206527.java java/net/ipv6tests/B6521014.java sun/security/pkcs11/fips/TestTLS12.java

JDK11 extended openjdk java/time/test/java/time/format/TestUTCParse.java tools/jmod/hashes/HashesOrderTest.java jdk/jfr/event/runtime/TestThreadCpuTimeEvent.java

JDK17 extended openjdk compiler/rtm/locking/TestRTMAbortThreshold.java

Haroon-Khel commented 3 months ago

Grouping all of the common consistent failures:

Known issue https://github.com/adoptium/infrastructure/issues/2949 java/net/Inet6Address/B6206527.java java/net/ipv6tests/B6521014.java

Known issue https://github.com/adoptium/infrastructure/issues/2631 java/nio/file/Files/probeContentType/Basic.java

jdk_security3

sun/security/krb5/auto/rcache_usemd5.sh sxa: Related: https://github.com/adoptium/infrastructure/issues/3521, https://github.com/adoptium/aqa-tests/issues/5300#issuecomment-2115874523, and referenced for JDK11 in https://github.com/adoptium/aqa-tests/issues/3065

java/time/test/java/time/format/TestUTCParse.java sxa: Referenced in multiple triage issues such as https://github.com/adoptium/aqa-tests/issues/5232#issue-2248064207 - needs issue raising.

sun/security/tools/keytool/autotest.sh - sxa: Seems common on newer distributions
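
The grouping above was done by hand. One possible way to automate it is to invert the per-machine failure lists and keep only the tests that fail on more than one machine. This is a minimal sketch, not an existing tool: the function name `common_failures` and the `min_machines` threshold are assumptions, and the sample data is only a subset (three machines) of the failures listed earlier in this issue.

```python
from collections import defaultdict


def common_failures(failures_by_machine, min_machines=2):
    """Map each test to the sorted machines it failed on, keeping tests seen on >= min_machines."""
    machines_by_test = defaultdict(set)
    for machine, tests in failures_by_machine.items():
        for test in tests:
            machines_by_test[test].add(machine)
    return {
        test: sorted(machines)
        for test, machines in sorted(machines_by_test.items())
        if len(machines) >= min_machines
    }


# Subset of the JDK8/11/17 failures reported above (three of the seven machines).
failures = {
    "test-docker-ubi9-s390x-1": [
        "jdk_security3",
        "java/time/test/java/time/format/TestUTCParse.java",
        "renaissance-movie-lens",
    ],
    "test-docker-ubi9-armv8l-1": [
        "java/net/Inet6Address/B6206527.java",
        "java/net/ipv6tests/B6521014.java",
        "java/nio/file/Files/probeContentType/Basic.java",
        "sun/security/tools/keytool/autotest.sh",
        "java/time/test/java/time/format/TestUTCParse.java",
    ],
    "test-docker-ubi8-armv8-1": [
        "java/net/Inet6Address/B6206527.java",
        "java/net/ipv6tests/B6521014.java",
        "java/time/test/java/time/format/TestUTCParse.java",
    ],
}

for test, machines in common_failures(failures).items():
    print(f"{test}: {', '.join(machines)}")
```

Raising `min_machines` narrows the output to failures that are more likely to be distribution-wide rather than specific to one architecture.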

sxa commented 2 months ago

@Haroon-Khel Is there any additional work needed here or do we have enough live UBI machines now?

Haroon-Khel commented 2 months ago

The machines are live, this can be closed