Open Haroon-Khel opened 2 years ago
ERROR: Cannot delete workspace :Malformed input or input contains unmappable characters https://github.com/adoptium/infrastructure/issues/2630
ERROR: Cannot delete workspace :Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
Two recent runs: https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console
That directory is being used by leftover jcmd.exe processes
https://github.com/adoptium/infrastructure/issues/2635 is related. It is surprising to find that this is occurring on a different machine this time
sun/tools/jinfo/Basic.sh on the 2 linux ppc64le machines has been resolved, https://github.com/adoptium/infrastructure/issues/2625#issuecomment-1181699740
Stewart has added jcmd to the list of processes to kill: https://ci.adoptopenjdk.net/view/Tooling/job/SXA-processCheck/, https://github.com/adoptium/infrastructure/issues/2635#issuecomment-1184611082
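As a rough illustration of the clean-up described above, here is a minimal Python sketch that picks leftover processes out of a Windows `tasklist` listing and force-kills them. It is not the SXA-processCheck job's actual implementation, which is not shown in this thread; the `LEFTOVER_NAMES` set and both function names are hypothetical.

```python
import csv
import io
import subprocess

# Assumption: image names to treat as leftovers; only jcmd is named in this thread.
LEFTOVER_NAMES = {"jcmd.exe"}


def find_leftovers(tasklist_csv, names=LEFTOVER_NAMES):
    """Return PIDs of processes whose image name is in `names`.

    `tasklist_csv` is the text printed by `tasklist /FO CSV /NH`,
    where each row starts with the image name followed by the PID.
    """
    wanted = {n.lower() for n in names}
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) >= 2 and row[0].lower() in wanted and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def kill_leftovers(names=LEFTOVER_NAMES):
    """Windows only: list processes, then force-kill the matching ones."""
    listing = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_leftovers(listing, names):
        # check=False: the process may have exited since the listing was taken
        subprocess.run(["taskkill", "/F", "/PID", str(pid)], check=False)
```

Running `kill_leftovers()` before the workspace wipe would release the lock on `lib\modules` only if the leftover process is the one holding it.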
https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/648/console https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/646/console
[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is disabled by the job configuration...
ERROR: Cannot delete workspace :Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
[Pipeline] }
[Pipeline] // timeout
[Pipeline] echo
Exception: hudson.AbortException: Cannot delete workspace: Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
All three recent jobs were assigned to this machine, and all three failed on it.
Fixed as per https://github.com/adoptium/infrastructure/issues/2209#issuecomment-1185341489
Machines that are still problematic:
Any fedora dockerstatic container, ref https://github.com/adoptium/infrastructure/issues/2631. Any fedora container on https://ci.adoptopenjdk.net/computer/docker-packet-ubuntu2004-intel-1/ will pass ipv6 tests, while those on https://ci.adoptopenjdk.net/computer/docker-packet-ubuntu2004-amd-1/ will fail them. The difference needs to be investigated. I can't get java/nio/file/Files/probeContentType/Basic.java to pass on any Fedora container, see https://github.com/adoptium/infrastructure/issues/2631#issuecomment-1185690200
test-osuosl-centos74-ppc64le-1 and -2 sun/tools/jinfo/Basic.sh now passes, but sun/security/pkcs11/fips/TestTLS12.java still fails. See https://github.com/adoptium/infrastructure/issues/2625#issuecomment-1181699740
test-azure-win2012r2-x64-3 and test-azure-win2019-x64-1, see https://github.com/adoptium/infrastructure/issues/2645#issuecomment-1177334175. Failures are intermittent, but there are more failures than passes.
If these issues are not resolved by Monday, I'll take the Jenkins nodes offline for the release.
I was able to get java/nio/file/Files/probeContentType/Basic.java to pass on our fedora boxes, see https://github.com/adoptium/infrastructure/issues/2631#issuecomment-1188992683, however I have not solved the failing ipv6 tests on fedora containers hosted on https://ci.adoptopenjdk.net/computer/docker-packet-ubuntu2004-amd-1/.
And sun/security/pkcs11/fips/TestTLS12.java continues to fail on test-osuosl-centos74-ppc64le-1 and -2, see https://github.com/adoptium/infrastructure/issues/2625
I have temporarily taken the following nodes offline for this release:
https://ci.adoptopenjdk.net/computer/test-docker-fedora34-x64-1/ https://ci.adoptopenjdk.net/computer/test-docker-fedora34-x64-2/ https://ci.adoptopenjdk.net/computer/test-docker-fedora36-x64-1/ https://ci.adoptopenjdk.net/computer/test-osuosl-centos74-ppc64le-1/ https://ci.adoptopenjdk.net/computer/test-osuosl-centos74-ppc64le-2/
I've turned these machines back online
@Haroon-Khel Can you give a status update on the systems that were problematic? Have they all been resolved, or is there still work to do here? We need to know whether this can be closed or whether it needs to move to October.
Since sun/security/pkcs11/fips/TestTLS12.java continues to fail on test-osuosl-centos74-ppc64le-1 and -2 this issue should be kept open
IPv6 failures on the new ppc64le machine, https://github.com/adoptium/infrastructure/issues/2883:
test-docker-ubuntu2204-ppc64le-1
test-docker-debian11-ppc64le-1
Could also affect:
test-docker-ubuntu2204-ppc64le-2
test-docker-debian11-ppc64le-2
test-docker-debian11-ppc64le-3
https://github.com/adoptium/infrastructure/blob/952a0bddf784ddcae519661daf975f1abb693ec4/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/roles/DockerStatic/tasks/main.yml#L6 has run on the machines during setup. Annoyingly, it isn't fixing the problem.
https://github.com/adoptium/infrastructure/issues/2884 affects the same machines
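Since the IPv6 failures differ between hosts, a quick probe run inside each container may help compare a passing one with a failing one. The Python sketch below is only a diagnostic under stated assumptions: it reads the Linux `disable_ipv6` sysctl file and tries to bind a socket to `::1`. The function names are hypothetical, and a passing probe does not guarantee that the jdk_net IPv6 tests will pass.

```python
import socket
from pathlib import Path

# Linux sysctl file; a value of "1" means IPv6 is disabled for all interfaces.
DISABLE_IPV6 = Path("/proc/sys/net/ipv6/conf/all/disable_ipv6")


def ipv6_disabled_by_sysctl(path=DISABLE_IPV6):
    """Return True/False from the sysctl file, or None if it cannot be read."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


def ipv6_loopback_usable():
    """Return True if an IPv6 TCP socket can be bound to ::1."""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::1", 0))  # port 0 lets the OS pick a free port
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("disable_ipv6 sysctl set:", ipv6_disabled_by_sysctl())
    print("can bind ::1:", ipv6_loopback_usable())
```

Comparing the two printed values between a container on the intel host and one on the amd host would at least show whether the difference is visible at the socket level.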
Taking test-docker-centos8-x64-2 offline, ref https://github.com/adoptium/infrastructure/issues/2886
Taking offline the following machines due to https://github.com/adoptium/infrastructure/issues/2884
test-docker-ubuntu2204-ppc64le-1 test-docker-debian11-ppc64le-1 test-docker-ubuntu2204-ppc64le-2 test-docker-debian11-ppc64le-2
test-docker-ubi8-x64-2 and test-docker-fedora35-x64-1 both offline ref https://github.com/adoptium/infrastructure/issues/2882
ref https://github.com/adoptium/infrastructure/issues/2885 test-ibmcloud-win2012r2-x64-1 offline
These need to be addressed https://github.com/adoptium/adoptium/issues/200#issuecomment-1402131107
test-docker-ubi8-x64-2 and test-docker-fedora35-x64-1 both offline ref https://github.com/adoptium/infrastructure/issues/2882
Closed off https://github.com/adoptium/infrastructure/issues/2882
A quick summary
https://ci.adoptium.net/computer/test-docker-centos8-x64-2/ is offline due to https://github.com/adoptium/infrastructure/issues/2886
https://ci.adoptium.net/computer/test-docker-ubuntu2204-ppc64le-1/ and https://ci.adoptium.net/computer/test-docker-ubuntu2204-ppc64le-2/ are offline due to failing ipv6 tests, https://github.com/adoptium/infrastructure/issues/2949 and https://github.com/adoptium/infrastructure/issues/2884
https://ci.adoptium.net/computer/test-docker-ubuntu2204-x64-2 is offline due to https://github.com/adoptium/infrastructure/issues/2894#issuecomment-1467953191
I've kept https://ci.adoptium.net/computer/test-docker-ubi8-x64-1 and https://ci.adoptium.net/computer/test-docker-fedora35-x64-1 online, as only one jdk_net test fails on both: https://github.com/adoptium/infrastructure/issues/3010
https://ci.adoptium.net/computer/test-ibmcloud-win2012r2-x64-1/ is offline due to https://github.com/adoptium/infrastructure/issues/2885#issuecomment-1385912369
@Haroon-Khel There seem to be quite a few test-docker machines that are in jenkins but not live, for example https://ci.adoptium.net/manage/computer/test%2Ddocker%2Dfedora37%2Darmv8%2D1/ - should they be removed from jenkins now? That one in particular seems to be the latest Fedora version, so I'm a little surprised if it has been removed.
Also we've been having some inconsistencies on test issues in https://github.com/adoptium/infrastructure/issues/2536 across different mac machines.
extended.perf dacapo-xalan-0 success varies depending on the machine: https://github.com/adoptium/aqa-tests/issues/3122#issuecomment-1787036636
Summary of AQA triage on s390x jdk-21.0.1+12.1 https://github.com/temurin-compliance/temurin-compliance/issues/431#issuecomment-1810092968 (ongoing)
MiniMix_aot_5m_0, DBBLoadTest_5m_0, DBBLoadTest_5m_1 intermittently pass on all machines, but fail consistently on test-marist-sles12-s390x-2 and test-marist-sles15-s390x-2
java/foreign/TestLargeSegmentCopy.java from jdk_foreign fails on test-marist-rhel8-s390x-2, test-marist-rhel7-s390x-2, test-marist-sles15-s390x-2
The following sanity system tests fail intermittently on all machines, but seem to fail consistently on test-marist-sles15-s390x-2
TestJlmRemoteClassAuth_1
TestJlmRemoteClassAuth_0
TestJlmRemoteClassNoAuth_0
TestJlmRemoteClassNoAuth_1
TestJlmRemoteMemoryAuth_0
TestJlmRemoteMemoryAuth_1
TestJlmRemoteMemoryNoAuth_0
TestJlmRemoteMemoryNoAuth_1
TestJlmRemoteNotifierProxyAuth_0
TestJlmRemoteNotifierProxyAuth_1
TestJlmRemoteThreadAuth_0
TestJlmRemoteThreadAuth_1
TestJlmRemoteThreadNoAuth_0
TestJlmRemoteThreadNoAuth_1
NioLoadTest_5m_0
NioLoadTest_5m_1
The remaining failures below, from extended openjdk, are being run on all machines (grinders 8060 to 8067)
jdk_other_0 jdk_net_0 jdk_net_1 jdk_nio_0 jdk_nio_1 jdk_security3_0 jdk_security3_1 jdk_management_0 jdk_jmx_1 jdk_tools_0 jdk_tools_1 jdk_jfr_0 jdk_rmi_0 jdk_jdi_0
Grinder | Machine | Time | Status |
---|---|---|---|
8060 | test-ubuntu2004-1 | 18h32 | ✅ |
8061 | test-sles15-2 | 40h | ABORTED after 40h (jdk_security_x = 7h each). Rerun as 8077 (next row) |
8077 | test-sles15-2 | | No jdk_security_x, 345 failures [*] |
8062 | test-rhel7-2 | 40h | ABORTED after 40h (jdk_security_x = 7h each). Rerun as 8078 (next row) |
8078 | test-rhel7-2 | | No jdk_security_x, 340 failures [*] |
8063 | test-ubuntu2204-1 | 28h | 14 failures (mostly timeouts). Re-run of failed targets: 13 failures, including multicast |
8064 | docker-sles12-1 | 17h11 | 1 failure: com.sun.jdi.FinalizerTest (re-run of jdk_jdi_0: same failure) |
8065 | test-rhel8-2 | 15h25 | 2 failures, both in java.net.HttpClient (re-run of jdk_net_0/1: 1 failure, UdpSocket) |
8066 | test-sles12-2 | 40h | ABORTED after 40h (jdk_security_x = 7h each). Rerun as 8079 (next row) |
8079 | test-sles12-2 | | No jdk_security_x, 345 failures [*] |
8067 | docker-sles15-1 | 17h09 | 1 failure: sun.security.ssl.SSLSocketImpl (re-run of jdk_security3_0: PASS) |

[*] The 340/345 failing tests include many that fail with something similar to: Exception creating connection to: 148.100.74.92; nested exception is: java.net.NoRouteToHostException: No route to host
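To separate the network-related failures from the rest during triage, one option is to tally which tests mention the exception quoted in the footnote above. The Python sketch below is a hedged example, not an existing AQA tool: it assumes each test's output in the console log starts with a jtreg-style `TEST: <name>` line, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the exception quoted in the footnote above.
NO_ROUTE = re.compile(r"java\.net\.NoRouteToHostException")
# Assumption: a "TEST: <name>" line marks the start of each test's output.
TEST_LINE = re.compile(r"^TEST: (\S+)")


def tally_no_route(log_text):
    """Count, per test, how many log lines mention NoRouteToHostException."""
    counts = Counter()
    current = None
    for line in log_text.splitlines():
        match = TEST_LINE.match(line)
        if match:
            current = match.group(1)
        elif current and NO_ROUTE.search(line):
            counts[current] += 1
    return counts
```

Tests that appear in the tally are candidates for a machine or network configuration problem rather than a product defect.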
Data from the October CPU AQA triage can be found here: https://docs.google.com/spreadsheets/d/16vAQvYzL_-azDoD5OhQ6lObD3-suJwqKfjtABuWoIkc/edit#gid=1601438678
This has a summary sheet, and a sheet for JDK Version with a list of: suite failures, action taken, and if applicable, problematic machine and failure type.
This list should be used to help drive individual actions to improve test infrastructure and reduce the number of re-runs due to machine configuration related issues. The rows that have a 'Bad Machine' and 'Failure Type' listed should be investigated first.
There is also a list of 'To Investigate' topics in each JDK Version sheet that may not necessarily be machine configuration issues, but look promising to me to understand and resolve. When I get more cycles, I intend to open separate, individual issues for these in the appropriate repos.
JDK17
test-docker-ubuntu2004-armv8l-3
TEST: java/beans/PropertyChangeSupport/Test4682386.java
TEST: java/beans/PropertyEditor/TestFontClassJava.java
TEST: java/beans/PropertyEditor/TestFontClassValue.java
TEST: java/beans/XMLEncoder/javax_swing_DefaultCellEditor.java
TEST: java/beans/XMLEncoder/javax_swing_JTree.java
TEST: java/beans/XMLEncoder/Test4631471.java
TEST: java/beans/XMLEncoder/Test4903007.java
TEST: javax/imageio/plugins/shared/ImageWriterCompressionTest.java
Installed fontconfig, rerunning https://ci.adoptium.net/view/Test_grinder/job/Grinder/8281/console. Passes ✅
Need to install fontconfig everywhere
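A quick way to audit which machines still lack fontconfig could look like the Python sketch below. It is a minimal check under one assumption: the presence of the `fc-list` command on PATH is used as a proxy for the fontconfig package being installed. The function name is hypothetical.

```python
import shutil


def has_fontconfig(which=shutil.which):
    """Return True if the fontconfig command-line tools are on PATH.

    Assumption: `fc-list` being present stands in for the fontconfig
    package being installed. `which` is injectable for testing.
    """
    return which("fc-list") is not None


if __name__ == "__main__":
    print("fontconfig present:", has_fontconfig())
```

Running this on each test machine would give a simple yes/no list to drive the installs.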
test-docker-ubuntu2010-armv8l-2
TEST: javax/imageio/plugins/shared/ImageWriterCompressionTest.java
Unable to install fontconfig on Ubuntu 2010
Err:1 http://ports.ubuntu.com/ubuntu-ports groovy/main arm64 fonts-dejavu-core all 2.37-2
404 Not Found [IP: 185.125.190.39 80]
E: Failed to fetch http://ports.ubuntu.com/ubuntu-ports/pool/main/f/fonts-dejavu/fonts-dejavu-core_2.37-2_all.deb 404 Not Found [IP: 185.125.190.39 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
...
root@93d2b4e13a22:~# apt-get update
Ign:1 http://ports.ubuntu.com/ubuntu-ports groovy InRelease
Ign:2 http://ports.ubuntu.com/ubuntu-ports groovy-updates InRelease
Ign:3 http://ports.ubuntu.com/ubuntu-ports groovy-backports InRelease
Ign:4 http://ports.ubuntu.com/ubuntu-ports groovy-security InRelease
Err:5 http://ports.ubuntu.com/ubuntu-ports groovy Release
404 Not Found [IP: 185.125.190.39 80]
Err:6 http://ports.ubuntu.com/ubuntu-ports groovy-updates Release
404 Not Found [IP: 185.125.190.39 80]
Err:7 http://ports.ubuntu.com/ubuntu-ports groovy-backports Release
404 Not Found [IP: 185.125.190.39 80]
Err:8 http://ports.ubuntu.com/ubuntu-ports groovy-security Release
404 Not Found [IP: 185.125.190.39 80]
Reading package lists... Done
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-updates Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-backports Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-security Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
Looks like the repo is no longer there, likely due to Ubuntu 2010 being EOL.
Update: This machine has been replaced with https://ci.adoptium.net/computer/test-docker-ubuntu2310-armv8l-1/. AQA test pipeline running on this machine: https://ci.adoptium.net/job/AQA_Test_Pipeline/202/console
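To spot other containers pointing at retired repositories, the `apt-get update` output can be scanned for the error shown above. The Python sketch below only parses text that has already been captured and runs nothing itself; the function name is hypothetical.

```python
import re

# Matches the apt error quoted above, capturing the repository URL and suite.
NO_RELEASE = re.compile(
    r"^E: The repository '(\S+) (\S+) Release' no longer has a Release file\.",
    re.MULTILINE,
)


def retired_suites(apt_output):
    """Return (url, suite) pairs that apt reports as having no Release file."""
    return NO_RELEASE.findall(apt_output)
```

A non-empty result suggests the distribution release has been moved out of the main archive, as appears to have happened here.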
test-docker-sles12-s390x-1
TEST: java/beans/PropertyChangeSupport/Test4682386.java
TEST: java/beans/PropertyEditor/TestFontClassJava.java
TEST: java/beans/PropertyEditor/TestFontClassValue.java
TEST: java/beans/XMLEncoder/javax_swing_DefaultCellEditor.java
TEST: java/beans/XMLEncoder/javax_swing_JTree.java
TEST: java/beans/XMLEncoder/Test4631471.java
TEST: java/beans/XMLEncoder/Test4903007.java
Installed fontconfig-devel, rerunning https://ci.adoptium.net/view/Test_grinder/job/Grinder/8293/
test-marist-ubuntu2204-s390x-1
TEST: sun/management/jdp/JdpDefaultsTest.java
TEST: sun/management/jdp/JdpJmxRemoteDynamicPortTest.java
TEST: sun/management/jdp/JdpSpecificAddressTest.java
test-docker-fedora33-ppc64le-1
test-skytap-ubuntu2004-ppc64le-1
compiler/rtm/locking/TestRTMAbortThreshold.java
intermittently fails with
TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Actual abort ratio (1002) should lower or equal to specified (0).: expected that 1002 <= 0
Passed 2 out of 5 times.
JDK21
test-docker-centos8-x64-1
TEST: java/lang/ProcessHandle/InfoTest.java
TEST: java/lang/reflect/Proxy/ClassRestrictions.java
TEST: java/lang/runtime/SwitchBootstrapsTest.java
TEST: java/lang/ScopedValue/UnboundValueAfterOOME.java
TEST: java/lang/String/RegionMatches.java
TEST: java/lang/System/LoggerFinder/RecursiveLoading/PlatformRecursiveLoadingTest.java
TEST: java/lang/System/LoggerFinder/RecursiveLoading/RecursiveLoadingTest.java
TEST: java/lang/System/LoggerFinder/SignedLoggerFinderTest/SignedLoggerFinderTest.java
test-docker-debian11-ppc64le-2
https://ci.adoptium.net/view/Test_grinder/job/Grinder/8346/console
jdk_management_0
TEST: sun/management/jmxremote/bootstrap/CustomLauncherTest.java
TEST: sun/management/jmxremote/bootstrap/LocalManagementTest.java
jdk_tools_1
TEST: com/sun/tools/attach/BasicTests.java
TEST: com/sun/tools/attach/TempDirTest.java
TEST: sun/jvmstat/monitor/MonitoredVm/TestPollingInterval.java
TEST: sun/tools/jcmd/TestJcmdDefaults.java
TEST: sun/tools/jcmd/TestJcmdSanity.java
TEST: sun/tools/jinfo/JInfoTest.java
TEST: sun/tools/jps/TestJps.java
TEST: sun/tools/jps/TestJpsSanity.java
TEST: sun/tools/jstat/JStatInterval.java
TEST: tools/jlink/JLinkDedupTestBatchSizeOne.java
TEST: sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java
jdk_jfr_0
TEST: jdk/jfr/api/consumer/streaming/TestBaseRepositoryAfterStart.java
TEST: jdk/jfr/api/consumer/streaming/TestBaseRepositoryLastModified.java
test-docker-debian11-ppc64le-1
TEST: java/util/concurrent/LinkedTransferQueue/WhiteBox.java
TEST: jdk/internal/util/ArchTest.java
List of likely machine-related test issues which I'm going to stop bumping between iterations, so we can track them based on their attachment to this issue:
RHEL/CentOS*:
AIX:
Linux/s390x:
List of test failures on JDK8/arm32 (at a minimum), including the perf test suites, which are failing in the containerised environments on the arm64 hosts but are ok on the two physical ODROID machines:
https://github.com/adoptium/infrastructure/issues/3043
Stuff identified during April 2024 dry runs:
test-docker-fedora34-x64-1 and (newly created) test-docker-fedora34-x64-2 ref https://github.com/adoptium/infrastructure/issues/2631 JDK8
The following tests are failing on both -1 and -2. Links are for -2:
java/nio/file/Files/probeContentType/Basic.java https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5133/console
java/net/Inet6Address/B6206527.java.B6206527 https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5135/console
java/net/ipv6tests/B6521014.java https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5136/console
test-osuosl-centos74-ppc64le-1/ and test-osuosl-centos74-ppc64le-2/ ref https://github.com/adoptium/infrastructure/issues/2625 JDK8
On test-osuosl-centos74-ppc64le-1
sun/security/pkcs11/fips/TestTLS12.java https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5092/console
sun/tools/jinfo/Basic.sh https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5094/console
On test-osuosl-centos74-ppc64le-2
sun/security/pkcs11/fips/TestTLS12.java https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5095/console
~sun/tools/jinfo/Basic.sh~ resolved https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5103/console
test-azure-win2012r2-x64-3 and test-azure-win2019-x64-1 ref https://github.com/adoptium/infrastructure/issues/2645 JDK11