adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

aarch64 mac jdk11/18 : jdk_nio consistently failed on test-macstadium-macos11-arm64-2 #2682

Open sophia-guo opened 2 years ago

sophia-guo commented 2 years ago

aarch64 mac jdk11 : jdk_nio consistently failed on test-macstadium-macos11-arm64-2

Any other details: https://trss.adoptium.net/deepHistory?testId=62d7d4f9250c3c428ca02918 https://github.com/adoptium/aqa-tests/issues/3868#issuecomment-1190217628

sophia-guo commented 2 years ago

@sxa @Haroon-Khel

sxa commented 2 years ago

@sophia-guo I guess that since we have a machine it passes on, this is not critical for the release? Also, can you add a Grinder re-run link to the description template please?

smlambert commented 2 years ago

Added link above, and here: Rerun Failed target in Grinder on same machine

sophia-guo commented 2 years ago

@sxa not critical. Just a note that we might hit the same issue for other release versions.

sxa commented 2 years ago

Agreed. Although I think it's best that we take that penalty for this cycle and re-run as Grinders if required given the limited number of systems we have just now.

sxa commented 1 week ago

Referenced machine is no longer present in the CI and has been replaced with orka machines - re-running on those:

Noting that it's not clear from this issue what the original failure was.

sxa commented 1 week ago

Also noting that https://ci.adoptium.net/job/Test_openjdk11_hs_extended.openjdk_aarch64_mac/ seems to be passing jdk_nio consistently.

sxa commented 5 days ago

Some more tests:

sxa commented 5 days ago

Upstream issue showing a similar error with other tests: https://bugs.openjdk.org/browse/JDK-8144003 As per that bug it has been raised with Apple - Issue ID FB15368430

Based on that issue, here are other affected tests:

Grinder with all three tests:

Numbers in brackets are from curl -s https://ci.adoptium.net/job/Grinder/11840/consoleText | egrep '_PASSED|_FAILED' | grep -v echo | cut -d' ' -f2 | uniq -c and show where the passes and failures fall in the run, so e.g. 11/1/188 means that the first 11 passed, one failed, then 188 passed. Apart from the jdk24 one, most seem to have had a single failure towards the start of the run.
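For anyone wanting to reproduce the bracketed numbers, here is a minimal sketch that wraps the same pipeline in a function and joins the uniq -c counts with slashes to give the 11/1/188 style summary. The function name summarize_runs is hypothetical, and it assumes the result marker is the second space-separated field of each console line, as the cut in the command above implies. It uses grep -E in place of egrep, which behaves the same way.

```shell
# summarize_runs (hypothetical helper): read console text on stdin and print
# the lengths of consecutive identical results, joined by "/".
# Assumption: the *_PASSED / *_FAILED marker is field 2 of each line.
summarize_runs() {
  grep -E '_PASSED|_FAILED' \
    | grep -v echo \
    | cut -d' ' -f2 \
    | uniq -c \
    | awk '{ printf "%s%s", sep, $1; sep = "/" } END { print "" }'
}
```

It could then be fed from the Grinder console, for example: curl -s https://ci.adoptium.net/job/Grinder/11840/consoleText | summarize_runs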

sxa commented 5 days ago

Noting that due to other problems - most recently https://github.com/adoptium/temurin-build/issues/4058 - we haven't had a jdk11u/mac/aarch64 build for a while and so there is a lack of reliable recent history of this execution (I couldn't easily track down anything in TRSS for it)

sxa commented 1 day ago

Based on grep -ir 'exec format error' Test*mac* on the Jenkins server, this has been seen in:

All of them have java.net.SocketException: Resource busy (setsockopt failed) in the BasicMulticast test and java.net.SocketException: Exec format error (setsockopt failed) in the AdapterMulticasting one. The exception is the last two (note: only 2/3 of the jdk23 ones), where BasicMulticast passed ok.
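A rough sketch of how that cross-check could be scripted: for each log file under a directory, report whether either of the two setsockopt messages quoted above appears. The function name classify_logs and the output format are hypothetical, and the directory layout on the Jenkins server is not assumed here; pass whichever directory holds the logs.

```shell
# classify_logs DIR (hypothetical helper): for every file under DIR, report
# which of the two setsockopt failure messages it contains.
classify_logs() {
  dir=$1
  find "$dir" -type f | sort | while IFS= read -r f; do
    busy=no
    execfmt=no
    # -F matches the message literally, -i ignores case, -q suppresses output
    grep -qiF 'Resource busy (setsockopt failed)' "$f" && busy=yes
    grep -qiF 'Exec format error (setsockopt failed)' "$f" && execfmt=yes
    printf '%s resource_busy=%s exec_format_error=%s\n' "$f" "$busy" "$execfmt"
  done
}
```

A log showing exec_format_error=yes but resource_busy=no would correspond to the case noted above where BasicMulticast passed.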