Closed Haroon-Khel closed 3 months ago
Iirc, EOSSP stands for end of Service Pack support, not end of support. Not sure if/when it became impossible to purchase AIX 7.1, but it is still supported beyond 30 April (might need an extended support contract). This has been a discussion I had recently at another project that was also concerned re: AIX 7.1 support status.
Our decision was to not update anything further for AIX 7.1, but leave available all AIX 7.1 based products – at least for the coming year (or maybe two).
If you want ALL 7.1 support to disappear – just come up with a plan (all 6 at once, 3 now, 3 later, etc.).
Regards,
Michael
https://www.ibm.com/support/pages/lifecycle/search/?q=AIX%20Standard%20Edition explicitly lists "End of support" as the end of this month - the same as the EoSSP date although I accept that IBM may wish to support a customer for longer if they pay them more money (although that option is not listed in the table)
@aixtools Regarding a plan: Yes, we should come up with one :-) I guess one important question is what the status of AIX 7.3 is - can we now get those deployed as well at OSUOSL as it would be good to have some testing on that platform as well.
We'll see how things go during this release cycle. Based on the end of support dates and some discussions with IBM, we have chosen to switch the current LTS lines to building on AIX 7.2. If that goes without problems (and we don't get too much backlash from anyone), my intention would be to replace all of the AIX 7.1 machines, although maybe leave a couple for a month or so in case we need to go back.
So I'd definitely want to go with one extra 7.2 build machine, and two 7.3 test machines if feasible. Then we can decide what to do with the others :-) Sound reasonable?
AIX 7.3 is available. Perhaps start with one at TL0 and one at TL1.
There are currently some issues getting patches (I have AIX 7.3 TL1 at SP0, and I would expect issues with SP1). And I only have TL0 at SP1 iirc.
As to build on AIX 7.2 - I forget what TLSP the current AIX 7.2 build is at. But just as you were building on AIX 7.1 TL4 SP4 for a long time (aka base level), I need to know what AIX 7.2 base level you are considering.
Or think - build date.
I suggest rebuilding two of the AIX 7.1.5 systems to AIX 7.2 to make the expected 'build+test' chain more robust. And, considering that test-[1,2] of AIX 7.2 are adopt[03,04], let's use adopt[05,06].
> I suggest rebuilding two of the AIX 7.1.5 systems to AIX 7.2 to make the expected 'build+test' chain more robust. And, considering that test-[1,2] of AIX 7.2 are adopt[03,04], let's use adopt[05,06].
Sounds good to me! For reference: adopt05 = test-osuosl-aix715-ppc64-1 adopt06 = test-osuosl-aix715-ppc64-2
So for now, updating those two will still leave us with test machines 3 and 4 and the two build machines in case we need to revert.
We should probably look at updating adopt02 (build-osuosl-aix71-ppc64-2) as well, since I'd rather have two AIX 7.2 and one AIX 7.1 dedicated to build at the moment, but that can be after 05 and 06 are upgraded.
For AIX 7.3 would it make sense to try one of 05 or 06 as 7.3 instead of 7.2 so we can trial that version?
> As to build on AIX 7.2 - I forget what TLSP the current AIX 7.2 build is at.
I think the ones we have labelled for build are a bit of a mix at the moment ... @sej-jackson @AdamBrousseau @backwaterred tagging you in for more input - any recommendations here for which AIX 7.2 TL/SP to build Temurin on? I know it was an issue with certain levels for 7.1, but are there any compatibility issues with earlier AIX 7.2 versions if we just pick the latest to build on?
I don't think we targeted a particular 7.2 level. We have a mix depending on the farm: 7200-05-03-2135 and 7200-05-04-2220. Likely the .3 just hasn't been updated to .4 as it should have been.
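For anyone comparing levels across farms, here is a minimal Python sketch that splits a level string into its parts. It assumes the strings follow the `oslevel -s` layout (release, technology level, service pack, YYWW build stamp); `AixLevel` and `parse_oslevel` are hypothetical names for illustration, not part of any existing tooling.

```python
import re
from typing import NamedTuple


class AixLevel(NamedTuple):
    release: str           # e.g. "7.2"
    technology_level: int  # TL
    service_pack: int      # SP
    build: str             # YYWW build stamp, kept as a string


# Assumed layout: <major><minor>00-<TL>-<SP>-<YYWW>, e.g. 7200-05-03-2135
_LEVEL_RE = re.compile(r"^(\d)(\d)00-(\d{2})-(\d{2})-(\d{4})$")


def parse_oslevel(level: str) -> AixLevel:
    """Split a level string such as '7200-05-03-2135' into its fields."""
    match = _LEVEL_RE.match(level.strip())
    if match is None:
        raise ValueError(f"unrecognised AIX level string: {level!r}")
    major, minor, tl, sp, build = match.groups()
    return AixLevel(f"{major}.{minor}", int(tl), int(sp), build)


# The two levels mentioned in this thread.
levels = [parse_oslevel(s) for s in ("7200-05-03-2135", "7200-05-04-2220")]
# Tuples compare field by field, so min() picks the lowest TL/SP.
oldest = min(levels)
print(oldest)
```

Taking the minimum across all build machines would give a candidate "base level" in the sense discussed above, although which level to actually standardise on is still a decision for the team.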
adopt02 has been reinstalled to support AIX 7.2 build (clone of adopt10, build-...-aix72-...-1).
See adoptium/aqa-tests#3068 and adoptium/aqa-tests#3069
> I don't think we targeted a particular 7.2 level.
Agreed. The issue that comes to mind for me is this one which means a minimum compiler version of xlc 16.1, and transitively a minimum os version. So, any 7.2 should be ok as far as I am aware.
Two of the aix715 test machines (-3 and -4) have now been reimaged with 72 as per https://github.com/adoptium/infrastructure/pull/3081
The last two aix715 systems have been recommissioned (as test-osuosl-aix72-ppc64-{5,6}). These need to be refreshed by the playbooks and have their jdk_boot settings verified, and then be relaunched via ci.adoptium.net.
Noting that after https://github.com/adoptium/ci-jenkins-pipelines/pull/622 we were building on AIX 7.2 so the builds will not run on AIX 7.1.
If anyone needs an (insecure) version that will run on AIX 7.1 they can be retrieved here:
So the only outstanding AIX 7.1 system that we have is build-osuosl-aix71-ppc64-1 which is currently disabled but we can hold onto that for a short time longer in the (unlikely) event that we need to rebuild anything on it.
Or perhaps we should just take a mksysb of it and/or redeploy as a WPAR on another host ...
"adopt02" (build-osuosl-aix72-ppc64-2) appears to be having an identity crisis.
https://ci.adoptium.net/job/Test_openjdk17_hs_extended.openjdk_ppc64_aix_testList_1/77/consoleText
On the one hand various tests seem to think they're running on "adopt02", whereas the /etc/hosts file mentions "adopt10".
[2023-06-25T14:35:31.240Z] stderr: [Exception in thread "main" java.lang.Error: Error on performing network operation
[2023-06-25T14:35:31.240Z] at compiler.compilercontrol.share.actions.BaseAction.communicate(BaseAction.java:105)
[2023-06-25T14:35:31.240Z] at compiler.compilercontrol.share.actions.BaseAction.main(BaseAction.java:59)
[2023-06-25T14:35:31.240Z] Caused by: java.net.UnknownHostException: adopt02: adopt02: Hostname and service name not provided or found
[2023-06-25T14:35:31.241Z] at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1671)
[2023-06-25T14:35:31.241Z] at compiler.compilercontrol.share.actions.BaseAction.communicate(BaseAction.java:90)
[2023-06-25T14:35:31.241Z] ... 1 more
Could this be connected to this machine's reprovisioning?
P.S. One potential source of "adopt02" on this machine can be found in the uname output:
uname : AIX adopt02 2 7 00FA74164C00 powerpc AIX
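The failing call in the trace is `InetAddress.getLocalHost()`, which needs the machine's own hostname to resolve. A rough Python analogue of that check is sketched below, so the condition can be tested on a host without running a JDK test. `local_hostname_resolves` is a hypothetical helper, and the `resolver` parameter exists only so the lookup can be swapped out; this is an approximation of the Java behaviour, not a reimplementation of it.

```python
import socket
from typing import Callable, Optional


def local_hostname_resolves(
    hostname: Optional[str] = None,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if the given (or local) hostname resolves to an address."""
    name = hostname if hostname is not None else socket.gethostname()
    try:
        resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return False
    return True


if __name__ == "__main__":
    name = socket.gethostname()
    status = "resolves" if local_hostname_resolves(name) else "does NOT resolve"
    print(f"{name} {status}")
```

On a machine in the state described above, where the system name is "adopt02" but /etc/hosts only mentions "adopt10", I would expect this to report that the name does not resolve, assuming no DNS entry covers it either.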
I have removed ci.role.test from the build boxes to prevent the hostname issue, but the above will still need to be fixed on the machine :-)
Strange: adopt10 (from which the clone came) is the only system WITHOUT adopt10
root@osunim:[/root]dsh-adopt "grep adopt /etc/hosts"
adopt01:
10.1.0.7 adopt01
10.1.0.7 adopt01
10.1.0.8 adopt02
10.1.0.12 adopt03
10.1.0.16 adopt04 # test-osuosl-aix72-ppc64-2
10.1.0.4 adopt05
10.1.0.5 adopt06
10.1.0.182 adopt07
10.1.0.183 adopt08
10.1.0.176 adopt09
10.1.0.177 adopt10
140.211.9.10 p8-aix1-adopt01 # build-osuosl-ppc64-aix71-1
140.211.9.12 p8-aix1-adopt02 # build-osuosl-aix71-ppc64-2
140.211.9.28 p8-aix1-adopt03
140.211.9.36 p8-aix1-adopt04
140.211.9.99 p9-aix1-adopt05
140.211.9.100 p9-aix1-adopt06
140.211.9.168 p8-java1-adopt07
140.211.9.169 p8-java1-adopt08
140.211.9.163 p8-java1-adopt09
140.211.9.166 p8-java1-adopt10
2605:bc80:3010:104::8cd3:90a p8-aix1-adopt01
2605:bc80:3010:104::8cd3:90c p8-aix1-adopt02
2605:bc80:3010:104::8cd3:91c p8-aix1-adopt03
2605:bc80:3010:104::8cd3:924 p8-aix1-adopt04
2605:bc80:3010:104::8cd3:963 p9-aix1-adopt05
2605:bc80:3010:104::8cd3:964 p9-aix1-adopt06
2605:bc80:3010:104::8cd3:9a8 p8-java1-adopt07
2605:bc80:3010:104::8cd3:9a9 p8-java1-adopt08
2605:bc80:3010:104::8cd3:9a3 p8-java1-adopt09
2605:bc80:3010:104::8cd3:9a6 p8-java1-adopt10
140.211.9.10 build-osuosl-aix71-ppc64-1 build-osuosl-aix71-ppc64-1.adoptopenjdk.net
==============
adopt02:
10.1.0.177 adopt10 adopt10
140.211.9.166 adopt10
140.211.9.12 build-osuosl-aix72-ppc64-2 build-osuosl-aix72-ppc64-2.adoptopenjdk.net
140.211.9.12 build-osuosl-aix72-ppc64-2 build-osuosl-aix72-ppc64-2.adoptopenjdk.net
==============
adopt03:
10.1.0.7 adopt01
10.1.0.7 adopt01
10.1.0.8 adopt02
10.1.0.12 adopt03
10.1.0.16 adopt04 # test-osuosl-aix72-ppc64-2
10.1.0.4 adopt05
10.1.0.5 adopt06
10.1.0.182 adopt07
10.1.0.183 adopt08
10.1.0.176 adopt09
10.1.0.177 adopt10
140.211.9.10 p8-aix1-adopt01 # build-osuosl-ppc64-aix71-1
140.211.9.12 p8-aix1-adopt02 # build-osuosl-aix71-ppc64-2
140.211.9.28 test-osuosl-aix72-ppc64-1 test-osuosl-aix72-ppc64-1.adoptopenjdk.net
140.211.9.36 p8-aix1-adopt04
140.211.9.99 p9-aix1-adopt05
140.211.9.100 p9-aix1-adopt06
140.211.9.168 p8-java1-adopt07
140.211.9.169 p8-java1-adopt08
140.211.9.163 p8-java1-adopt09
140.211.9.166 p8-java1-adopt10
2605:bc80:3010:104::8cd3:90a p8-aix1-adopt01
2605:bc80:3010:104::8cd3:90c p8-aix1-adopt02
140.211.9.28 test-osuosl-aix72-ppc64-1 test-osuosl-aix72-ppc64-1.adoptopenjdk.net
2605:bc80:3010:104::8cd3:924 p8-aix1-adopt04
2605:bc80:3010:104::8cd3:963 p9-aix1-adopt05
2605:bc80:3010:104::8cd3:964 p9-aix1-adopt06
2605:bc80:3010:104::8cd3:9a8 p8-java1-adopt07
2605:bc80:3010:104::8cd3:9a9 p8-java1-adopt08
2605:bc80:3010:104::8cd3:9a3 p8-java1-adopt09
2605:bc80:3010:104::8cd3:9a6 p8-java1-adopt10
==============
adopt04:
10.1.0.7 adopt01
10.1.0.7 adopt01
10.1.0.8 adopt02
10.1.0.12 adopt03
10.1.0.16 adopt04 # test-osuosl-aix72-ppc64-2
10.1.0.4 adopt05
10.1.0.5 adopt06
10.1.0.182 adopt07
10.1.0.183 adopt08
10.1.0.176 adopt09
10.1.0.177 adopt10
140.211.9.10 p8-aix1-adopt01 # build-osuosl-ppc64-aix71-1
140.211.9.12 p8-aix1-adopt02 # build-osuosl-aix71-ppc64-2
140.211.9.28 p8-aix1-adopt03
140.211.9.36 test-osuosl-aix72-ppc64-2 test-osuosl-aix72-ppc64-2.adoptopenjdk.net
140.211.9.99 p9-aix1-adopt05
140.211.9.100 p9-aix1-adopt06
140.211.9.168 p8-java1-adopt07
140.211.9.169 p8-java1-adopt08
140.211.9.163 p8-java1-adopt09
140.211.9.166 p8-java1-adopt10
2605:bc80:3010:104::8cd3:90a p8-aix1-adopt01
2605:bc80:3010:104::8cd3:90c p8-aix1-adopt02
2605:bc80:3010:104::8cd3:91c p8-aix1-adopt03
140.211.9.36 test-osuosl-aix72-ppc64-2 test-osuosl-aix72-ppc64-2.adoptopenjdk.net
2605:bc80:3010:104::8cd3:963 p9-aix1-adopt05
2605:bc80:3010:104::8cd3:964 p9-aix1-adopt06
2605:bc80:3010:104::8cd3:9a8 p8-java1-adopt07
2605:bc80:3010:104::8cd3:9a9 p8-java1-adopt08
2605:bc80:3010:104::8cd3:9a3 p8-java1-adopt09
2605:bc80:3010:104::8cd3:9a6 p8-java1-adopt10
==============
adopt05:
10.1.0.177 adopt10 adopt10
140.211.9.166 adopt10
10.1.0.182 adopt07
10.1.0.4 adopt05
140.211.9.99 test-osuosl-aix72-ppc64-5 test-osuosl-aix72-ppc64-5.adoptopenjdk.net
==============
adopt06:
10.1.0.177 adopt10 adopt10
140.211.9.166 adopt10
10.1.0.182 adopt07
10.1.0.5 adopt06
140.211.9.100 adopt06
==============
adopt07:
10.1.0.177 adopt10 adopt10
140.211.9.166 adopt10
140.211.9.168 test-osuosl-aix72-ppc64-3 test-osuosl-aix72-ppc64-3.adoptopenjdk.net
140.211.9.168 test-osuosl-aix72-ppc64-3 test-osuosl-aix72-ppc64-3.adoptopenjdk.net
==============
adopt08:
10.1.0.177 adopt10 adopt10
140.211.9.166 adopt10
10.1.0.182 adopt07
140.211.9.169 test-osuosl-aix72-ppc64-4 test-osuosl-aix72-ppc64-4.adoptopenjdk.net
140.211.9.169 test-osuosl-aix72-ppc64-4 test-osuosl-aix72-ppc64-4.adoptopenjdk.net
==============
adopt10:
140.211.9.166 build-osuosl-aix72-ppc64-1 build-osuosl-aix72-ppc64-1.adoptopenjdk.net
140.211.9.166 build-osuosl-aix72-ppc64-1 build-osuosl-aix72-ppc64-1.adoptopenjdk.net
Further, I see adopt08 has an unusual address for adopt10
Also, the /etc/hosts file for adopt10 was recently edited, so it is absolutely unclear why it no longer has its own name in /etc/hosts.
Looks like a lot of manual control on all the hosts.
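To spot these by script rather than by eye, here is a small Python sketch that reads output in the shape printed above (a `host:` header, the matching lines, then a row of `=` as separator) and lists the hosts whose /etc/hosts lines never mention their own short name. The function names are hypothetical and the parsing assumes exactly that layout.

```python
from typing import Dict, List, Set


def names_by_host(dsh_output: str) -> Dict[str, Set[str]]:
    """Map each host section to the set of names found in its hosts lines."""
    result: Dict[str, Set[str]] = {}
    current = None
    for raw in dsh_output.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or set(line) == {"="}:
            continue  # blank line or separator row
        if line.endswith(":") and len(line.split()) == 1:
            current = line[:-1]  # section header such as "adopt02:"
            result[current] = set()
            continue
        if current is None:
            continue  # e.g. the shell prompt line before the first section
        fields = line.split()
        result[current].update(fields[1:])  # everything after the address
    return result


def hosts_missing_own_name(dsh_output: str) -> List[str]:
    """Return hosts whose section never lists the host's own name."""
    sections = names_by_host(dsh_output)
    return sorted(h for h, names in sections.items() if h not in names)


# Excerpt of the output above, shortened for illustration.
SAMPLE = """\
adopt02:
10.1.0.177 adopt10 adopt10
140.211.9.166 adopt10
140.211.9.12 build-osuosl-aix72-ppc64-2 build-osuosl-aix72-ppc64-2.adoptopenjdk.net
==============
adopt05:
10.1.0.177 adopt10 adopt10
10.1.0.4 adopt05
==============
adopt10:
140.211.9.166 build-osuosl-aix72-ppc64-1 build-osuosl-aix72-ppc64-1.adoptopenjdk.net
"""

print(hosts_missing_own_name(SAMPLE))  # ['adopt02', 'adopt10']
```

This only checks the short name that tests were complaining about; a host that lists its long Jenkins name but not the short one is still reported as missing.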
For the record: the current last date changed:
root@osunim:[/root]dsh-adopt ls -l /etc/hosts
adopt01:
-rw-rw-r-- 1 root system 1338 Jun 15 23:07 /etc/hosts
==============
adopt02:
-rw-rw-r-- 1 root system 2182 Jun 15 23:07 /etc/hosts
==============
adopt03:
-rw-rw-r-- 1 root system 1344 May 31 17:58 /etc/hosts
==============
adopt04:
-rw-rw-r-- 1 root system 1344 Jun 01 10:12 /etc/hosts
==============
adopt05:
-rw-rw-r-- 1 root system 2132 Jun 22 23:06 /etc/hosts
==============
adopt06:
-rw-rw-r-- 1 root system 2072 Jun 20 11:21 /etc/hosts
==============
adopt07:
-rw-rw-r-- 1 root system 2180 Jun 15 23:07 /etc/hosts
==============
adopt08:
-rw-rw-r-- 1 root system 2199 Jun 8 23:06 /etc/hosts
==============
adopt10:
-rw-rw-r-- 1 root system 2135 Jun 15 23:07 /etc/hosts
Compare this with the last reboot (which approximates the install time for the migrated systems).
root@osunim:[/root]dsh-adopt "last reboot | head -5"
adopt01:
reboot ~ Nov 02 17:02
reboot ~ Aug 01 22:44
reboot ~ Apr 07 07:38
reboot ~ Feb 19 15:17
reboot ~ Nov 23 10:49
==============
adopt02:
reboot ~ May 18 18:31
reboot ~ Oct 03 15:30
reboot ~ Jul 06 21:07
reboot ~ Jun 24 14:41
reboot ~ Jun 23 14:48
==============
adopt03:
reboot ~ Jun 24 09:05
reboot ~ Jan 27 10:54
reboot ~ Sep 22 12:34
reboot ~ Sep 21 12:16
reboot ~ Oct 12 08:11
==============
adopt04:
reboot ~ Apr 14 08:04
reboot ~ Mar 22 14:50
reboot ~ Mar 22 12:23
reboot ~ Mar 22 08:23
reboot ~ Mar 19 16:30
==============
adopt05:
reboot ~ Jun 20 10:56
reboot ~ May 30 17:45
reboot ~ Oct 03 15:30
reboot ~ Jul 06 21:07
reboot ~ Jun 24 14:41
==============
adopt06:
reboot ~ Jun 20 11:19
reboot ~ May 30 17:45
reboot ~ Oct 03 15:30
reboot ~ Jul 06 21:07
reboot ~ Jun 24 14:41
==============
adopt07:
reboot ~ May 30 17:45
reboot ~ Oct 03 15:30
reboot ~ Jul 06 21:07
reboot ~ Jun 24 14:41
reboot ~ Jun 23 14:48
==============
adopt08:
reboot ~ May 30 18:42
reboot ~ May 30 17:45
reboot ~ Oct 03 15:30
reboot ~ Jul 06 21:07
reboot ~ Jun 24 14:41
==============
adopt10:
reboot ~ Oct 03 15:30
reboot ~ Jul 06 21:07
reboot ~ Jun 24 14:41
reboot ~ Jun 23 14:48
reboot ~ Apr 24 11:20
I see that back in 2021 I made a central hosts file for all the hosts.
root@osunim:[/root]ls -l etc
total 8
-rw------- 1 root system 1254 Nov 25 2021 hosts.adopt
I'll update and redeploy to all the hosts.
Here is the diff
root@osunim:[/root/etc]diff -u *
--- hosts.adopt 2021-11-25 17:39:11.000000000 +0000
+++ hosts.adopt.2023 2023-06-28 07:59:05.505032615 +0000
@@ -1,15 +1,13 @@
127.0.0.1 loopback localhost
::1 loopback localhost
-82.161.237.226 bigfix.home.local # BigFix relay
10.1.0.22 nim.bak
-10.1.0.7 adopt01
#### Internal ####
10.1.0.7 adopt01
10.1.0.8 adopt02
10.1.0.12 adopt03
-10.1.0.16 adopt04 # test-osuosl-aix72-ppc64-2
+10.1.0.16 adopt04
10.1.0.4 adopt05
10.1.0.5 adopt06
10.1.0.182 adopt07
@@ -18,16 +16,16 @@
10.1.0.177 adopt10
#### External IPv4 ####
-140.211.9.10 p8-aix1-adopt01 # build-osuosl-ppc64-aix71-1
-140.211.9.12 p8-aix1-adopt02 # build-osuosl-aix71-ppc64-2
-140.211.9.28 p8-aix1-adopt03
-140.211.9.36 p8-aix1-adopt04
-140.211.9.99 p9-aix1-adopt05
-140.211.9.100 p9-aix1-adopt06
-140.211.9.168 p8-java1-adopt07
-140.211.9.169 p8-java1-adopt08
-140.211.9.163 p8-java1-adopt09
-140.211.9.166 p8-java1-adopt10
+140.211.9.10 p8-aix1-adopt01 # build-osuosl-aix71-ppc64-1
+140.211.9.12 p8-aix1-adopt02 # build-osuosl-aix72-ppc64-2
+140.211.9.28 p8-aix1-adopt03 # test-osuosl-aix72-ppc64-1
+140.211.9.36 p8-aix1-adopt04 # test-osuosl-aix72-ppc64-2
+140.211.9.99 p9-aix1-adopt05 # test-osuosl-aix72-ppc64-5
+140.211.9.100 p9-aix1-adopt06 # test-osuosl-aix72-ppc64-6
+140.211.9.168 p8-java1-adopt07 # test-osuosl-aix72-ppc64-3
+140.211.9.169 p8-java1-adopt08 # test-osuosl-aix72-ppc64-4
+140.211.9.163 p8-java1-adopt09 # TBD
+140.211.9.166 p8-java1-adopt10 # build-osuosl-aix72-ppc64-1
#### External IPv6 ####
2605:bc80:3010:104::8cd3:90a p8-aix1-adopt01
All /etc/hosts files are now synced. Please verify issue is now resolved.
> So the only outstanding AIX 7.1 system that we have is build-osuosl-aix71-ppc64-1 which is currently disabled but we can hold onto that for a short time longer in the (unlikely) event that we need to rebuild anything on it.
> Or perhaps we should just take a mksysb of it and/or redeploy as a WPAR on another host ...
@sxa : Made an mksysb image. Could (next week perhaps) re-install as AIX 7.3 TL1 SP1, and then @Haroon-Khel could run the playbook on it. There will likely be errors (xlc13 might not work, e.g.) - but I'll make a backup of the system so we can rinse and repeat easily.
> All /etc/hosts files are now synced. Please verify issue is now resolved.
I've re-run one of the affected tests on the same machine. Passed first time. Thanks to @sxa and @aixtools :)
Multitudes of tests failing with
java.net.UnknownHostException: adopt06: adopt06: Hostname and service name not provided or found
at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1791)
at NoAddresses.main(NoAddresses.java:51)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:578)
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:125)
at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: java.net.UnknownHostException: adopt06: Hostname and service name not provided or found
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:52)
at java.base/java.net.InetAddress$PlatformResolver.lookupByName(InetAddress.java:1061)
at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1683)
at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1004)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1673)
at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1786)
... 5 more
JavaTest Message: Test threw exception: java.net.UnknownHostException: adopt06: adopt06: Hostname and service name not provided or found
JavaTest Message: shutting down test
STATUS:Failed.`main' threw exception: java.net.UnknownHostException: adopt06: adopt06: Hostname and service name not provided or found
reported in https://github.com/adoptium/infrastructure/issues/3178 which as it currently stands will block releasing AIX JDK20 for the July release.
> block releasing AIX JDK20 for the July release.
We don't ship JDK20 for AIX. Are we only seeing tests fail on that version? I'd be quite surprised if we didn't see similar for other versions on the same machine, although maybe we've got lucky and didn't hit that machine for those :-)
@aixtools Yep a reinstall of one of the machines with AIX 7.3 TL1 SP1 sounds good, especially since you have a mksysb now :-)
Also affecting JDK17, and will block the release for that, noting here: https://github.com/adoptium/aqa-tests/issues/4677#issuecomment-1642083117
I'll create a new issue for AIX 7.3, and overwrite the last AIX 7.1 system.
Raised https://github.com/adoptium/infrastructure/issues/3178 to call out the config issue with a subset of the new machines being brought online.
While this is done, I'll note that we still have 7.1 machines configured in the Temurin compliance project which should be removed. I'll close this issue as the main ones in our build and test Jenkins instance are complete, and I am working on the related issue with local hostname resolution on the others under the issue mentioned above.
AIX 7.1 goes out of support at the end of April: https://github.com/adoptium/infrastructure/wiki/End-of-support-date-for-OS-distributions#aix
We have six 7.1 machines which all need to be upgraded to 7.2, ideally before the end of April; otherwise they will likely be offline after April until they are upgraded:
test-osuosl-aix715-ppc64-1
test-osuosl-aix715-ppc64-2
build-osuosl-aix71-ppc64-1
build-osuosl-aix71-ppc64-2
test-osuosl-aix715-ppc64-3
test-osuosl-aix715-ppc64-4
ping @aixtools @sxa