Currently sharing with one of the build boxes, but leaving this open since we could really do with more ...
Two new machines allocated. Various playbook modifications needed to stabilise them. Currently failing on a `git init` operation:
Running on test-osuosl-ppc64-aix-71-1 in /home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9
[Pipeline] {
[Pipeline] stage
[Pipeline] { (build)
[Pipeline] checkout
No credentials specified
Cloning the remote Git repository
ERROR: Error cloning remote repo 'origin'
hudson.plugins.git.GitException: Could not init /home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$5.execute(CliGitAPIImpl.java:882)
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:662)
at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$GitCommandMasterToSlaveCallable.call(RemoteGitImpl.java:161)
at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$GitCommandMasterToSlaveCallable.call(RemoteGitImpl.java:154)
at hudson.remoting.UserRequest.perform(UserRequest.java:212)
at hudson.remoting.UserRequest.perform(UserRequest.java:54)
at hudson.remoting.Request$2.run(Request.java:369)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:819)
Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to test-osuosl-ppc64-aix-71-1
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
at hudson.remoting.Channel.call(Channel.java:957)
at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.execute(RemoteGitImpl.java:146)
at sun.reflect.GeneratedMethodAccessor372.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.invoke(RemoteGitImpl.java:132)
at com.sun.proxy.$Proxy98.execute(Unknown Source)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1135)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1175)
at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:124)
at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: hudson.plugins.git.GitException: Command "git init /home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9" returned status code 255:
stdout:
stderr: exec(): 0509-036 Cannot load program git because of the following errors:
0509-150 Dependent module /usr/lib/libiconv.a(libiconv.so.2) could not be loaded.
0509-152 Member libiconv.so.2 is not found in archive
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2318)
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2248)
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2244)
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1777)
at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$5.execute(CliGitAPIImpl.java:880)
... 11 more
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] echo
Execution error: Error cloning remote repo 'origin'
[Pipeline] End of Pipeline
Finished: FAILURE
The previous error message only occurs when `git` is executed from Java (in this case the Jenkins agent process), because Java adds `/usr/lib` to the `LIBPATH`, which stops `git` from picking up the desired version of the library (the one in `/opt/freeware/lib`).
The git installed on the new machine is 2.20; the version on the older machines is 2.8.1. I have copied the git rpm from another machine (from `/var/cache/yum/AIX_Toolbox/packages`) and replaced the one on the new machine with the old version (`rpm -e git; rpm -ivh /opt/.../git-2.8.1-1.aix6.1.ppc.rpm`), and I believe that will rectify the problem.
Jenkins agent is having consistent issues when using AdoptOpenJDK 8u222 OpenJ9 build:
Running on test-osuosl-ppc64-aix-71-1 in /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-aix-ppc64-openj9
[Pipeline] {
[Pipeline] stage
[Pipeline] { (build)
[Pipeline] checkout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] echo
Execution error: java.io.IOException: Unexpected termination of the channel
[Pipeline] End of Pipeline
Finished: FAILURE
I have switched it to use the 64-bit IBM java8 build from https://developer.ibm.com/javasdk/support/aix-download-service/ for now and it appears to be progressing ok. The above log was from https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-openj9/301/console - 302 is being run with this version:
# ./java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 8.0.5.37 - pap6480sr5fp37-20190618_01(SR5 FP37))
IBM J9 VM (build 2.9, JRE 1.8.0 AIX ppc64-64-Bit Compressed References 20190617_419755 (JIT enabled, AOT enabled)
OpenJ9 - 354b31d
OMR - 0437c69
IBM - 4972efe)
JCL - 20190606_01 based on Oracle jdk8u211-b25
#
`wget` is hitting the same problem that `git` had. Will add `/opt/freeware/lib` to the start of the `LIBPATH` in the `aix.sh` build environment script in order to compensate.
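A minimal sketch of that `LIBPATH` change, assuming a POSIX-style shell. The helper name `prepend_dir` is hypothetical and is not taken from `aix.sh`; it puts a directory at the front of a colon-separated path and drops any later duplicate of it.

```shell
# Print a colon-separated search path with "$1" moved (or added) to the front.
# prepend_dir is a hypothetical helper, not part of the real aix.sh.
prepend_dir() {
    dir=$1
    rest=$2
    result=$dir
    while [ -n "$rest" ]; do
        entry=${rest%%:*}            # first remaining path element
        case $rest in
            *:*) rest=${rest#*:} ;;  # drop that element and its colon
            *)   rest= ;;            # that was the last element
        esac
        # Skip empty elements and any existing copy of the directory.
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            result="$result:$entry"
        fi
    done
    printf '%s\n' "$result"
}

# Make /opt/freeware/lib the first place the loader searches.
LIBPATH=$(prepend_dir /opt/freeware/lib "${LIBPATH:-}")
export LIBPATH
```

Dropping the duplicate keeps the path from growing each time the script is sourced.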
Building at https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-openj9/303/console - will let it progress overnight
The new machines are noticeably faster when running builds - about 30% even without running from a ramdisk. Once they are both fully set up and verified to work I may use the new ones for building and assign the two older ones to test
The agent throws an OutOfMemory error by default when it is run using the adoptopenjdk builds. I tried adjusting the Advanced options in the machine definition's "Launch Method" section for the agent startup to have:
JavaPath: /usr/jdk8u222-b04/bin/java
JVM Options: -Xmx1024m
Also temporarily tried increasing `rss` for the jenkins user to 32GB (67108864) in `/etc/security/limits`, but it didn't help.
Current `ulimit` values are as follows:
$ ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) 131072
file size (blocks, -f) unlimited
max memory size (kbytes, -m) 32768
open files (-n) unlimited
pipe size (512 bytes, -p) 64
stack size (kbytes, -s) 32768
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
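To relate the numbers above, here is a small conversion sketch. It assumes, as the AIX documentation describes, that size values in `/etc/security/limits` are counted in 512-byte blocks, while `ulimit -a` reports kbytes. The function names are illustrative only.

```shell
# Convert a /etc/security/limits size (512-byte blocks, assumed) to GiB.
blocks_to_gib() {
    echo $(( $1 * 512 / 1024 / 1024 / 1024 ))
}

# Convert a ulimit value reported in kbytes to MiB.
kbytes_to_mib() {
    echo $(( $1 / 1024 ))
}

blocks_to_gib 67108864   # the rss value that was tried: prints 32
kbytes_to_mib 131072     # the "data seg size" shown above: prints 128
```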
I have switched back to using the IBM J9 VM for now (which is what we use on the two existing AIX systems). My understanding is that the OpenJ9 project is using an AdoptOpenJDK VM on some of their AIX machines (I've tried 8u181 and 8u222 with the same failures). @AdamBrousseau @jdekonin any idea what might be different on your systems that you have at OpenJ9 which are running the agent with a non-IBM JRE?
Hitting separate issues beyond that with the filesystem building jdk8u - need to verify if this is specific to the machine or not since jdk11u built ok on this machine.
## Starting jdk
find: 0652-019 The status on /home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-hotspot/workspace/build/src/build/aix-ppc64-normal-server-release/hotspot/dist/lib is not valid.
gmake[2]: *** No rule to make target '/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-hotspot/workspace/build/src/build/aix-ppc64-normal-server-release/corba/dist/lib/classes.jar', needed by '/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-hotspot/workspace/build/src/build/aix-ppc64-normal-server-release/jdk/classes/_the.CORBA.classes.imported'. Stop.
gmake[1]: *** [BuildJdk.gmk:51: import-only] Error 2
gmake: *** [/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-hotspot/workspace/build/src//make/Main.gmk:117: jdk-only] Error 2
OpenJ9 build job on the new machine is throwing this failure:
gmake[4]: *** No rule to make target '/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9/workspace/build/src/build/aix-ppc64-normal-server-release/vm/compiler/../omr/compiler/p/codegen/OMRInstOpCode.cpp', needed by '/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9/workspace/build/src/build/aix-ppc64-normal-server-release/vm/compiler/../objs/omr/compiler/p/codegen/OMRInstOpCode.o'. Stop.
gmake[4]: *** Waiting for unfinished jobs....
gmake[3]: *** [makefile:69: default] Error 2
gmake[2]: *** [makefile:1078: j9jitlauncher] Error 2
gmake[1]: *** [/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9/workspace/build/src/closed/OpenJ9.gmk:439: build-j9] Error 2
Doesn't occur on the existing build machines. Now trying to reproduce, but pinging @pshipton in case this has been seen elsewhere and may be a known transient issue (couldn't find a reference to it in any issue elsewhere).
`Text::CSV` wasn't found by the test suite:
17:05:58 perl configure.pl
17:05:58 cd /home/jenkins/workspace/Test_openjdk8_j9_special.functional_ppc64_aix/openjdk-tests/TestConfig/scripts/testKitGen; \
17:05:58 perl testKitGen.pl --graphSpecs=aix_ppc-64_cmprssptrs --jdkVersion=8 --impl=openj9 --buildList=functional --iterations=1 --testFlag= ; \
17:05:58 cd /home/jenkins/workspace/Test_openjdk8_j9_special.functional_ppc64_aix/openjdk-tests/TestConfig;
17:06:00 Can't locate Text/CSV.pm in @INC (you may need to install the Text::CSV module) (@INC contains: ./makeGenTool /opt/freemarker/lib/perl5 /opt/freeware/lib/perl5/site_perl/5.28.1/ppc-aix-thread-multi /opt/freeware/lib/perl5/site_perl/5.28.1 /opt/freeware/lib/perl5/5.28.1/ppc-aix-thread-multi /opt/freeware/lib/perl5/5.28.1 /opt/freeware/lib/perl5/site_perl) at makeGenTool/parseFiles.pl line 27.
17:06:00 BEGIN failed--compilation aborted at makeGenTool/parseFiles.pl line 27.
17:06:00 Compilation failed in require at makeGenTool/mkgen.pl line 93.
17:06:00 Using projectRootDir: /home/jenkins/workspace/Test_openjdk8_j9_special.functional_ppc64_aix/openjdk-tests/TestConfig/scripts/testKitGen/../../..
17:06:00 Getting modes data from modes.xml and ottawa.csv...
17:06:00 gmake[1]: Leaving directory '/home/jenkins/workspace/Test_openjdk8_j9_special.functional_ppc64_aix/openjdk-tests/TestConfig'
17:06:00 makefile:39: count.mk: A file or directory in the path name does not exist.
17:06:00 gmake: *** No rule to make target 'count.mk'. Stop.
The module is under `/opt/freeware/lib/perl51` but not `/opt/freeware/lib/perl5/5.28.1` - will symlink it under the `Text` directory in the place it's currently looking for now, and rerun:
Failing run: https://ci.adoptopenjdk.net/job/Test_openjdk8_j9_special.functional_ppc64_aix/10/console
Re-run: https://ci.adoptopenjdk.net/job/Test_openjdk8_j9_special.functional_ppc64_aix/11/console
Another odd random failure showing up next time round on the hotspot build (324)
Running ddrgen to generate j9ddr.dat and superset.dat
Blob written to file: ../j9ddr.dat
Superset written to file: ../superset.dat
## Starting corba
Compiling 6 files for BUILD_LOGUTIL
/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9/workspace/build/src/corba/src/share/classes/com/sun/tools/corba/se/logutil/IndentingPrintWriter.java:35: error: cannot access Object
public class IndentingPrintWriter extends PrintWriter {
^
class file for java.lang.Object not found
/home/jenkins/workspace/build-scripts/jobs/jdk8u/jdk8u-aix-ppc64-openj9/workspace/build/src/corba/src/share/classes/com/sun/tools/corba/se/logutil/IndentingPrintWriter.java:38: error: cannot find symbol
private String indentString = "" ;
^
symbol: class String
location: class IndentingPrintWriter
The `XML::Parser` CPAN module has also failed to install, so I'll have to remove the test tags from the machine for now.
The original machines are using a version of perl installed under `/usr` from the AIX `perl.rte` package (version 5.10.1.250) as opposed to the 5.28.1 installed via an RPM.
The new machine also has `/opt/freeware/bin` at the start of the `PATH`, which makes it pick up that version first ...
Now trying to remove all that and install `Text::CSV` and `XML::Parser` into the system perl.
EDIT: Multiple linkage failures when I try that.
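Since two perl installations are in play, a quick way to see which one can load the modules the test suite needs is to ask each interpreter directly. This is only a sketch: `perl_has_module` is a made-up helper name, and the interpreter paths in the comments are assumptions based on the locations mentioned above.

```shell
# Succeed if the given perl interpreter (default: first "perl" on PATH)
# can load the named module. perl_has_module is a hypothetical helper.
perl_has_module() {
    module=$1
    perl_bin=${2:-perl}
    "$perl_bin" -M"$module" -e 1 >/dev/null 2>&1
}

# Report on the two modules discussed in this thread.
for module in Text::CSV XML::Parser; do
    if perl_has_module "$module"; then
        echo "$module: found"
    else
        echo "$module: missing"
    fi
done

# To compare interpreters explicitly (paths are assumptions):
#   perl_has_module Text::CSV /usr/bin/perl
#   perl_has_module Text::CSV /opt/freeware/bin/perl
```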
Have removed the `test` tag from the new machine, but also removed `build` from build-osuosl-ppc64-aix-71-1 for now in order to leave the latter dedicated to test since we have two mostly working build machines.
This machine is still giving various build failures which I haven't yet been able to fully understand and diagnose.
I now have another two AIX boxes available from another source. They aren't yet set up for our needs, but I will be looking at getting them installed with a level suitable for the OpenJ9 folks as per #1006.
AIX 7.1TL5SP5 at IBM PCC: b9s010a@p159a02.centers.ihost.com
AIX 7.2 at OSUOSL: 140.211.9.36
There are some issues with the first AIX 7.2 system as per @smlambert's comments on slack, which I will repeat here:
shelley.lambert 12:56 AM
I've removed the test tag from test-osuosl-ppc64-aix-71-1, as there seems to be a couple of issues remaining https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1649/console:
• building openjdk tests fail, looks like executing the tar command at https://github.com/AdoptOpenJDK/openjdk-tests/blob/master/openjdk/build.xml#L38
• then later when archiving results, tar fails to run (the version on that machine does not seem to recognize the z flag)
Ref: https://adoptopenjdk.slack.com/archives/C53GHCXL4/p1578531390044000?thread_ts=1578516768.037100
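Where the installed `tar` does not recognise the `z` flag, a common workaround is to pipe through `gzip` instead. The sketch below shows that idea; the helper names are hypothetical and this is not what the test framework currently does.

```shell
# Extract a gzip-compressed tarball into a directory without using tar's z flag.
extract_tgz() {
    archive=$1
    dest=${2:-.}
    mkdir -p "$dest" && gzip -dc "$archive" | (cd "$dest" && tar -xf -)
}

# Create a gzip-compressed tarball the same way: create_tgz out.tar.gz files...
create_tgz() {
    archive=$1
    shift
    tar -cf - "$@" | gzip > "$archive"
}

# Example (file names are placeholders):
#   create_tgz results.tar.gz output_dir
#   extract_tgz results.tar.gz /tmp/unpacked
```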
OK, p159a02.centers.ihost.com has been run through the AIX playbook (with a little hand-holding), and should hopefully be ready to try.
I've added the installp for `aixtools.git`, and a wrapper for `wget` (`/opt/freeware/bin/wget_64_fix`), which should deal with the known issues with `libiconv.a`, and I've deliberately renamed the `/opt/freeware/bin/basename` symlink (to `basename.freeware`) because it upsets xlc.
Let me know if you run into any problems, and I'll take a look on Monday.
The 2nd machine (`140.211.9.36`) needs python and yum before I can even get started, so I'll get those sorted out on Monday too.
The default git
in the path on 129.33.196.210 (Second AIX71TL5SP5 system) was causing problems (/usr/bin/git
was symlinked to /opt/freeware/bin/git
). I have removed the rpm-installed git (rpm -e git
) and set the symlink to point to the installp one (ln -s /opt/bin/git /usr/bin/git
) to resolve, although it needed a further update as we have an outdated cacerts so git can't validate github.com's certificate
Re-testing a `jdk_math` at https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1943/
And extended.system at https://ci.adoptopenjdk.net/view/Test_system/job/Test_openjdk11_hs_extended.system_ppc64_aix/8
https://ci.adoptopenjdk.net/view/Test_system/job/Test_openjdk11_hs_extended.system_ppc64_aix/6/
Also on the second AIX71TL5SP5 system I've had to add `libiconv.so.2` to `/usr/lib/libiconv.a` (steps below). Without it jenkins was failing to run shell scripts properly, e.g.:
[Pipeline] sh
19:17:18 process apparently never started in /home/jenkins/workspace/Grinder@tmp/durable-3e045e9b
19:17:18 (running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the problem clearer)
[Pipeline] }
Steps as follows:
mkdir /tmp/mylibiconv
cd /tmp/mylibiconv
cp /usr/lib/libiconv.a .
ar -X32 x /opt/freeware/lib/libiconv.a libiconv.so.2
ar -X32 r libiconv.a libiconv.so.2
rm libiconv.so.2
ar -X64 x /opt/freeware/lib/libiconv.a libiconv.so.2
ar -X64 r libiconv.a libiconv.so.2
mv /usr/lib/libiconv.a /usr/lib/libiconv.a-DIST.$$ && mv libiconv.a /usr/lib/libiconv.a
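After steps like these it is worth confirming that the member really is present in both the 32-bit and 64-bit views of the archive. A small sketch: `has_member` is a hypothetical helper that reads an archive listing on standard input, and the `ar` invocations in the comments follow the same `-X32`/`-X64` style used in the steps above.

```shell
# Succeed if the archive listing on stdin contains exactly the named member.
# has_member is a hypothetical helper, not an existing tool.
has_member() {
    member=$1
    grep -Fqx -- "$member"
}

# On the AIX host:
#   ar -X32 t /usr/lib/libiconv.a | has_member libiconv.so.2 && echo "32-bit member present"
#   ar -X64 t /usr/lib/libiconv.a | has_member libiconv.so.2 && echo "64-bit member present"
```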
`140.211.9.36` is now done, but it was a bit of a struggle getting started because the python installed by `yum.sh` (using a March 2019 `yum_bundle.tar` - apparently the latest) couldn't load `libintl.a`, so `yum` itself wouldn't work.
Got around it by temporarily replacing both `/usr/lib/libintl.a` and `/opt/freeware/lib/libintl.a` with symlinks to `/usr/opt/rpm/lib/libintl.a` to get yum working in order to run `yum update` (manually - it wouldn't run from the playbook).
The updates refreshed `/opt/freeware/lib/libintl.a` with a newer version and redirected `/usr/lib/libintl.a` to link to it, but this broke yum again, so had to reset `/usr/lib/libintl.a` back to link to `/usr/opt/rpm/lib/libintl.a`.
As before, I've installed `git` from `aixtools`, added a wrapper for `wget`, and hidden the `/opt/freeware/bin/basename` symlink to stop `xlc` getting upset.
Crossing fingers and hoping it'll be ok.
140.211.9.36 ([test-osuosl-ppc64-aix-72-2](https://ci.adoptopenjdk.net/computer/test-osuosl-ppc64-aix-72-2)) was throwing `java.lang.OutOfMemoryError: native memory exhausted`
I've resolved it by setting the advanced options on the jenkins machine definition to have this as the `Prefix Start Agent Command` value:
export LDR_CNTRL=MAXDATA=0x80000000 &&
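The hex value given to `MAXDATA` is a byte count, so a quick conversion shows what the setting asks for. This is only an arithmetic sketch with made-up function names. The 256 MB segment size is AIX's segment size for 32-bit processes, stated here as background and not taken from this thread; bash-style hex arithmetic is assumed.

```shell
# Convert a MAXDATA value (bytes, usually written in hex) to MiB.
maxdata_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Express the same value as a count of 256 MB (0x10000000-byte) segments.
maxdata_segments() {
    echo $(( $1 / 0x10000000 ))
}

maxdata_to_mib 0x80000000     # prints 2048
maxdata_segments 0x80000000   # prints 8
```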
With the installation of xlc16 on `build-osuosl-ppc64-aix-71-2` by me today we will move the JDK13+ builds onto there from the `build-ibm-` systems, which can therefore be reallocated 100% for testing.
So to be clear ... these are the machines we now have (in theory, subject to final verification) for testing AIX:
name | former name | IP | OS level |
---|---|---|---|
build-osuosl-aix71-ppc64-1 | build-osuosl-ppc64-aix-71-1 | 140.211.9.10 | 7100-04 |
test-ibm-ppc64-aix-71-1 | test-ibm-ppc64-aix-71-1 | 129.33.196.209 | 7100-05 |
test-ibm-ppc64-aix-71-2 | build-ibm-ppc64-aix-71-1 | 129.33.196.210 | 7100-05 |
test-osuosl-aix72-ppc64-1 | test-osuosl-ppc64-aix-72-1 | 140.211.9.28 | 7200-02 |
test-osuosl-aix72-ppc64-2 | test-osuosl-ppc64-aix-72-2 | 140.211.9.36 | 7200-02 |
The above name changes (made in jenkins) bring the machines in line with the entries in inventory.yml after this is merged
Similar to #133, there is currently no machine available for running openjdk regression, system, external, perf, and functional tests against the AIX builds. (For reference on which tests are enabled and which are not because no machines are available, please see https://docs.google.com/spreadsheets/d/1X4CCfvMoCgEavRbvejHrTvPnqj37MB-_C6LB6b8Akkc/edit?usp=sharing).