Closed andrew-m-leonard closed 3 years ago
6.3G /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix
11G /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_ppc64_aix
Culprit: /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_ppc64_aix/openjdk-tests/TKG/test_output_16086634784597/cmdLineTester_callsitedbgddrext_openj9_0
-rw------- 1 jenkins staff 9592217600 Dec 22 19:46 j9core.dmp
Potential duplicate of #1772 but we can leave open until we confirm
No, this is separate. It is down to a relatively small filesystem (~17 GB) combined with multiple core files being produced, and it is not related to the AWT library.
Late last week, the test role was added back to this machine. See slack thread for details.
Over the weekend, we saw a slew of space-related failures occurring across all test types on this machine.
Example: https://trss.adoptopenjdk.net/output/build?id=603a43ce5730424dbc92c820
Caused by: hudson.plugins.git.GitException: Command "git init /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_ppc64_aix/openjdk-tests" returned status code 128:
stdout:
stderr: error: copy-fd: write returned: No space left on device
fatal: cannot copy '/opt/freeware/share/git-core/templates/description' to '/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_ppc64_aix/openjdk-tests/.git/description': No space left on device
Also, we see this earlier on in the test. One theory is that one test tried to clone openjdk-tests and the clone failed midway through, after eating up what little remaining free space there was.
> git rev-parse --is-inside-work-tree # timeout=10
ERROR: Workspace has a .git repository, but it appears to be corrupt.
hudson.plugins.git.GitException: Command "git rev-parse --is-inside-work-tree" returned status code 128:
stdout:
stderr: fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
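If the theory about a clone dying midway is right, a free-space check before the clone would turn the corrupt-workspace error into an early, readable failure. A minimal sketch, assuming `df` accepts the POSIX `-P` and `-k` flags on this machine; the function names and the threshold in the usage comment are hypothetical, not something the jobs currently do:

```shell
#!/bin/sh
# Print the free space, in KB, of the filesystem holding the given path.
# df -P selects the POSIX output format, where column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 KB are free on the filesystem holding $1.
has_space() {
    avail=$(free_kb "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Hypothetical usage before a clone (2097152 KB is an illustrative 2 GB):
# has_space /home/jenkins/workspace 2097152 || { echo "not enough space" >&2; exit 1; }
```

The check only looks at the filesystem that holds the path, so a full /tmp would need its own call.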
root@p9-aix1-ojdk05:[/root]df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hd4 4.0G 187M 3.9G 5% /
/dev/hd2 6.0G 4.6G 1.5G 77% /usr
/dev/hd9var 6.0G 1.6G 4.5G 27% /var
/dev/hd3 4.0G 4.0G 0 100% /tmp
/dev/hd1 24G 24G 0 100% /home
/dev/hd11admin 128M 380K 128M 1% /admin
/proc - - 0 - /proc
/dev/hd10opt 8.0G 2.0G 6.1G 25% /opt
/dev/livedump 256M 368K 256M 1% /var/adm/ras/livedump
/dev/lvBESC 2.0G 299M 1.8G 15% /var/opt/BESClient
/dev/fslv00 128M 128M 0 100% /audit
root@p9-aix1-ojdk05:[/tmp]rpm -qi bash
Name : bash
Version : 5.0.18
Release : 1
Architecture: ppc
Install Date: Fri Feb 19 10:46:25 2021
Group : System Environment/Shells
Size : 9387707
License : GPLv3+
Signature : (none)
Source RPM : bash-5.0.18-1.src.rpm
Build Date : Fri Sep 18 15:53:11 2020
Build Host : pokndd5.pok.stglabs.ibm.com
Packager : IBM AIX Toolbox <https://ibm.biz/AIXToolbox>
URL : http://www.gnu.org/software/bash
Bug URL : https://ibm.biz/aixoss_forum
Summary : The GNU Bourne Again shell (bash) version 5.0.18
Description :
The GNU Bourne Again shell (Bash) is a shell or command language
interpreter that is compatible with the Bourne shell (sh). Bash
incorporates useful features from the Korn shell (ksh) and the C shell
(csh). Most sh scripts can be run by bash without modification. This
package (bash) contains bash version 5.0.18.
There are 32bit and 64bit binary versions available for bash
In this release, process substitution is not completely working. The output of a command might not be redirected correctly when using <(cmd) or >(cmd).
root@p9-aix1-ojdk05:[/tmp]ls -lt sh-np | head
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.BEGaaa
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.BEGaab
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.CRmMaa
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.CRmMab
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.DVmMaa
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.DVmMab
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.Ffr7aa
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.Ffr7ab
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.0uI7aa
prw------- 1 jenkins staff 0 Feb 19 13:23 sh-np.0uI7ab
root@p9-aix1-ojdk05:[/tmp]ls -ltr sh-np | head
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.fvlqab
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.fvlqaa
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.yiAaab
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.yiAaaa
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.yfaMab
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.yfaMaa
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.yeDaab
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.yeDaaa
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.ybl7ab
prw------- 1 jenkins staff 0 Feb 19 13:01 sh-np.ybl7aa
root@p9-aix1-ojdk05:[/tmp]
* And I see, before I could research the rest - someone else made space on the system.
root@p9-aix1-ojdk05:[/tmp]df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hd4 4.0G 187M 3.9G 5% /
/dev/hd2 6.0G 4.6G 1.5G 77% /usr
/dev/hd9var 6.0G 1.6G 4.5G 27% /var
/dev/hd3 4.0G 123M 3.9G 3% /tmp
/dev/hd1 24G 24G 0 100% /home
/dev/hd11admin 128M 380K 128M 1% /admin
/proc - - 0 - /proc
/dev/hd10opt 8.0G 2.0G 6.1G 25% /opt
/dev/livedump 256M 368K 256M 1% /var/adm/ras/livedump
/dev/lvBESC 2.0G 299M 1.8G 15% /var/opt/BESClient
/dev/fslv00 128M 128M 0 100% /audit
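For spotting the full filesystems in a listing like the one above without reading it by eye, something along these lines might help. It is only a sketch: `full_filesystems` is a made-up name, and it assumes the six-column layout shown here (Use% in column 5, mount point in column 6), which is also what `df -P` prints.

```shell
#!/bin/sh
# Read a df listing on stdin and print "<use%> <mount point>" for every
# filesystem whose usage is at or above the percentage given as $1.
# Rows without a numeric Use% value (such as /proc) are skipped.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0) print $5, $6
    }'
}

# Usage: df -Pk | full_filesystems 95
```

Fed the listing above with a limit of 95, it would report /home and /audit.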
Machine running out of disk space due to multiple cores being generated during execution of Test_openjdk17_j9_extended.functional_ppc64_aix
FYI @Haroon-Khel @aixtools if this isn't occurring on other machines we need to find out what the issue is on this machine - marking it offline for now
OK - looking in /home/jenkins/workspace - lots of directories with 0 MB, and then:
0 build-scripts
1 Grinder@tmp
1 Test_openjdk16_j9_extended.functional_ppc64_aix@tmp
1 Test_openjdk17_j9_extended.functional_ppc64_aix@tmp
1 workspaces.txt
24184 Test_openjdk17_j9_extended.functional_ppc64_aix
root@p9-aix1-ojdk05:[/home/jenkins/workspace]cd Test_openjdk17_j9_extended.functional_ppc64_aix
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix]du -sm * | sort -n
3 functional_test_output.tar.gz
428 jvmtest
1233 openjdkbinary
22522 openjdk-tests
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests]du -sm * | sort -n
1 LICENSE
1 NOTICE
1 README.md
1 SECURITY.md
1 TestConfig
1 Utils
1 autoGen.mk
1 buildenv
1 external
1 get.sh
1 jck
1 openjdk
1 system
2 doc
26 perf
137 functional
22292 TKG
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG]du -sm * | sort -n
1 LICENSE
1 README.md
1 SECURITY.md
1 SHA.txt
1 autoGenEnv.mk
1 bin
1 clean.mk
1 compile.mk
1 envSettings.mk
1 featureSettings.mk
1 makeGen.mk
1 makefile
1 moveDmp.mk
1 openj9Settings.mk
1 playlist.xsd
1 resources
1 runtest.mk
1 scripts
1 settings.mk
1 src
1 testEnv.mk
1 utils.mk
5 lib
22287 output_16143343964904
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG/output_16143343964904]du -sm * | sort -n
2 TestTargetResult
3330 threadMXBeanTimedParkTest_2
9475 threadMXBeanTestSuite2_2
9482 threadMXBeanTestSuite1_6
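The repeated `du -sm * | sort -n` steps above could be automated by always descending into the largest entry. A rough sketch, not a tool that exists on the machine: `largest_entry` and `drill_down` are hypothetical names, and it uses `du -sk` rather than `-sm` because `-k` is the POSIX-specified flag.

```shell
#!/bin/sh
# Print the largest entry (by du -sk) directly inside directory $1.
# Hidden entries are skipped because the * glob does not match them.
largest_entry() {
    du -sk "$1"/* 2>/dev/null | sort -n | tail -n 1 |
        sed 's/^[0-9]*[[:space:]]*//'
}

# Follow the largest entry downwards from $1, printing each step, until
# a regular file, a symlink, or an empty directory is reached.
drill_down() {
    dir=$1
    while [ -d "$dir" ]; do
        next=$(largest_entry "$dir")
        [ -n "$next" ] || break
        printf '%s\n' "$next"
        [ -L "$next" ] && break
        dir=$next
    done
}

# Usage: drill_down /home/jenkins/workspace
```

It follows only one branch, so where several directories are about the same size, as with the two threadMXBeanTestSuite directories here, the full `du` listing is still worth a look.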
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG/output_16143343964904]ls -l
total 1696
-rw------- 1 jenkins staff 1736704 Feb 26 11:37 TestTargetResult
drwx------ 2 jenkins staff 256 Feb 26 11:17 threadMXBeanTestSuite1_6
drwx------ 2 jenkins staff 256 Feb 26 11:30 threadMXBeanTestSuite2_2
drwx------ 2 jenkins staff 256 Feb 26 11:37 threadMXBeanTimedParkTest_2
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG/output_16143343964904]cd threadMXBeanTimedParkTest_2
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG/output_16143343964904/threadMXBeanTimedParkTest_2]du -sm * | sort -n
3330 core.20210226.113712.28639248.0001.dmp
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG/output_16143343964904/threadMXBeanTimedParkTest_2]cd ..
root@p9-aix1-ojdk05:[/home/jenkins/workspace/Test_openjdk17_j9_extended.functional_ppc64_aix/openjdk-tests/TKG/output_16143343964904]ls -lR
.:
total 1696
-rw------- 1 jenkins staff 1736704 Feb 26 11:37 TestTargetResult
drwx------ 2 jenkins staff 256 Feb 26 11:17 threadMXBeanTestSuite1_6
drwx------ 2 jenkins staff 256 Feb 26 11:30 threadMXBeanTestSuite2_2
drwx------ 2 jenkins staff 256 Feb 26 11:37 threadMXBeanTimedParkTest_2
./threadMXBeanTestSuite1_6:
total 9708632
-rw------- 1 jenkins staff 438972 Feb 26 11:17 Snap.20210226.111640.31391816.0003.trc
-rw------- 1 jenkins staff 9934845663 Feb 26 11:17 core.20210226.111640.31391816.0001.dmp
-rw------- 1 jenkins staff 1115725 Feb 26 11:17 javacore.20210226.111640.31391816.0002.txt
-rw------- 1 jenkins staff 8580476 Feb 26 11:17 jitdump.20210226.111640.31391816.0005.dmp
./threadMXBeanTestSuite2_2:
total 9701616
-rw------- 1 jenkins staff 446780 Feb 26 11:30 Snap.20210226.112933.28639480.0003.trc
-rw------- 1 jenkins staff 9936288391 Feb 26 11:30 core.20210226.112933.28639480.0001.dmp
-rw------- 1 jenkins staff 1121857 Feb 26 11:30 javacore.20210226.112933.28639480.0002.txt
-rw------- 1 jenkins staff 202 Feb 26 11:30 jitdump.20210226.112933.28639480.0005.dmp
./threadMXBeanTimedParkTest_2:
total 3409180
-rw------- 1 jenkins staff 3494121472 Feb 26 11:37 core.20210226.113712.28639248.0001.dmp
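Since the three core files account for nearly all of the 22287 MB under the output directory, a search for large core dumps across the workspace may be quicker than walking directories. A sketch under assumptions: `big_cores` is a hypothetical helper, the name patterns are simply the ones seen in this thread (`core.*.dmp`, `j9core.dmp`), and it relies on `find` supporting `-exec ... {} +`.

```shell
#!/bin/sh
# List core dump files under $1 that are larger than $2 MB, as
# "<size in KB> <path>" lines sorted so the largest comes last.
big_cores() {
    blocks=$(( $2 * 2048 ))   # find -size counts 512-byte blocks
    find "$1" -type f \( -name 'core.*.dmp' -o -name 'j9core.dmp' \) \
        -size +"$blocks" -exec du -sk {} + | sort -n
}

# Usage: big_cores /home/jenkins/workspace 1024
```

The function only lists files. Whether to delete or archive them is left to whoever runs it, since someone may still want to analyse the dumps.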
I am copying the TKG directory before removing it, so that someone who understands the .dmp files can look at them.
There's a certain irony in the fact that the only machine that can generate core files ran out of disk space when it runs the tests that generate them ;-)
I think that is also known as Murphy's Law - in some variation or another.
Just ran a build on this system, as the test-ibm-aix71-ppc64-{1,2} machines are unavailable. https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk/job/jdk-aix-ppc64-openj9/385/
@andrew-m-leonard - are you ok that we close this one - as no longer relevant (as in no longer occurring)?
yes good, thanks
test-osuosl-aix71-ppc64-1 out of disk space