Closed targos closed 3 years ago
Example: https://ci.nodejs.org/job/node-test-commit-aix/36221/#showFailuresLink
Many tests that use child processes fail with the EAGAIN error code.
/cc @nodejs/platform-aix
There are at least two issues here: the EAGAIN test failures on test-osuosl-aix72-ppc64_be-3, and build timeouts on test-osuosl-aix72-ppc64_be-2, which we thought were addressed in https://github.com/nodejs/build/issues/2566 (cc @AshCripps).
I've logged into test-osuosl-aix72-ppc64_be-3 and, while Jenkins believes that host is idle, I can see a lot of running processes, e.g.
root@test-osuosl-aix72-ppc64_be-3:[/root]ps -ef | grep bash
iojs 11731220 15532534 0 Apr 12 - 0:00 /usr/bin/bash -xe /tmp/jenkins8012660964621729646.sh
iojs 13304150 15532534 0 Apr 08 - 0:00 /usr/bin/bash -xe /tmp/jenkins1832409300680795143.sh
iojs 14942628 15532534 0 Apr 12 - 0:00 /usr/bin/bash -xe /tmp/jenkins1431301206453705270.sh
iojs 15859984 15532534 0 Apr 12 - 0:00 /usr/bin/bash -xe /tmp/jenkins7930803442644380686.sh
iojs 16056804 15532534 0 Apr 12 - 0:00 /usr/bin/bash -xe /tmp/jenkins4968422098257556735.sh
iojs 16974290 15532534 0 Apr 11 - 0:00 /usr/bin/bash -xe /tmp/jenkins5270902707046841306.sh
root@test-osuosl-aix72-ppc64_be-3:[/root]ps -ef | grep gmake
iojs 5964088 13631954 0 Apr 12 - 0:00 gmake
iojs 11141604 15729004 0 Apr 08 - 0:00 gmake
iojs 12779806 11731220 0 Apr 12 - 0:00 gmake run-ci -j 6 JOBS=6
iojs 12976528 16974290 0 Apr 11 - 0:00 gmake run-ci -j 6 JOBS=6
iojs 13173104 5964088 0 Apr 12 - 0:41 gmake -C out BUILDTYPE=Release V=0
iojs 13631954 14942628 0 Apr 12 - 0:00 gmake run-ci -j 6 JOBS=6
root 14025110 13435388 0 01:00:20 pts/0 0:00 grep gmake
iojs 14090512 15139134 0 Apr 12 - 0:26 gmake -C out BUILDTYPE=Release V=0
iojs 14549464 16318966 0 Apr 12 - 0:00 gmake
iojs 15139134 12779806 0 Apr 12 - 0:00 gmake
iojs 15729004 13304150 0 Apr 08 - 0:00 gmake run-ci -j 6 JOBS=6
iojs 15925648 11141604 0 Apr 08 - 0:27 gmake -C out BUILDTYPE=Release V=0
iojs 15991290 12976528 0 Apr 11 - 0:00 gmake
iojs 16253262 15859984 0 Apr 12 - 0:00 gmake run-ci -j 6 JOBS=6
iojs 16318966 16056804 0 Apr 12 - 0:00 gmake run-ci -j 6 JOBS=6
iojs 16581100 16253262 0 Apr 12 - 0:00 gmake
iojs 17432864 14549464 0 Apr 12 - 0:39 gmake -C out BUILDTYPE=Release V=0
iojs 17563920 15991290 0 Apr 11 - 0:40 gmake -C out BUILDTYPE=Release V=0
iojs 18088318 16581100 0 Apr 12 - 0:49 gmake -C out BUILDTYPE=Release V=0
root@test-osuosl-aix72-ppc64_be-3:[/root]ps -ef | grep gmake | wc -l
18
root@test-osuosl-aix72-ppc64_be-3:[/root]ps -ef | grep g++ | wc -l
24
root@test-osuosl-aix72-ppc64_be-3:[/root]ps -ef | grep gcc | wc -l
12
root@test-osuosl-aix72-ppc64_be-3:[/root]
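To make the relationship between those processes easier to see, here is a rough sketch that walks the PPID column back from one of the stuck `gmake -C out` processes to the Jenkins script that started it. The four rows are copied from the `ps -ef` output above; `parse_ps` and `ancestry` are hypothetical helper names made up for this illustration, and the parsing assumes the second and third columns are PID and PPID, as in the output shown.

```python
# Rows copied from the `ps -ef` output above (UID, PID, PPID, ...).
PS_ROWS = """\
iojs 14942628 15532534 0 Apr 12 - 0:00 /usr/bin/bash -xe /tmp/jenkins1431301206453705270.sh
iojs 13631954 14942628 0 Apr 12 - 0:00 gmake run-ci -j 6 JOBS=6
iojs 5964088 13631954 0 Apr 12 - 0:00 gmake
iojs 13173104 5964088 0 Apr 12 - 0:41 gmake -C out BUILDTYPE=Release V=0
"""


def parse_ps(text):
    """Return {pid: (ppid, full_line)} for each `ps -ef`-style row."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any header row (its PID/PPID columns are not numeric).
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        table[int(fields[1])] = (int(fields[2]), line)
    return table


def ancestry(table, pid):
    """Follow PPID links upward until the parent is not in the table."""
    chain, seen = [], set()
    while pid in table and pid not in seen:  # `seen` guards against loops
        chain.append(pid)
        seen.add(pid)
        pid = table[pid][0]
    return chain


table = parse_ps(PS_ROWS)
chain = ancestry(table, 13173104)
print(chain)                                   # [13173104, 5964088, 13631954, 14942628]
print("/tmp/jenkins" in table[chain[-1]][1])   # True
```

The chain ends at a `/usr/bin/bash -xe /tmp/jenkins*.sh` process, which suggests that the leftover `gmake` trees all hang off old Jenkins job scripts and would go away if those scripts were killed.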
I've killed all the iojs-owned bash processes on test-osuosl-aix72-ppc64_be-3, which has terminated the child gmake and gcc/g++ processes as well. Restarted the Jenkins agent for good measure. I ran through the parallel and sequential tests using the current workspace and all of those tests passed:
iojs@test-osuosl-aix72-ppc64_be-3:[/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64]./node --version
v16.0.0-pre
iojs@test-osuosl-aix72-ppc64_be-3:[/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64]tools/test.py -J parallel
[02:14|% 100|+ 2758|- 0]: Done
iojs@test-osuosl-aix72-ppc64_be-3:[/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64]tools/test.py -J sequential
[01:50|% 100|+ 120|- 0]: Done
iojs@test-osuosl-aix72-ppc64_be-3:[/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64]
Started a CI build that's running on test-osuosl-aix72-ppc64_be-3: https://ci.nodejs.org/job/node-test-commit-aix/36226/nodes=aix72-ppc64/
That build and subsequent builds on test-osuosl-aix72-ppc64_be-3 have been passing 🎉: https://ci.nodejs.org/computer/test-osuosl-aix72-ppc64_be-3/builds
For test-osuosl-aix72-ppc64_be-2 the failing builds show:
Build timed out (after 10 minutes). Marking the build as failed.
The 10-minute timeout is something we've set in the job config.
I've looked at the jobs for the other platforms and we seem to be using a range of timeout values, e.g. on LinuxONE we use 5 mins (300 seconds), on arm64 macOS an hour (3600 seconds) and on the x64 Linux job 2 hours (7200 seconds). I'm going to bump the timeout for the AIX job to an hour.
Timeout has been increased to 1 hour: https://github.com/nodejs/jenkins-config-test/commit/1f025a50cebdc3d2075389f80b94128af256ba32
https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix72-ppc64/36241/ has passed on test-osuosl-aix72-ppc64_be-2. FWIW there was a 12 minute "no activity" period in the log that would have timed the build out with the previous 10 min timeout:
11:33:03 g++ -Wl,-bnoerrmsg -pthread -Wl,-bbigtoc -maix64 -Wl,-blibpath:/usr/lib:/lib:/opt/freeware/lib/pthread/ppc64 -Wl,-bE:/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/mkcodecache.exp -Wl,-brtl -pthread -o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/mkcodecache /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/mkcodecache/src/node_snapshot_stub.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/mkcodecache/src/node_code_cache_stub.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/mkcodecache/tools/code_cache/mkcodecache.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/mkcodecache/tools/code_cache/cache_builder.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/libnode.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/histogram/libhistogram.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/uvwasi/libuvwasi.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_snapshot.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_libplatform.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/icu/libicui18n.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/zlib/libzlib.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/llhttp/libllhttp.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/cares/libcares.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/uv/libuv.a 
/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/nghttp2/libnghttp2.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/brotli/libbrotli.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/openssl/libopenssl.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/ngtcp2/libngtcp2.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/ngtcp2/libnghttp3.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/icu/libicuucx.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/icu/libicudata.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_base_without_compiler.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_libbase.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_libsampler.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_zlib.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_compiler.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_initializers.a -lm -lperfstat -ldl -lrt
11:45:41 g++ -Wl,-bnoerrmsg -pthread -Wl,-bbigtoc -maix64 -Wl,-blibpath:/usr/lib:/lib:/opt/freeware/lib/pthread/ppc64 -Wl,-bE:/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/node_mksnapshot.exp -Wl,-brtl -pthread -o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/node_mksnapshot /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/node_mksnapshot/src/node_snapshot_stub.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/node_mksnapshot/src/node_code_cache_stub.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/node_mksnapshot/tools/snapshot/node_mksnapshot.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/node_mksnapshot/tools/snapshot/snapshot_builder.o /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/libnode.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/histogram/libhistogram.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/uvwasi/libuvwasi.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_snapshot.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_libplatform.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/icu/libicui18n.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/zlib/libzlib.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/llhttp/libllhttp.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/cares/libcares.a 
/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/uv/libuv.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/nghttp2/libnghttp2.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/brotli/libbrotli.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/openssl/libopenssl.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/ngtcp2/libngtcp2.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/deps/ngtcp2/libnghttp3.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/icu/libicuucx.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/icu/libicudata.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_base_without_compiler.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_libbase.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_libsampler.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_zlib.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_compiler.a /home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/out/Release/obj.target/tools/v8_gypfiles/libv8_initializers.a -lm -lperfstat -ldl -lrt
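For reference, the gap between those two log lines can be checked with a few lines of Python. This is only a sketch: `quiet_gap` is a made-up helper name, and it assumes both timestamps fall on the same day (no wrap past midnight).

```python
from datetime import datetime, timedelta


def quiet_gap(earlier, later):
    """Time between two HH:MM:SS log timestamps on the same day."""
    fmt = "%H:%M:%S"
    return datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)


# Timestamps of the two consecutive g++ link lines quoted above.
gap = quiet_gap("11:33:03", "11:45:41")

old_limit = timedelta(minutes=10)  # previous timeout
new_limit = timedelta(hours=1)     # timeout after the config change

print(gap)              # 0:12:38
print(gap > old_limit)  # True: would have tripped the 10 minute timeout
print(gap > new_limit)  # False: fits within the 1 hour timeout
```

So the quiet period while linking was 12 minutes 38 seconds, over the old limit but well inside the new one.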
The 21 most recent AIX 7.2 builds have all passed, so I think we can mark this as resolved. Feel free to reopen if symptoms reappear.
Thanks!
See https://ci.nodejs.org/job/node-test-commit-aix/buildTimeTrend. The job fails more than 50% of the time.