Closed mattcolegate closed 7 years ago
To add some more info, the version downloaded was https://nodejs.org/dist/latest-v7.x/node-v7.7.3-linux-ppc64.tar.gz.
Running node -e 'console.log("hi")' also segfaults.
Can you obtain a stack trace?
Ouch - this is from a RHEL 7.0 box (EDIT: also occurs on 7.1)
#0 0x00003fffb7aaa4e0 in .__memset_power7 () from /lib64/power8/libc.so.6
#1 0x0000000010f69094 in ._ZN2v88internal18RegExpResultsCache5ClearEPNS0_10FixedArrayE ()
#2 0x0000000010cace24 in ._ZN2v88internal4Heap19MarkCompactPrologueEv ()
#3 0x0000000010cbf964 in ._ZN2v88internal4Heap11MarkCompactEv ()
#4 0x0000000010cca010 in ._ZN2v88internal4Heap24PerformGarbageCollectionENS0_16GarbageCollectorENS_15GCCallbackFlagsE ()
#5 0x0000000010cca4f8 in ._ZN2v88internal4Heap14CollectGarbageENS0_16GarbageCollectorENS0_23GarbageCollectionReasonEPKcNS_15GCCallbackFlagsE ()
#6 0x0000000010ccce30 in ._ZN2v88internal4Heap12ReserveSpaceEPNS0_4ListINS1_5ChunkENS0_25FreeStoreAllocationPolicyEEEPNS2_IPhS4_EE ()
#7 0x000000001104a598 in ._ZN2v88internal12Deserializer11DeserializeEPNS0_7IsolateE ()
#8 0x0000000010dcea9c in ._ZN2v88internal7Isolate4InitEPNS0_12DeserializerE ()
#9 0x00000000110560b4 in ._ZN2v88internal8Snapshot10InitializeEPNS0_7IsolateE ()
#10 0x000000001075cca8 in ._ZN2v87Isolate3NewERKNS0_12CreateParamsE ()
#11 0x00000000111f6f04 in ._ZN4node5StartEP9uv_loop_siPKPKciS5_ ()
#12 0x00000000111f650c in ._ZN4node5StartEiPPc ()
#13 0x000000001052b720 in .main ()
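The frame names in the backtrace are Itanium-ABI mangled C++ symbols, and their qualified names can be recovered by hand. The following is a minimal sketch, not a full demangler: it only decodes the length-prefixed nested name after `_ZN` and ignores the parameter types that follow the closing `E`. The function name `demangle_nested` is hypothetical; `c++filt` is the proper tool when it is available.

```python
def demangle_nested(symbol):
    """Decode the nested name of a mangled symbol such as
    '._ZN2v88internal4Heap11MarkCompactEv'.

    Assumption: the symbol uses the plain '_ZN<len><id>...E' form.
    Parameter types after the closing 'E' are ignored.
    """
    name = symbol.lstrip(".")  # drop the leading dot seen on ppc64 frames
    if not name.startswith("_ZN"):
        return name  # not a nested mangled name, e.g. 'main'
    pos = 3
    parts = []
    while pos < len(name) and name[pos].isdigit():
        end = pos
        while end < len(name) and name[end].isdigit():
            end += 1
        length = int(name[pos:end])            # decimal length prefix
        parts.append(name[end:end + length])   # identifier of that length
        pos = end + length
    return "::".join(parts)


if __name__ == "__main__":
    for frame in (
        "._ZN2v88internal18RegExpResultsCache5ClearEPNS0_10FixedArrayE",
        "._ZN2v88internal4Heap19MarkCompactPrologueEv",
        "._ZN4node5StartEiPPc",
    ):
        print(demangle_nested(frame))
```

Applied to frames #1 and #2, this gives the `RegExpResultsCache::Clear` and `Heap::MarkCompactPrologue` names in the `v8::internal` namespace.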
@sxa555 Can you post the output of disassemble and info registers?
FYI I've built 7.7.3 from commit 9c68a69 locally and it doesn't fail in the same way. This is with the following version of gcc (have the build boxes been upgraded recently to 4.9? I haven't used that yet):
gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16)
@bnoordhuis (For reference this output was just running "node" without any parameters)
(gdb) bt
#0 0x00003fffb7aaa4e0 in .__memset_power7 () from /lib64/power8/libc.so.6
#1 0x0000000010f69094 in ._ZN2v88internal18RegExpResultsCache5ClearEPNS0_10FixedArrayE ()
#2 0x0000000010cace24 in ._ZN2v88internal4Heap19MarkCompactPrologueEv ()
#3 0x0000000010cbf964 in ._ZN2v88internal4Heap11MarkCompactEv ()
#4 0x0000000010cca010 in ._ZN2v88internal4Heap24PerformGarbageCollectionENS0_16GarbageCollectorENS_15GCCallbackFlagsE ()
#5 0x0000000010cca4f8 in ._ZN2v88internal4Heap14CollectGarbageENS0_16GarbageCollectorENS0_23GarbageCollectionReasonEPKcNS_15GCCallbackFlagsE ()
#6 0x0000000010ccce30 in ._ZN2v88internal4Heap12ReserveSpaceEPNS0_4ListINS1_5ChunkENS0_25FreeStoreAllocationPolicyEEEPNS2_IPhS4_EE ()
#7 0x000000001104a598 in ._ZN2v88internal12Deserializer11DeserializeEPNS0_7IsolateE ()
#8 0x0000000010dcea9c in ._ZN2v88internal7Isolate4InitEPNS0_12DeserializerE ()
#9 0x00000000110560b4 in ._ZN2v88internal8Snapshot10InitializeEPNS0_7IsolateE ()
#10 0x000000001075cca8 in ._ZN2v87Isolate3NewERKNS0_12CreateParamsE ()
#11 0x00000000111f6f04 in ._ZN4node5StartEP9uv_loop_siPKPKciS5_ ()
#12 0x00000000111f650c in ._ZN4node5StartEiPPc ()
#13 0x000000001052b720 in .main ()
(gdb) info registers
r0 0x1 1
r1 0x3fffffffda00 70368744167936
r2 0x3fffb7be4410 70367531910160
r3 0xf 15
r4 0x0 0
r5 0x7ff 2047
r6 0x0 0
r7 0x30 48
r8 0x40 64
r9 0x400 1024
r10 0xf 15
r11 0x7 7
r12 0x800 2048
r13 0x3fffb7ffe190 70367536210320
r14 0x3fffffffe400 70368744170496
r15 0xffffffffba2e8ba3 18446744072538196899
r16 0x11fa4ab0 301615792
r17 0x7a940 502080
r18 0x7a940 502080
r19 0x0 0
r20 0x0 0
r21 0x1 1
r22 0x11f63a80 301349504
r23 0x2 2
r24 0x11f51e0e 301276686
r25 0x0 0
r26 0x0 0
r27 0x56f50 356176
r28 0x0 0
r29 0x11f63aa0 301349536
r30 0x11e1ce00 300011008
r31 0x3fffffffda00 70368744167936
pc 0x3fffb7aaa4e0 0x3fffb7aaa4e0 <.__memset_power7+64>
msr 0x800000010000d032 9223372041149796402
cr 0x44044841 1141131329
lr 0x10f69094 0x10f69094 <._ZN2v88internal18RegExpResultsCache5ClearEPNS0_10FixedArrayE+36>
ctr 0x3fffb7aaa4a0 70367530624160
xer 0x0 0
orig_r3 0xc00000000000908c -4611686018427350900
trap 0x300 768
(gdb) disassemble
Dump of assembler code for function .__memset_power7:
0x00003fffb7aaa4a0 <+0>: cmpldi cr7,r5,31
0x00003fffb7aaa4a4 <+4>: cmpldi cr6,r5,8
0x00003fffb7aaa4a8 <+8>: mr r10,r3
0x00003fffb7aaa4ac <+12>: rlwimi r4,r4,8,16,23
0x00003fffb7aaa4b0 <+16>: rlwimi r4,r4,16,0,15
0x00003fffb7aaa4b4 <+20>: ble cr6,0x3fffb7aaa830 <.__memset_power7+912>
0x00003fffb7aaa4b8 <+24>: neg r0,r3
0x00003fffb7aaa4bc <+28>: ble cr7,0x3fffb7aaa7a0 <.__memset_power7+768>
0x00003fffb7aaa4c0 <+32>: andi. r11,r10,7
0x00003fffb7aaa4c4 <+36>: rldimi r4,r4,32,0
0x00003fffb7aaa4c8 <+40>: mr r12,r5
0x00003fffb7aaa4cc <+44>: beq 0x3fffb7aaa500 <.__memset_power7+96>
0x00003fffb7aaa4d0 <+48>: clrldi r0,r0,61
0x00003fffb7aaa4d4 <+52>: mtocrf 1,r0
0x00003fffb7aaa4d8 <+56>: subf r5,r0,r5
0x00003fffb7aaa4dc <+60>: bns cr7,0x3fffb7aaa4e8 <.__memset_power7+72>
=> 0x00003fffb7aaa4e0 <+64>: stb r4,0(r10)
0x00003fffb7aaa4e4 <+68>: addi r10,r10,1
0x00003fffb7aaa4e8 <+72>: bne cr7,0x3fffb7aaa4f4 <.__memset_power7+84>
0x00003fffb7aaa4ec <+76>: sth r4,0(r10)
0x00003fffb7aaa4f0 <+80>: addi r10,r10,2
0x00003fffb7aaa4f4 <+84>: ble cr7,0x3fffb7aaa500 <.__memset_power7+96>
0x00003fffb7aaa4f8 <+88>: stw r4,0(r10)
0x00003fffb7aaa4fc <+92>: addi r10,r10,4
0x00003fffb7aaa500 <+96>: cmpldi cr5,r5,255
0x00003fffb7aaa504 <+100>: li r0,32
0x00003fffb7aaa508 <+104>: dcbtst 0,r10
0x00003fffb7aaa50c <+108>: cmpldi cr6,r4,0
0x00003fffb7aaa510 <+112>: rldicl r9,r5,61,3
0x00003fffb7aaa514 <+116>: crand 4*cr6+so,4*cr6+eq,4*cr5+gt
0x00003fffb7aaa518 <+120>: mtocrf 1,r9
0x00003fffb7aaa51c <+124>: bso cr6,0x3fffb7aaa5e0 <.__memset_power7+320>
0x00003fffb7aaa520 <+128>: rldicl r8,r5,59,5
0x00003fffb7aaa524 <+132>: clrldi r11,r5,61
0x00003fffb7aaa528 <+136>: cmpldi cr6,r11,0
0x00003fffb7aaa52c <+140>: cmpldi cr1,r9,4
0x00003fffb7aaa530 <+144>: mtctr r8
0x00003fffb7aaa534 <+148>: bne cr7,0x3fffb7aaa560 <.__memset_power7+192>
0x00003fffb7aaa538 <+152>: std r4,0(r10)
0x00003fffb7aaa53c <+156>: std r4,8(r10)
0x00003fffb7aaa540 <+160>: addi r10,r10,16
0x00003fffb7aaa544 <+164>: bns cr7,0x3fffb7aaa570 <.__memset_power7+208>
0x00003fffb7aaa548 <+168>: std r4,0(r10)
0x00003fffb7aaa54c <+172>: addi r10,r10,8
0x00003fffb7aaa550 <+176>: mr r12,r10
0x00003fffb7aaa554 <+180>: blt cr1,0x3fffb7aaa5b0 <.__memset_power7+272>
0x00003fffb7aaa558 <+184>: b 0x3fffb7aaa570 <.__memset_power7+208>
0x00003fffb7aaa55c <+188>: ori r2,r2,0
0x00003fffb7aaa560 <+192>: bns cr7,0x3fffb7aaa570 <.__memset_power7+208>
0x00003fffb7aaa564 <+196>: std r4,0(r10)
0x00003fffb7aaa568 <+200>: addi r10,r10,8
0x00003fffb7aaa56c <+204>: ori r2,r2,0
0x00003fffb7aaa570 <+208>: addi r12,r10,32
0x00003fffb7aaa574 <+212>: std r4,0(r10)
0x00003fffb7aaa578 <+216>: std r4,8(r10)
0x00003fffb7aaa57c <+220>: std r4,16(r10)
0x00003fffb7aaa580 <+224>: std r4,24(r10)
0x00003fffb7aaa584 <+228>: bdz 0x3fffb7aaa5b0 <.__memset_power7+272>
Looks like an almost-nullptr bug. It tries to store a byte at the address in r10, which is 0xf:
0x00003fffb7aaa4e0 <+64>: stb r4,0(r10) # r10 == 0xf
r3 (the first function argument) is moved into r10 a few lines up, so it would seem Heap::MarkCompactPrologue() is calling RegExpResultsCache::Clear() with a bad FixedArray pointer. That's about all I can glean from it though; the root cause is probably elsewhere.
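The register dump can be cross-checked against the disassembly by replaying the first few instructions of the memset prologue. The sketch below is only an illustration of that arithmetic: it models the unaligned path for lengths above 31, the function name `memset_prologue` is hypothetical, and the length 0x800 is an inference from r12, which the dump shows being copied from r5 before r5 is adjusted.

```python
MASK64 = (1 << 64) - 1


def memset_prologue(dest, length):
    """Replay the register effects of offsets +8 to +56 of the dump.

    Assumption: length > 31, so neither early 'ble' branch is taken.
    """
    r10 = dest                    # mr     r10,r3
    r0 = (-dest) & MASK64         # neg    r0,r3
    r11 = r10 & 7                 # andi.  r11,r10,7
    r12 = length                  # mr     r12,r5
    r5 = length
    if r11 != 0:                  # 'beq' skips this block when dest is aligned
        r0 &= 7                   # clrldi r0,r0,61 keeps the low 3 bits
        r5 = (r5 - r0) & MASK64   # subf   r5,r0,r5
    return {"r0": r0, "r5": r5, "r10": r10, "r11": r11, "r12": r12}


if __name__ == "__main__":
    state = memset_prologue(0xF, 0x800)
    print({reg: hex(value) for reg, value in state.items()})
```

With a destination of 0xf and a length of 0x800 this yields r0 = 0x1, r5 = 0x7ff, r11 = 0x7 and r12 = 0x800, which agrees with the posted registers. That is consistent with the fault happening on the single alignment byte store to address 0xf.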
The compiler version on the test/build boxes is:
root@test-osuosl-ubuntu14-ppc64-be-3:~# gcc --version
gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The binaries seem to run ok on the machines on which they were built:
iojs@test-osuosl-ubuntu14-ppc64-be-3:~/build/mtest/node-v7.7.3-nightly20170309c62798034a-linux-ppc64/bin$ uname -a
Linux test-osuosl-ubuntu14-ppc64-be-3 4.2.0-27-powerpc64-smp #32~14.04.1-Ubuntu SMP Fri Jan 22 15:47:25 UTC 2016 ppc64 ppc64 ppc64 GNU/Linux
iojs@test-osuosl-ubuntu14-ppc64-be-3:~/build/mtest/node-v7.7.3-nightly20170309c62798034a-linux-ppc64/bin$ node -e 'console.log("hi")';
hi
iojs@test-osuosl-ubuntu14-ppc64-be-3:~/build/mtest/node-v7.7.3-nightly20170309c62798034a-linux-ppc64/bin$ node --version
v7.7.3-nightly20170309c62798034a
iojs@test-osuosl-ubuntu14-ppc64-be-3:~/build/mtest/node-v7.7.3-nightly20170309c62798034a-linux-ppc64/bin$ node -e 'console.log("hi")';
hi
iojs@test-osuosl-ubuntu14-ppc64-be-3:~/build/mtest/node-v7.7.3-nightly20170309c62798034a-linux-ppc64/bin$
Output just before the crash with LD_DEBUG=all:
23377: symbol=munmap; lookup in file=/lib64/power8/libc.so.6 [0]
23377: binding file ./node [0] to /lib64/power8/libc.so.6 [0]: normal symbol `munmap' [GLIBC_2.3]
23377: symbol=mprotect; lookup in file=./node [0]
23377: symbol=mprotect; lookup in file=/lib64/libdl.so.2 [0]
23377: symbol=mprotect; lookup in file=/lib64/power8/librt.so.1 [0]
23377: symbol=mprotect; lookup in file=/lib64/libstdc++.so.6 [0]
23377: symbol=mprotect; lookup in file=/lib64/power8/libm.so.6 [0]
23377: symbol=mprotect; lookup in file=/lib64/libgcc_s.so.1 [0]
23377: symbol=mprotect; lookup in file=/lib64/power8/libpthread.so.0 [0]
23377: symbol=mprotect; lookup in file=/lib64/power8/libc.so.6 [0]
23377: binding file ./node [0] to /lib64/power8/libc.so.6 [0]: normal symbol `mprotect' [GLIBC_2.3]
Segmentation fault (core dumped)
On RHEL machine:
-sh-4.2$ /lib64/power8/libc.so.6
GNU C Library (GNU libc) stable release version 2.17, by Roland McGrath et al.
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.3 20140911 (Red Hat 4.8.3-7).
Compiled on a Linux 3.10.0 system on 2015-01-19.
Available extensions:
The C stubs add-on version 2.1.2.
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
RT using linux kernel aio
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
On the community machine:
iojs@test-osuosl-ubuntu14-ppc64-be-3:~/build/mtest/node-v7.7.3-nightly20170309c62798034a-linux-ppc64/bin$ /lib64/libc.so.6
GNU C Library (Ubuntu EGLIBC 2.19-0ubuntu6.9) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
Compiled on a Linux 3.13.11 system on 2016-05-26.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/eglibc/+bugs>.
I wonder if it's the glibc version. I believe we had run the community binaries on our RHEL 7 machines in the past, but possibly node is now using something that is not compatible across glibc versions 2.17 and 2.19.
@sxa555 do you have an environment where you can install a newer glibc on RHEL 7 and see if that makes a difference ?
Updated title to indicate the crash is only on RHEL 7, as the binaries seem to run on Ubuntu 14 BE.
We're able to run v7.5.0 binaries, but not v7.6.0 and later:
-bash-4.2$ node-v7.5.0-linux-ppc64/bin/node
> .exit
-bash-4.2$ node-v7.6.0-linux-ppc64/bin/node
Segmentation fault (core dumped)
-bash-4.2$
The V8 5.4 -> 5.5 upgrade in 61870b4 seems like the most obvious culprit. Can you check if that commit fails and the preceding commit works?
For the 8.0.0 nightlies (from https://nodejs.org/download/nightly/):
-bash-4.2$ node-v8.0.0-nightly20170126a67a04d765-linux-ppc64/bin/node
> .exit
-bash-4.2$ node-v8.0.0-nightly20170127b19334e566-linux-ppc64/bin/node
Segmentation fault (core dumped)
-bash-4.2$
I guess this is also pointing to the V8 5.4->5.5 update:
-bash-4.2$ git log a67a04d765..b19334e566 --oneline
b19334e test: expand test coverage of fs.js
bee83e0 test: expand test coverage of events.js
e71c278 url: stop exporting originFor()
ad6e778 benchmark: add benchmark for object properties
084acc8 test: check noAssert option in buf.write*()
24ef1e6 string_decoder: align UTF-8 handling with V8
007386e repl: remove workaround for function redefinition
c2c6ae5 test: move test-vm-function-redefinition to parallel
b37f55a deps: limit regress/regress-crbug-514081 v8 test
91ab09f src: update NODE_MODULE_VERSION to 52
2739185 deps: update V8 to 5.5.372.40
-bash-4.2$
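The narrowing above is a bisection over an ordered list of builds. As a rough sketch of that process, assuming the oldest build is known good and the newest is known bad: the helper `first_bad` and the predicate passed to it are hypothetical, and in practice the predicate would run the downloaded binary and check for a segfault. Only the 20170126 and 20170127 dates come from this thread; the other entries are placeholders.

```python
def first_bad(builds, is_bad):
    """Return the (last good, first bad) pair from builds ordered oldest to newest.

    Assumption: builds[0] is good, builds[-1] is bad, and there is a single
    transition from good to bad.
    """
    lo, hi = 0, len(builds) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid   # the transition is at or before mid
        else:
            lo = mid   # the transition is after mid
    return builds[lo], builds[hi]


if __name__ == "__main__":
    # Placeholder dates around the two nightlies named in the thread.
    nightlies = ["20170124", "20170125", "20170126", "20170127", "20170128"]
    # Stand-in predicate; a real one would execute the binary.
    print(first_bad(nightlies, lambda build: build >= "20170127"))
```

The resulting pair of builds gives the commit range to inspect with git log.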
node works if we compile and run locally -- The failures are with the binaries from nodejs.org running locally.
The gcc version for the test-osuosl-ubuntu14-ppc64_be_1 machine (should be the same as the release machines):
test-osuosl-ubuntu14-ppc64-be-3:~$ gcc --version
gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
RHEL 7.2 suffers the same symptoms. It is supplied with gcc/g++ 4.8.5-4, which is even later than the Ubuntu 14.04 one. That suggests it's either a compiler bug specific to Ubuntu's gcc version (unlikely, I think, but not impossible) or, more likely, the new V8 is triggering something that uses functionality from a glibc later than 2.17 (RHEL7 has 2.17, Ubuntu 14.04 has 2.19).
Going the other way round, node7 binaries built on RHEL7 appear to run ok on Ubuntu 14.04 (possibly because they're built against an earlier glibc), so I wonder if a CentOS build machine (same as x64?) might be a better choice than Ubuntu 14.04. For the record, we have built with Ubuntu 14.04.1 on PPC-LE (note: the BE community machines are 14.04.5) and that appears to run ok on RHEL7.
(I've also tried building my own glibc 2.19 on RHEL7 but that didn't execute properly with anything on the system)
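One cheap way to probe the glibc hypothesis is to look at the symbol version tags in the LD_DEBUG binding lines posted earlier. This is a minimal sketch, assuming lines shaped like that output; the function name `max_glibc_required` is hypothetical. A result at or below 2.17 would only show that no newer versioned symbol was bound in that trace, and would not rule out behavioural differences between the two glibc builds.

```python
import re

# Matches e.g.: normal symbol `munmap' [GLIBC_2.3]
BINDING_RE = re.compile(r"normal symbol `([^']+)' \[GLIBC_(\d+(?:\.\d+)*)\]")


def max_glibc_required(ld_debug_lines):
    """Return (version_tuple, symbol) for the highest GLIBC_x.y tag seen,
    or None if no binding line carries one."""
    best = None
    for line in ld_debug_lines:
        match = BINDING_RE.search(line)
        if match is None:
            continue
        version = tuple(int(part) for part in match.group(2).split("."))
        if best is None or version > best[0]:
            best = (version, match.group(1))
    return best


if __name__ == "__main__":
    sample = [
        "23377: binding file ./node [0] to /lib64/power8/libc.so.6 [0]: normal symbol `munmap' [GLIBC_2.3]",
        "23377: symbol=mprotect; lookup in file=./node [0]",
        "23377: binding file ./node [0] to /lib64/power8/libc.so.6 [0]: normal symbol `mprotect' [GLIBC_2.3]",
    ]
    result = max_glibc_required(sample)
    print(result)
    # Versions are compared as integer tuples, so (2, 3) sorts below (2, 17).
    print(result is not None and result[0] > (2, 17))
```

Comparing versions as integer tuples avoids the string-comparison trap where "2.3" would sort above "2.17".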
I've got my own clean Ubuntu 14.04 now that I can experiment with, and it replicates the (lack of) problem on that platform. That at least confirms it's not anything magic on the CI machines that's making it work ;-)
Have tried building my own gcc/g++ (version 4.8.5) and that still causes a crash in the same place when run on RHEL7, so whatever we're seeing isn't an issue specific to Ubuntu's compiler.
Updating glibc on the RHEL7 box is "non-trivial" so I can't really recommend such a course of action (needs the dynamic loader and other stuff updated)
It's the Clear function at the end of https://github.com/nodejs/node/blob/v7.x/deps/v8/src/regexp/jsregexp.cc that's causing the memset call, which is invoked from line 1472 of https://github.com/nodejs/node/blob/v7.x/deps/v8/src/heap/heap.cc.
For reference: I did also try using the headers from glibc 2.17 (the RHEL version) on the ubuntu 14.04 build system but that still seemed to cause a crash in the same place.
@bnoordhuis quick ping in case you have any more suggestions, otherwise it looks like there's not much we can do here.
No other suggestions. I'll go ahead and close it out.
An interesting link that tracks abi compatibility between versions: https://abi-laboratory.pro/tracker/timeline/glibc/
incompatible changes between 2.17 and 2.18 are related to
2.19 shows as 100% compatible with 2.17
It's not obvious from the description of the crash that it is related to either of those two things.
Running node --version works on this platform and version, but running npm --version causes a segmentation fault and core dump.
cc/ @gibfahn who helped with initial diagnosis