Closed mhdawson closed 8 years ago
Here is the diff between build 194 (0 failures) and 195 (3 failures) https://github.com/nodejs/node/compare/112b9cdad78da77170436fb68b260f99772793ac...894203dae39c7f1f36fc6ba126bb5d782d79b744
re-run to see if they are consistent https://ci.nodejs.org/job/node-test-commit-aix/
Seem to fail consistently as failed in re-run
Seems to be related to malloc(0) behaviour. By default AIX fails malloc(0) and we've told it to have the linux malloc(0) behaviour. Not sure how but it looks like this change has somehow affected that.
Believe the failure in test-crypto-random is due to case where it does crypto.randomBytes(0) as this seems to fail with an out of memory.
@ofrobots does the inspector do anything to hook or intercept malloc ?
simplest test case to recreated:
-bash-4.3$ cat test.js var crypto = require('crypto'); crypto.randomBytes(0);
Compiled with _ALL_SOURCE
and _LINUX_SOURCE_COMPAT
test still fail.
https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/197/
@mhdawson I am not aware of any interception of malloc in v8_inspector, so this is curious. Let me know what you find out.
iWuzHere is also having some trouble compiling on our local machines (although builds ok at least on community CI) so this should be a fun one.
LMK how to help.
@ofrobots thanks for the offer, I think we need to gather some additional info about the compile failure (may be the compiler crashing) and I'm also trying a build with --without-inspector to see if that resolves the test failures. Once we have some more info I'll reach out.
Run here https://ci.nodejs.org/job/node-test-commit-aix-mdawson/9/ against PR that fixes --without-inspector to see if we still see failures if option is disabled, may help to focus on where to look if we don't see failures in that mode.
Tests seem to have passed when compiled with --without-inspector so we need to look at the diffs between that being enabled/not
Does something like this:
[ 'OS=="aix"', {
'include_dirs': [ 'config/aix' ],
'sources': [ 'config/aix/ares_config.h' ],
'defines': [
# Support for malloc(0)
'_LINUX_SOURCE_COMPAT=1',
'_ALL_SOURCE=1'],
}],
Need to be added to inspector's gyp?
Does running with inspector cause issues on any other platform like windows or solaris?
@iWuzHere If AIX requires all C/C++ code to be compiled with those flags you mentioned, that maybe that's the problem. It is worth trying it out.
Does running with inspector cause issues on any other platform like windows or solaris?
I haven't seen problems with this on Windows and on SmartOS, and neither has the CI. This does seem like something specific to AIX.
/fyi @eugeneo.
@iWuzHere could you try with this: https://github.com/eugeneo/node/commit/bf4cbfd64bde96fdf4ca9f33d65199d72f5b315b on an AIX machine?
launched run here:
Also trying to pare down where the issue might be I did a run with this change:
diff --git a/node.gyp b/node.gyp index 05a5530..11f696a 100644 --- a/node.gyp +++ b/node.gyp @@ -255,15 +255,6 @@ 'HAVE_INSPECTOR=1', 'V8_INSPECTOR_USE_STL=1', ], - 'sources': [ - 'src/inspector_agent.cc', - 'src/inspector_socket.cc', - 'src/inspector_socket.h', - 'src/inspector-agent.h', - ], - 'dependencies': [ - 'deps/v8_inspector/v8_inspector.gyp:v8_inspector', - ], 'include_dirs': [ 'deps/v8_inspector', 'deps/v8_inspector/deps/wtf', # temporary
And the node exe built with that seems to pass the minimal test case without OOM
So will be interested in run above, as it would seem to indicate that the other changes in node.gyp are not related.
Looked at the differences in the compile line between the good/bad runs for node_crypto.cc but don't see anything obvious. The differences are those expected based on the changes to the gyp file and -D_LINUX_SOURCE_COMPAT is there in both cases.
You dont just need -D_LINUX_SOURCE_COMPAT
but also -D_ALL_SOURCE
to get linux malloc behaviour as described in the stdlib.h file on AIX
#if defined(_ALL_SOURCE) && defined(_LINUX_SOURCE_COMPAT)
#ifdef _NO_PROTO
extern void *__linux_malloc();
extern void *__linux_realloc();
extern void *__linux_calloc();
extern void *__linux_valloc();
#else
extern void *__linux_malloc(size_t);
extern void *__linux_realloc(void *, size_t);
extern void *__linux_calloc(size_t, size_t);
extern void *__linux_valloc(size_t);
#endif
#define malloc __linux_malloc
#define calloc __linux_calloc
#define realloc __linux_realloc
#define valloc __linux_valloc
#endif /* _LINUX_SOURCE_COMPAT */
I have already done a run with that change included in node.gyp but the tests still failed, hence the suggestion to try and add it to the inspectors gyp
@iWuzHere Can you try this on AIX - https://github.com/eugeneo/node/commit/de97daf98de030fc697acd8bed627293caf32370 ?
@eugeneo I have already done it locally and waiting for the compile to finish :smile:
Still fail with changes to inspector.gyp
Alright so did quite a bit of investigation ( still not sure why this is happening ).
Compiled with -D_LINUX_SOURCE_COMPAT
and -D_ALL_SOURCE
added to node and inspector gyp
Test fails
imrani@dal-pacha /tmp/imranbuild/node ./out/Debug/node test
FATAL ERROR: node::RandomBytesRequest() Out of Memory
IOT/Abort trap (core dumped)
imrani@dal-pacha /tmp/imranbuild/node dump -Tv -Xany ./out/Debug/node | grep malloc
[58] 0x00000000 undef IMP DS EXTref libc.a(shr.o) malloc
[92] 0x00000000 undef IMP DS EXTref libc.a(shr.o) __linux_malloc
[260] 0x00000000 undef IMP RW EXTref libc.a(shr.o) __malloc_user_defined_name
[3967] 0x200658f0 .data EXP RW Ldef [noIMid] ares_malloc
[72217] 0x20132810 .data EXP DS Ldef [noIMid] uprv_malloc_57
[92580] 0x2016deb0 .data EXP DS Ldef [noIMid] CRYPTO_malloc_locked
[92582] 0x2016dec8 .data EXP DS Ldef [noIMid] CRYPTO_malloc
[92587] 0x2016df04 .data EXP DS Ldef [noIMid] CRYPTO_remalloc
[92597] 0x2016df7c .data EXP DS Ldef [noIMid] CRYPTO_dbg_malloc
[94250] 0x20172d70 .data EXP DS Ldef [noIMid] ares_malloc_data
[94342] 0x201731e4 .data EXP DS Ldef [noIMid] uv__malloc
Test passes
imrani@dal-pacha /tmp/imranbuild/node ./out/Debug/node test
imrani@dal-pacha /tmp/imranbuild/node dump -Tv -Xany ./out/Debug/node | grep malloc
[58] 0x00000000 undef IMP DS EXTref libc.a(shr.o) malloc
[91] 0x00000000 undef IMP DS EXTref libc.a(shr.o) __linux_malloc
[259] 0x00000000 undef IMP RW EXTref libc.a(shr.o) __malloc_user_defined_name
[3528] 0x20064080 .data EXP RW Ldef [noIMid] ares_malloc
[70891] 0x2012e748 .data EXP DS Ldef [noIMid] uprv_malloc_57
[75819] 0x2013cc5c .data EXP DS Ldef [noIMid] CRYPTO_malloc_locked
[75821] 0x2013cc74 .data EXP DS Ldef [noIMid] CRYPTO_malloc
[75826] 0x2013ccb0 .data EXP DS Ldef [noIMid] CRYPTO_remalloc
[75836] 0x2013cd28 .data EXP DS Ldef [noIMid] CRYPTO_dbg_malloc
[77489] 0x20141b1c .data EXP DS Ldef [noIMid] ares_malloc_data
[77581] 0x20141f90 .data EXP DS Ldef [noIMid] uv__malloc
Decided to step through code to see what was going on
Even though I break on malloc
I never hit it RandomBytesRequest
and the program fails as data_
is null
(gdb) break malloc
Breakpoint 2 at 0xd012a17c
(gdb) c
Continuing.
FATAL ERROR: node::RandomBytesRequest() Out of Memory
Program received signal SIGABRT, Aborted.
0xd050adf4 in pthread_kill () from /usr/lib/libpthreads.a(shr_xpg5.o)
(gdb) where
#0 0xd050adf4 in pthread_kill () from /usr/lib/libpthreads.a(shr_xpg5.o)
#1 0xd050a24c in _p_raise () from /usr/lib/libpthreads.a(shr_xpg5.o)
#2 0xd0137284 in raise () from /usr/lib/libc.a(shr.o)
#3 0xd01c0d68 in abort () from /usr/lib/libc.a(shr.o)
#4 0x10019e68 in node::OnFatalError (location=0x1233d0d8 "node::RandomBytesRequest()", message=0x1233d014 "Out of Memory") at ../src/node.cc:2395
#5 0x10019ecc in node::FatalError (location=0x1233d0d8 "node::RandomBytesRequest()", message=0x1233d014 "Out of Memory") at ../src/node.cc:2400
#6 0x10c09f6c in node::crypto::RandomBytesRequest::RandomBytesRequest (this=0x30a14298, env=0x309f9198, object=..., size=0) at ../src/node_crypto.cc:5380
#7 0x10c0ad24 in node::crypto::RandomBytes (args=0x2ff22044) at ../src/node_crypto.cc:5500
#8 0x115e97b8 in v8::internal::FunctionCallbackArguments::Call (this=0x2ff22178, f=<incomplete type>) at ../deps/v8/src/arguments.cc:33
#9 0x104274f4 in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=0x309cf228, args=...) at ../deps/v8/src/builtins.cc:3915
#10 0x10423db8 in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=0x309cf228) at ../deps/v8/src/builtins.cc:3939
#11 0x10423cc0 in v8::internal::Builtin_HandleApiCall (args_length=3, args_object=0x2ff22328, isolate=0x309cf228) at ../deps/v8/src/builtins.cc:3936
#12 0xef70c74c in ?? ()
Breakpoint 1, node::crypto::RandomBytesRequest::RandomBytesRequest (this=0x30a14298, env=0x309f9198, object=..., size=0) at ../src/node_crypto.cc:5378
5378 data_ = static_cast<char*>(malloc(size));
(gdb) print malloc
$1 = {<text variable, no debug info>} 0xd012a160 <malloc>
(gdb) s
5379 if (data() == nullptr)
(gdb) print datat_
^CNo symbol "datat_" in current context.
(gdb) print data_
$2 = (<invalid type code 8> *) 0x0
(gdb) print malloc
$3 = {<text variable, no debug info>} 0xd012a160 <malloc>
(gdb)
My breakpoint on malloc
succeeds and the program functions without a hitch
Breakpoint 2, 0xd012a17c in malloc () from /usr/lib/libc.a(shr.o)
(gdb) where
#0 0xd012a17c in malloc () from /usr/lib/libc.a(shr.o)
#1 0xd031d9dc in __linux_malloc () from /usr/lib/libc.a(shr.o)
#2 0x10e13b6c in node::crypto::RandomBytesRequest::RandomBytesRequest (this=0x309e1c18, env=0x309c6118, object=..., size=0) at ../src/node_crypto.cc:5377
#3 0x10e14968 in node::crypto::RandomBytes (args=0x2ff22064) at ../src/node_crypto.cc:5499
#4 0x115ed608 in v8::internal::FunctionCallbackArguments::Call (this=0x2ff22198, f=<incomplete type>) at ../deps/v8/src/arguments.cc:33
#5 0x10248d6c in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=0x3099c118, args=...) at ../deps/v8/src/builtins.cc:3915
#6 0x10245630 in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=0x3099c118) at ../deps/v8/src/builtins.cc:3939
#7 0x10245538 in v8::internal::Builtin_HandleApiCall (args_length=3, args_object=0x2ff22348, isolate=0x3099c118) at ../deps/v8/src/builtins.cc:3936
#8 0xef70c74c in ?? ()
Breakpoint 1, node::crypto::RandomBytesRequest::RandomBytesRequest (this=0x309e0d68, env=0x309c60b8, object=..., size=0) at ../src/node_crypto.cc:5378
5378 data_ = static_cast<char*>(malloc(size));
(gdb) s
5379 if (data() == nullptr)
(gdb) print data_
$1 = (<invalid type code 8> *) 0x309db7d8 ""
(gdb)
Note we have a valid pointer in this case
RandomBytesRequest
I decided to check the assembly to see what was going on in both cases we are branching at +108 offset.
Breakpoint 1, node::crypto::RandomBytesRequest::RandomBytesRequest (this=0x30a14298, env=0x309f9198, object=..., size=0) at ../src/node_crypto.cc:5378
5378 data_ = static_cast<char*>(malloc(size));
(gdb) disass $pc
Dump of assembler code for function _ZN4node6crypto18RandomBytesRequestC1EPNS_11EnvironmentEN2v85LocalINS4_6ObjectEEEm:
0x10c09eb4 <+0>: mflr r0
0x10c09eb8 <+4>: stw r0,8(r1)
0x10c09ebc <+8>: stw r31,-4(r1)
0x10c09ec0 <+12>: stwu r1,-64(r1)
0x10c09ec4 <+16>: mr r31,r1
0x10c09ec8 <+20>: stw r3,88(r31)
0x10c09ecc <+24>: stw r4,92(r31)
0x10c09ed0 <+28>: stw r5,96(r31)
0x10c09ed4 <+32>: stw r6,100(r31)
0x10c09ed8 <+36>: lwz r9,88(r31)
0x10c09edc <+40>: mr r3,r9
0x10c09ee0 <+44>: lwz r4,92(r31)
0x10c09ee4 <+48>: lwz r5,96(r31)
0x10c09ee8 <+52>: li r6,1
0x10c09eec <+56>: li r7,0
0x10c09ef0 <+60>: bl 0x10054a48 <_ZN4node9AsyncWrapC2EPNS_11EnvironmentEN2v85LocalINS3_6ObjectEEENS0_12ProviderTypeEPS0_>
0x10c09ef4 <+64>: nop
0x10c09ef8 <+68>: lwz r9,88(r31)
0x10c09efc <+72>: b 0x10c2810c <@FIX30+5764>
0x10c09f00 <+76>: stw r10,0(r9)
0x10c09f04 <+80>: lwz r9,88(r31)
0x10c09f08 <+84>: li r10,0
0x10c09f0c <+88>: stw r10,88(r9)
0x10c09f10 <+92>: lwz r9,88(r31)
0x10c09f14 <+96>: lwz r10,100(r31)
0x10c09f18 <+100>: stw r10,92(r9)
=> 0x10c09f1c <+104>: lwz r3,100(r31)
0x10c09f20 <+108>: bl 0x1006fa84
0x10c09f24 <+112>: lwz r2,20(r1)
0x10c09f28 <+116>: mr r9,r3
(gdb) disass 0x1006fa84
Dump of assembler code for function malloc:
0x1006fa84: lwz r12,636(r2)
0x1006fa88: stw r2,20(r1)
0x1006fa8c: lwz r0,0(r12)
0x1006fa90: lwz r2,4(r12)
0x1006fa94: mtctr r0
0x1006fa98: bctr
0x1006fa9c: .long 0x0
0x1006faa0: .long 0xca000
0x1006faa4: .long 0x0
0x1006faa8: .long 0x18
End of assembler dump.
(gdb)
Breakpoint 1, node::crypto::RandomBytesRequest::RandomBytesRequest (this=0x30a14298, env=0x309f9198, object=..., size=0) at ../src/node_crypto.cc:5378
5378 data_ = static_cast<char*>(malloc(size));
(gdb) disass $pc
Dump of assembler code for function _ZN4node6crypto18RandomBytesRequestC1EPNS_11EnvironmentEN2v85LocalINS4_6ObjectEEEm:
0x10e13afc <+0>: mflr r0
0x10e13b00 <+4>: stw r0,8(r1)
0x10e13b04 <+8>: stw r31,-4(r1)
0x10e13b08 <+12>: stwu r1,-64(r1)
0x10e13b0c <+16>: mr r31,r1
0x10e13b10 <+20>: stw r3,88(r31)
0x10e13b14 <+24>: stw r4,92(r31)
0x10e13b18 <+28>: stw r5,96(r31)
0x10e13b1c <+32>: stw r6,100(r31)
0x10e13b20 <+36>: lwz r9,88(r31)
0x10e13b24 <+40>: mr r3,r9
0x10e13b28 <+44>: lwz r4,92(r31)
0x10e13b2c <+48>: lwz r5,96(r31)
0x10e13b30 <+52>: li r6,1
0x10e13b34 <+56>: li r7,0
0x10e13b38 <+60>: bl 0x100542fc <_ZN4node9AsyncWrapC2EPNS_11EnvironmentEN2v85LocalINS3_6ObjectEEENS0_12ProviderTypeEPS0_>
0x10e13b3c <+64>: nop
0x10e13b40 <+68>: lwz r9,88(r31)
0x10e13b44 <+72>: b 0x10e32074 <@FIX31+6580>
0x10e13b48 <+76>: stw r10,0(r9)
0x10e13b4c <+80>: lwz r9,88(r31)
0x10e13b50 <+84>: li r10,0
0x10e13b54 <+88>: stw r10,88(r9)
0x10e13b58 <+92>: lwz r9,88(r31)
0x10e13b5c <+96>: lwz r10,100(r31)
0x10e13b60 <+100>: stw r10,92(r9)
=> 0x10e13b64 <+104>: lwz r3,100(r31)
0x10e13b68 <+108>: bl 0x107e476c
0x10e13b6c <+112>: lwz r2,20(r1)
0x10e13b70 <+116>: mr r10,r3
(gdb) disass 0x107e476c
Dump of assembler code for function __linux_malloc:
0x107e476c: lwz r12,-9232(r2)
0x107e4770: stw r2,20(r1)
0x107e4774: lwz r0,0(r12)
0x107e4778: lwz r2,4(r12)
0x107e477c: mtctr r0
0x107e4780: bctr
0x107e4784: .long 0x0
0x107e4788: .long 0xca000
0x107e478c: .long 0x0
0x107e4790: .long 0x18
End of assembler dump.
(gdb)
So for some reason when we are compiling and linking against the inspector we end calling the "bad" malloc malloc
instead of the "good" malloc __linux_malloc
.
I'm not too sure why this is happening as the only thing I am doing is
./configure
gmake -C out BUILDTYPE=Debug
//gdb stuff here
./configure --without-inspector
gmake -C out BUILDTYPE=Debug
//repeat gdb stuff here
Not even running gmake clean
inbetween.
In both cases we have a reference to __linux_malloc
in our binary, yet when we have the inspector we are not calling it
Still need to figure out why this is the case. Though I have definitely proven that with the inspector __linux_malloc
is never being called
So here's an objdump of node_crypto.o compile with and without the inspector
imrani@dal-pacha /tmp/imranbuild/node objdump -t ./out/Debug/obj.target/node_base/src/node_crypto.o | grep malloc
[732](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .__linux_malloc
[2699](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 uv_malloc_func:t1387=1388=*1389=f86
[3617](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 dyn_MEM_malloc_cb:t2519=1388
[3620](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 st_dynamic_MEM_fns:T2522=s12malloc_cb:2519,0,32;realloc_cb:2520,32,32;free_cb:2521,64,32;;
imrani@dal-pacha /tmp/imranbuild/node objdump -r ./out/Debug/obj.target/node_base/src/node_crypto.o | grep malloc
00015608 R_RBR_26 .__linux_malloc
0001dd90 R_RBR_26 .__linux_malloc
0001e054 R_RBR_26 .__linux_malloc
0001e27c R_RBR_26 .__linux_malloc
0001ed14 R_RBR_26 .__linux_malloc
0001fe88 R_RBR_26 .__linux_malloc
0001ff3c R_RBR_26 .__linux_malloc
00020688 R_RBR_26 .__linux_malloc
00021e00 R_RBR_26 .__linux_malloc
000270fc R_RBR_26 .__linux_malloc
0002d0e8 R_RBR_26 .__linux_malloc
0003aeb0 R_RBR_26 .__linux_malloc
imrani@dal-pacha /tmp/imranbuild/node objdump -t ./out/Debug/obj.target/node_base/src/node_crypto.o | grep malloc
[738](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .malloc
[950](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .__linux_malloc
[2707](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 uv_malloc_func:t1387=1388=*1389=f86
[4180](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 dyn_MEM_malloc_cb:t2889=1388
[4183](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 st_dynamic_MEM_fns:T2892=s12malloc_cb:2889,0,32;realloc_cb:2890,32,32;free_cb:2891,64,32;;
imrani@dal-pacha /tmp/imranbuild/node objdump -r ./out/Debug/obj.target/node_base/src/node_crypto.o | grep malloc
000156f0 R_RBR_26 .malloc
0001de7c R_RBR_26 .malloc
0001e144 R_RBR_26 .malloc
0001e370 R_RBR_26 .malloc
0001ee0c R_RBR_26 .malloc
0001ff84 R_RBR_26 .malloc
0002003c R_RBR_26 .malloc
0002078c R_RBR_26 .malloc
00021f08 R_RBR_26 .malloc
00027208 R_RBR_26 .__linux_malloc
0002d1f4 R_RBR_26 .malloc
0003afc0 R_RBR_26 .malloc
So the issue seems to be the -DHAVE_INSPECTOR
flag. The only difference between these two compiles is in one we have -DHAVE_INSPECTOR=1
and the other -DHAVE_INSPECTOR=0
imrani@dal-pacha /tmp/imranbuild/node/out /usr/bin/g++ '-DNODE_ARCH="ppc"' '-DNODE_PLATFORM="aix"' '-DNODE_WANT_INTERNALS=1' '-DV8_DEPRECATION_WARNINGS=1' '-DNODE_HAVE_I18N_SUPPORT=1' '-DNODE_HAVE_SMALL_ICU=1' '-DHAVE_INSPECTOR=1' '-DHAVE_OPENSSL=1' '-D__POSIX__' '-D_LINUX_SOURCE_COMPAT=1' '-D_ALL_SOURCE=1' '-DUCONFIG_NO_TRANSLITERATION=1' '-DUCONFIG_NO_SERVICE=1' '-DUCONFIG_NO_REGULAR_EXPRESSIONS=1' '-DU_ENABLE_DYLOAD=0' '-DU_STATIC_IMPLEMENTATION=1' '-DU_HAVE_STD_STRING=0' '-DUCONFIG_NO_BREAK_ITERATION=0' '-DUCONFIG_NO_CONVERSION=1' '-DHTTP_PARSER_STRICT=0' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' '-DDEBUG' '-D_DEBUG' -I../src -I../tools/msvs/genfiles -I../deps/uv/src/ares -I/tmp/imranbuild/node/out/Debug/obj/gen -I../deps/v8 -I../deps/v8/include -I../deps/icu-small/source/i18n -I../deps/icu-small/source/common -I../deps/openssl/openssl/include -I../deps/zlib -I../deps/http_parser -I../deps/cares/include -I../deps/uv/include -pthread -Wall -Wextra -Wno-unused-parameter -g -O0 -gxcoff -fno-rtti -fno-exceptions -std=gnu++0x -MMD -MF /tmp/imranbuild/node/out/Debug/.deps//tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o.d.raw -c -o /tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o ../src/node_crypto.cc
imrani@dal-pacha /tmp/imranbuild/node/out objdump -tr ./Debug/obj.target/node_base/src/node_crypto.o | grep malloc
[738](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .malloc
[950](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .__linux_malloc
[2707](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 uv_malloc_func:t1387=1388=*1389=f86
[4180](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 dyn_MEM_malloc_cb:t2889=1388
[4183](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 st_dynamic_MEM_fns:T2892=s12malloc_cb:2889,0,32;realloc_cb:2890,32,32;free_cb:2891,64,32;;
000156f0 R_RBR_26 .malloc
0001de7c R_RBR_26 .malloc
0001e144 R_RBR_26 .malloc
0001e370 R_RBR_26 .malloc
0001ee0c R_RBR_26 .malloc
0001ff84 R_RBR_26 .malloc
0002003c R_RBR_26 .malloc
0002078c R_RBR_26 .malloc
00021f08 R_RBR_26 .malloc
00027208 R_RBR_26 .__linux_malloc
0002d1f4 R_RBR_26 .malloc
0003afc0 R_RBR_26 .malloc
imrani@dal-pacha /tmp/imranbuild/node/out /usr/bin/g++ '-DNODE_ARCH="ppc"' '-DNODE_PLATFORM="aix"' '-DNODE_WANT_INTERNALS=1' '-DV8_DEPRECATION_WARNINGS=1' '-DNODE_HAVE_I18N_SUPPORT=1' '-DNODE_HAVE_SMALL_ICU=1' '-DHAVE_INSPECTOR=0' '-DHAVE_OPENSSL=1' '-D__POSIX__' '-D_LINUX_SOURCE_COMPAT=1' '-D_ALL_SOURCE=1' '-DUCONFIG_NO_TRANSLITERATION=1' '-DUCONFIG_NO_SERVICE=1' '-DUCONFIG_NO_REGULAR_EXPRESSIONS=1' '-DU_ENABLE_DYLOAD=0' '-DU_STATIC_IMPLEMENTATION=1' '-DU_HAVE_STD_STRING=0' '-DUCONFIG_NO_BREAK_ITERATION=0' '-DUCONFIG_NO_CONVERSION=1' '-DHTTP_PARSER_STRICT=0' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' '-DDEBUG' '-D_DEBUG' -I../src -I../tools/msvs/genfiles -I../deps/uv/src/ares -I/tmp/imranbuild/node/out/Debug/obj/gen -I../deps/v8 -I../deps/v8/include -I../deps/icu-small/source/i18n -I../deps/icu-small/source/common -I../deps/openssl/openssl/include -I../deps/zlib -I../deps/http_parser -I../deps/cares/include -I../deps/uv/include -pthread -Wall -Wextra -Wno-unused-parameter -g -O0 -gxcoff -fno-rtti -fno-exceptions -std=gnu++0x -MMD -MF /tmp/imranbuild/node/out/Debug/.deps//tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o.d.raw -c -o /tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o ../src/node_crypto.cc
imrani@dal-pacha /tmp/imranbuild/node/out objdump -tr ./Debug/obj.target/node_base/src/node_crypto.o | grep malloc[732](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .__linux_malloc
[2699](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 uv_malloc_func:t1387=1388=*1389=f86
[3617](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 dyn_MEM_malloc_cb:t2519=1388
[3620](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 st_dynamic_MEM_fns:T2522=s12malloc_cb:2519,0,32;realloc_cb:2520,32,32;free_cb:2521,64,32;;
00015608 R_RBR_26 .__linux_malloc
0001dd90 R_RBR_26 .__linux_malloc
0001e054 R_RBR_26 .__linux_malloc
0001e27c R_RBR_26 .__linux_malloc
0001ed14 R_RBR_26 .__linux_malloc
0001fe88 R_RBR_26 .__linux_malloc
0001ff3c R_RBR_26 .__linux_malloc
00020688 R_RBR_26 .__linux_malloc
00021e00 R_RBR_26 .__linux_malloc
000270fc R_RBR_26 .__linux_malloc
0002d0e8 R_RBR_26 .__linux_malloc
0003aeb0 R_RBR_26 .__linux_malloc
Removing the following lines and compiling again with -DHAVE_INSPECTOR=1
causes the problem to disappear
diff --git a/src/env-inl.h b/src/env-inl.h
index 97e1ba8..34f9bf7 100644
--- a/src/env-inl.h
+++ b/src/env-inl.h
@@ -225,9 +225,6 @@ inline Environment::Environment(v8::Local<v8::Context> context,
makecallback_cntr_(0),
async_wrap_uid_(0),
debugger_agent_(this),
-#if HAVE_INSPECTOR
- inspector_agent_(this),
-#endif
http_parser_buffer_(nullptr),
context_(context->GetIsolate(), context) {
// We'll be creating new objects so make sure we've entered the context.
diff --git a/src/env.h b/src/env.h
index 4c310c8..b2f0b27 100644
--- a/src/env.h
+++ b/src/env.h
@@ -5,9 +5,6 @@
#include "ares.h"
#include "debug-agent.h"
-#if HAVE_INSPECTOR
-#include "inspector_agent.h"
-#endif
#include "handle_wrap.h"
#include "req-wrap.h"
#include "tree.h"
@@ -552,11 +549,6 @@ class Environment {
return &debugger_agent_;
}
-#if HAVE_INSPECTOR
- inline inspector::Agent* inspector_agent() {
- return &inspector_agent_;
- }
-#endif
typedef ListHead<HandleWrap, &HandleWrap::handle_wrap_queue_> HandleWrapQueue;
typedef ListHead<ReqWrap<uv_req_t>, &ReqWrap<uv_req_t>::req_wrap_queue_>
@@ -595,9 +587,6 @@ class Environment {
size_t makecallback_cntr_;
int64_t async_wrap_uid_;
debugger::Agent debugger_agent_;
-#if HAVE_INSPECTOR
- inspector::Agent inspector_agent_;
-#endif
HandleWrapQueue handle_wrap_queue_;
ReqWrapQueue req_wrap_queue_;
imrani@dal-pacha /tmp/imranbuild/node/out vi ../src/env.h
imrani@dal-pacha /tmp/imranbuild/node/out /usr/bin/g++ '-DNODE_ARCH="ppc"' '-DNODE_PLATFORM="aix"' '-DNODE_WANT_INTERNALS=1' '-DV8_DEPRECATION_WARNINGS=1' '-DNODE_HAVE_I18N_SUPPORT=1' '-DNODE_HAVE_SMALL_ICU=1' '-DHAVE_INSPECTOR=1' '-DHAVE_OPENSSL=1' '-D__POSIX__' '-D_LINUX_SOURCE_COMPAT=1' '-D_ALL_SOURCE=1' '-DUCONFIG_NO_TRANSLITERATION=1' '-DUCONFIG_NO_SERVICE=1' '-DUCONFIG_NO_REGULAR_EXPRESSIONS=1' '-DU_ENABLE_DYLOAD=0' '-DU_STATIC_IMPLEMENTATION=1' '-DU_HAVE_STD_STRING=0' '-DUCONFIG_NO_BREAK_ITERATION=0' '-DUCONFIG_NO_CONVERSION=1' '-DHTTP_PARSER_STRICT=0' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' '-DDEBUG' '-D_DEBUG' -I../src -I../tools/msvs/genfiles -I../deps/uv/src/ares -I/tmp/imranbuild/node/out/Debug/obj/gen -I../deps/v8 -I../deps/v8/include -I../deps/icu-small/source/i18n -I../deps/icu-small/source/common -I../deps/openssl/openssl/include -I../deps/zlib -I../deps/http_parser -I../deps/cares/include -I../deps/uv/include -pthread -Wall -Wextra -Wno-unused-parameter -g -O0 -gxcoff -fno-rtti -fno-exceptions -std=gnu++0x -MMD -MF /tmp/imranbuild/node/out/Debug/.deps//tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o.d.raw -c -o /tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o ../src/node_crypto.cc
In file included from ../src/base-object-inl.h:8:0,
from ../src/async-wrap-inl.h:8,
from ../src/node_crypto.h:14,
from ../src/node_crypto.cc:3:
../src/env-inl.h: In constructor 'node::Environment::Environment(v8::Local<v8::Context>, uv_loop_t*)':
../src/env-inl.h:229:7: error: class 'node::Environment' does not have any field named 'inspector_agent_'
inspector_agent_(this),
^
imrani@dal-pacha /tmp/imranbuild/node/out vi ../src/env-inl.h
imrani@dal-pacha /tmp/imranbuild/node/out /usr/bin/g++ '-DNODE_ARCH="ppc"' '-DNODE_PLATFORM="aix"' '-DNODE_WANT_INTERNALS=1' '-DV8_DEPRECATION_WARNINGS=1' '-DNODE_HAVE_I18N_SUPPORT=1' '-DNODE_HAVE_SMALL_ICU=1' '-DHAVE_INSPECTOR=1' '-DHAVE_OPENSSL=1' '-D__POSIX__' '-D_LINUX_SOURCE_COMPAT=1' '-D_ALL_SOURCE=1' '-DUCONFIG_NO_TRANSLITERATION=1' '-DUCONFIG_NO_SERVICE=1' '-DUCONFIG_NO_REGULAR_EXPRESSIONS=1' '-DU_ENABLE_DYLOAD=0' '-DU_STATIC_IMPLEMENTATION=1' '-DU_HAVE_STD_STRING=0' '-DUCONFIG_NO_BREAK_ITERATION=0' '-DUCONFIG_NO_CONVERSION=1' '-DHTTP_PARSER_STRICT=0' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' '-DDEBUG' '-D_DEBUG' -I../src -I../tools/msvs/genfiles -I../deps/uv/src/ares -I/tmp/imranbuild/node/out/Debug/obj/gen -I../deps/v8 -I../deps/v8/include -I../deps/icu-small/source/i18n -I../deps/icu-small/source/common -I../deps/openssl/openssl/include -I../deps/zlib -I../deps/http_parser -I../deps/cares/include -I../deps/uv/include -pthread -Wall -Wextra -Wno-unused-parameter -g -O0 -gxcoff -fno-rtti -fno-exceptions -std=gnu++0x -MMD -MF /tmp/imranbuild/node/out/Debug/.deps//tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o.d.raw -c -o /tmp/imranbuild/node/out/Debug/obj.target/node_base/src/node_crypto.o ../src/node_crypto.cc
imrani@dal-pacha /tmp/imranbuild/node/out objdump -tr ./Debug/obj.target/node_base/src/node_crypto.o | grep malloc
[732](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 1) 0x00000000 .__linux_malloc
[2699](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 uv_malloc_func:t1387=1388=*1389=f86
[3617](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 dyn_MEM_malloc_cb:t2519=1388
[3620](sec -2)(fl 0x00)(ty 0)(scl 140) (nx 0) 0x00000000 st_dynamic_MEM_fns:T2522=s12malloc_cb:2519,0,32;realloc_cb:2520,32,32;free_cb:2521,64,32;;
00015608 R_RBR_26 .__linux_malloc
0001dd90 R_RBR_26 .__linux_malloc
0001e054 R_RBR_26 .__linux_malloc
0001e27c R_RBR_26 .__linux_malloc
0001ed14 R_RBR_26 .__linux_malloc
0001fe88 R_RBR_26 .__linux_malloc
0001ff3c R_RBR_26 .__linux_malloc
00020688 R_RBR_26 .__linux_malloc
00021e00 R_RBR_26 .__linux_malloc
000270fc R_RBR_26 .__linux_malloc
0002d0e8 R_RBR_26 .__linux_malloc
0003aeb0 R_RBR_26 .__linux_malloc
So having an inspector agent causes us to use the wrong malloc
I think this may have to do with the inspector_agent.h.
That header uses std::vector. Could it be that it is STL that pulls in the wrong malloc?
Looks like you're right
Taken from /opt/freeware/lib/gcc/powerpc-ibm-aix6.1.0.0/4.8.4/include/c++/cstlib
include <stdlib.h>
// Get rid of those macros defined in <stdlib.h> in lieu of real functions.
#undef abort
#undef abs
#undef atexit
#if __cplusplus >= 201103L
# ifdef _GLIBCXX_HAVE_AT_QUICK_EXIT
# undef at_quick_exit
# endif
#endif
#undef atof
#undef atoi
#undef atol
#undef bsearch
#undef calloc
#undef div
#undef exit
#undef free
#undef getenv
#undef labs
#undef ldiv
#undef malloc
#undef mblen
#undef mbstowcs
#undef mbtowc
#undef qsort
Ok, I'll do a change tomorrow that removes the vector from the header.
On Thu, Jun 2, 2016 at 1:29 PM Imran Iqbal notifications@github.com wrote:
Looks like you're right Taken from /opt/freeware/lib/gcc/powerpc-ibm-aix6.1.0.0/4.8.4/include/c++/cstlib
include
// Get rid of those macros defined in
in lieu of real functions. undef abort
undef abs
undef atexit
if __cplusplus >= 201103L
ifdef _GLIBCXX_HAVE_AT_QUICK_EXIT
undef at_quick_exit
endif
endif
undef atof
undef atoi
undef atol
undef bsearch
undef calloc
undef div
undef exit
undef free
undef getenv
undef labs
undef ldiv
undef malloc
undef mblen
undef mbstowcs
undef mbtowc
undef qsort
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/nodejs/node/issues/7080#issuecomment-223412135, or mute the thread https://github.com/notifications/unsubscribe/AARkrZqMRtqbWBT1lUYSxQefp5aDVfrwks5qHz0_gaJpZM4IqzAe .
I am a little bit confused, there are a lot of other files that also #include <vector>
, e.g. node.cc, node_file.cc, and code in V8. Why is this a problem only for inspector_agent.h
?
I think it might have to do with the inspector_socket.h being a header included elsewhere.
On Thu, Jun 2, 2016 at 1:42 PM Ali Ijaz Sheikh notifications@github.com wrote:
I am a little bit confused, there are a lot of other files that also #include
, e.g. node.cc, node_file.cc, and code in V8. Why is this a problem only for inspector_agent.h? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/nodejs/node/issues/7080#issuecomment-223415622, or mute the thread https://github.com/notifications/unsubscribe/AARkrTg_HYJmDwB4cl-H1kdwGFmSzJjoks5qH0A7gaJpZM4IqzAe .
So this raises a few questions:
malloc
interspersed with __linux_malloc
. This could be problematic in the future?crypto.randomBytes
because it has an async callback, but they pass 0 as the number of bytes they want. Is this a valid use of the api?malloc(0)
. Currently we assume that all mallocs behave the same and do not guard against this condition. Is this something we should check for in the configure step or guard against in code?Can you try this - https://github.com/eugeneo/node/commit/c367115f4b0a3b42e233a899c4f0469feebf908b
This just a proof of concept, actual implementation will be cleaner.
On Thu, Jun 2, 2016 at 3:08 PM Imran Iqbal notifications@github.com wrote:
So this raises a few questions:
- Uses of different mallocs have to have different free calls, in the failing case we see that we have malloc interspersed with __linux_malloc. This could be problematic in the future?
- The tests themselves use crypto.randomBytes because it has an async callback, but they pass 0 as the number of bytes they want. Is this a valid use of the api?
- There exists a notion of glibc compatible malloc for example autoconf includes a test for this compatibililty (source) https://github.com/coreutils/gnulib/blob/master/m4/malloc.m4. The test checks exactly the behaviour we see in this case, ie. checking for null on malloc(0). Currently we assume that all mallocs behave the same and do not guard against this condition. Is this something we should check for in the configure step or guard against in code?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/nodejs/node/issues/7080#issuecomment-223437180, or mute the thread https://github.com/notifications/unsubscribe/AARkrbxkAAsKI27eTmUU9XqQAVxRkOeFks5qH1RJgaJpZM4IqzAe .
https://ci.nodejs.org/job/node-test-commit-aix/205/ Hopefully i set the parameters properly there
@eugeneo that changeset appears to have fixed the issue
Testing out the following changeset to see if it sovles the issue: https://github.com/iWuzHere/node/commit/5966f16057bd27d381cc1681dc963033d57dc0c5
https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/207/
Based on the investigation (myself and iWuzHere) our understanding is that the core issue is a problem with how _LINUX_SOURCE_COMPAT was implemented for AIX. We've talked to the team that handles gcc for AIX and it sounds like they will update the toolchain so that it defines malloc (and variants) in a better way that won't conflict with what the STL header was doing. This is however, a longer term solution as it will take a while for this to make it into the toolchain and us to upgrade to later levels of the compiler.
In the shorter term it looks like we have 2 options. One is to refactor the inspector code to avoid the problem as @eugeneo has been working on. The second is to update the use of malloc (and variants) in Node as @iWuzHere has done in https://github.com/iWuzHere/node/commit/42e12bfcc9b7bd23f37fe37e8fe8d0d29e63e00e. I think I'd favor the second option if we can get it accepted since it would cover other similar issues if they arise before the toolchain is updated. @eugeneo and @ofrobots whats your take ?
I think I'd like to refactor the inspector in either case, even if not for AIX issue.
Ok so sounds like both would likely make sense. @iWuzHere can you submit the PR referencing this issue. I'd include the information in the pull request that the longer term change will be in the toolchain but this will make sure we don't see other similar issues up until the toolchain is updated.
Created pull request #7229
3 new failures in last nights run, opening as single issue until more investigation is completed.
https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/195/console