strongloop / strong-oracle

Deprecated: Node.js Driver for Oracle databases (Use https://github.com/oracle/node-oracledb instead)

malloc(): smallbin double linked list corrupted #33

Closed (einfallstoll closed this issue 9 years ago)

einfallstoll commented 9 years ago

There must be an error somewhere in strong-oracle or Node.js. I use generic-pool for connection pooling (I don't trust the built-in connection pool, since I have had very bad experiences with it in the past) and make a very simple request to one of our databases. All connection options are left at their defaults and there is just one connection.
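
For illustration, the setup described above (a single connection shared through a pool) can be sketched roughly as follows. This is not generic-pool's API and not the code from the affected application; `createSingleConnectionPool`, `createConnection` and `use` are hypothetical names, and the sketch only shows the idea of handing one connection to one caller at a time.

```javascript
// Sketch only: a one-connection "pool" that hands the connection to one
// caller at a time. createConnection is a hypothetical async factory.
function createSingleConnectionPool(createConnection) {
  let connection = null;        // created lazily on first use
  let tail = Promise.resolve(); // later callers queue up behind this promise

  return {
    // Runs task(connection) once all earlier tasks have settled.
    use(task) {
      const run = tail.then(async () => {
        if (connection === null) {
          connection = await createConnection();
        }
        return task(connection);
      });
      // Keep the queue moving even if this task rejects.
      tail = run.catch(() => {});
      return run;
    },
  };
}
```

With this shape, two overlapping `pool.use(...)` calls never touch the connection at the same time; the second task starts only after the first has finished.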

*** glibc detected *** node: malloc(): smallbin double linked list corrupted: 0x00000000010de370 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x79088)[0x7fe5a1f99088]
/lib64/libc.so.6(+0x7c6cd)[0x7fe5a1f9c6cd]
/lib64/libc.so.6(__libc_malloc+0x77)[0x7fe5a1f9e1b7]
/usr/lib64/libstdc++.so.6(_Znwm+0x1d)[0x7fe5a29ac08d]
/opt/nodejs/servmon/node_modules/strong-oracle/build/Release/oracle_bindings.node(_ZNSt6vectorIPvSaIS0_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS0_S2_EERKS0_+0x150)[0x7fe59f4f7960]
/opt/nodejs/servmon/node_modules/strong-oracle/build/Release/oracle_bindings.node(_ZN10Connection32CreateRowFromCurrentResultSetRowEPN6oracle4occi9ResultSetERSt6vectorIP8column_tSaIS6_EE+0x392)[0x7fe59f4ef842]
/opt/nodejs/servmon/node_modules/strong-oracle/build/Release/oracle_bindings.node(_ZN10Connection16ExecuteStatementEP12ExecuteBatonPN6oracle4occi9StatementE+0x5a8)[0x7fe59f4f2dc8]
/opt/nodejs/servmon/node_modules/strong-oracle/build/Release/oracle_bindings.node(_ZN10Connection11EIO_ExecuteEP9uv_work_s+0x4b)[0x7fe59f4f30bb]
node[0x9e1470]
node[0x9d6c81]
/lib64/libpthread.so.0(+0x7806)[0x7fe5a22a3806]
/lib64/libc.so.6(clone+0x6d)[0x7fe5a1ffe64d]
======= Memory map: ========
00400000-00c8e000 r-xp 00000000 fd:06 243331                             /usr/local/bin/node
00e8e000-00ea5000 rwxp 0088e000 fd:06 243331                             /usr/local/bin/node
00ea5000-01366000 rwxp 00000000 00:00 0                                  [heap]
393b2a00000-393b2b00000 rwxp 00000000 00:00 0
42daa400000-42daa485000 rwxp 00000000 00:00 0
4fa24b00000-4fa24c00000 rwxp 00000000 00:00 0
7f153ea5000-7f1542a5000 rwxp 00000000 00:00 0
850161d0000-850161d1000 r-xp 00000000 00:00 0
aa6e1400000-aa6e1500000 rwxp 00000000 00:00 0
b1fa0500000-b1fa0600000 rwxp 00000000 00:00 0
1099ff400000-1099ff500000 rwxp 00000000 00:00 0
1248db200000-1248db300000 rwxp 00000000 00:00 0
134874b91000-134874b92000 r-xp 00000000 00:00 0
189f44400000-189f44500000 rwxp 00000000 00:00 0
1989bca00000-1989bcb00000 rwxp 00000000 00:00 0
1c4b55c54000-1c4b55c55000 r-xp 00000000 00:00 0
1e51a0400000-1e51a0425000 rwxp 00000000 00:00 0
1e5c71600000-1e5c71625000 rwxp 00000000 00:00 0
2009fa133000-2009fa333000 rwxp 00000000 00:00 0
231eec600000-231eec700000 rwxp 00000000 00:00 0
252ada000000-252adb800000 rwxp 00000000 00:00 0
252adb800000-252adc000000 rwxp 00000000 00:00 0
265a79100000-265a79200000 rwxp 00000000 00:00 0
26b859400000-26b859500000 rwxp 00000000 00:00 0
277cc9d00000-277cc9e00000 rwxp 00000000 00:00 0
29c8dd700000-29c8dd800000 rwxp 00000000 00:00 0
2c3968f00000-2c3969000000 rwxp 00000000 00:00 0
2d973b98e000-2d973b98f000 r-xp 00000000 00:00 0
2e362b5cc000-2e362b5cd000 r-xp 00000000 00:00 0
348745c00000-348745d00000 rwxp 00000000 00:00 0
37dcced82000-37dccedc0000 ---p 00000000 00:00 0
37dccedc0000-37dccede0000 rwxp 00000000 00:00 0
37dccede0000-37dccede2000 ---p 00000000 00:00 0
3afc52400000-3afc52435000 rwxp 00000000 00:00 0
3b8fd4f00000-3b8fd5000000 rwxp 00000000 00:00 0
3dea42300000-3dea42400000 rwxp 00000000 00:00 0
3f4a8f1bc000-3f4a8f200000 ---p 00000000 00:00 0
3f4a8f200000-3f4a8f205000 rwxp 00000000 00:00 0
3f4a8f205000-3f4a8f206000 ---p 00000000 00:00 0
3f4a8f206000-3f4a8f2ff000 rwxp 00000000 00:00 0
3f4a8f2ff000-3f4a8f300000 ---p 00000000 00:00 0
3f4a8f300000-3f4a8f305000 rwxp 00000000 00:00 0
3f4a8f305000-3f4a8f306000 ---p 00000000 00:00 0
3f4a8f306000-3f4a8f3ff000 rwxp 00000000 00:00 0
3f4a8f3ff000-3f4a8f400000 ---p 00000000 00:00 0
3f4a8f400000-3f4a8f405000 rwxp 00000000 00:00 0
3f4a8f405000-3f4a8f406000 ---p 00000000 00:00 0
3f4a8f406000-3f4a8f4ff000 rwxp 00000000 00:00 0
3f4a8f4ff000-3f4a8f500000 ---p 00000000 00:00 0
3f4a8f500000-3f4a8f505000 rwxp 00000000 00:00 0
3f4a8f505000-3f4a8f506000 ---p 00000000 00:00 0
3f4a8f506000-3f4a8f5ff000 rwxp 00000000 00:00 0
3f4a8f5ff000-3f4a8f600000 ---p 00000000 00:00 0
3f4a8f600000-3f4a8f605000 rwxp 00000000 00:00 0
3f4a8f605000-3f4a8f606000 ---p 00000000 00:00 0
3f4a8f606000-3f4a8f6ff000 rwxp 00000000 00:00 0
3f4a8f6ff000-3f4a8f700000 ---p 00000000 00:00 0
3f4a8f700000-3f4a8f705000 rwxp 00000000 00:00 0
3f4a8f705000-3f4a8f706000 ---p 00000000 00:00 0
3f4a8f706000-3f4a8f7ff000 rwxp 00000000 00:00 0
3f4a8f7ff000-3f4aaf1bc000 ---p 00000000 00:00 0
3fd41d700000-3fd41d800000 rwxp 00000000 00:00 0
7fe594000000-7fe594160000 rwxp 00000000 00:00 0
7fe594160000-7fe598000000 ---p 00000000 00:00 0
7fe599d4b000-7fe599d53000 r-xp 00000000 fd:06 131131                     /usr/lib64/libnuma.so.1
7fe599d53000-7fe599f52000 ---p 00008000 fd:06 131131                     /usr/lib64/libnuma.so.1
7fe599f52000-7fe599f53000 r-xp 00007000 fd:06 131131                     /usr/lib64/libnuma.so.1
7fe599f53000-7fe599f54000 rwxp 00008000 fd:06 131131                     /usr/lib64/libnuma.so.1
7fe599f54000-7fe599f60000 r-xp 00000000 08:01 118                        /lib64/libnss_files-2.11.3.so
7fe599f60000-7fe59a15f000 ---p 0000c000 08:01 118                        /lib64/libnss_files-2.11.3.so
7fe59a15f000-7fe59a160000 r-xp 0000b000 08:01 118                        /lib64/libnss_files-2.11.3.so
7fe59a160000-7fe59a161000 rwxp 0000c000 08:01 118                        /lib64/libnss_files-2.11.3.so
7fe59a161000-7fe59a71a000 r-xp 00000000 fd:01 167645                     /opt/oracle/instantclient_64/libociicus.so
7fe59a71a000-7fe59a919000 ---p 005b9000 fd:01 167645                     /opt/oracle/instantclient_64/libociicus.so
7fe59a919000-7fe59a91a000 rwxp 005b8000 fd:01 167645                     /opt/oracle/instantclient_64/libociicus.so
7fe59a91a000-7fe59ac6e000 r-xp 00000000 fd:01 167640                     /opt/oracle/instantclient_64/libclntshcore.so.12.1
7fe59ac6e000-7fe59ae6d000 ---p 00354000 fd:01 167640                     /opt/oracle/instantclient_64/libclntshcore.so.12.1
7fe59ae6d000-7fe59ae88000 rwxp 00353000 fd:01 167640                     /opt/oracle/instantclient_64/libclntshcore.so.12.1
7fe59ae88000-7fe59ae8c000 rwxp 00000000 00:00 0
7fe59ae8c000-7fe59ae8d000 r-xp 00000000 08:01 104                        /lib64/libaio.so.1.0.1
7fe59ae8d000-7fe59b08c000 ---p 00001000 08:01 104                        /lib64/libaio.so.1.0.1
7fe59b08c000-7fe59b08d000 r-xp 00000000 08:01 104                        /lib64/libaio.so.1.0.1
7fe59b08d000-7fe59b08e000 rwxp 00001000 08:01 104                        /lib64/libaio.so.1.0.1
7fe59b08e000-7fe59b0a3000 r-xp 00000000 08:01 96                         /lib64/libnsl-2.11.3.so
7fe59b0a3000-7fe59b2a2000 ---p 00015000 08:01 96                         /lib64/libnsl-2.11.3.so
7fe59b2a2000-7fe59b2a3000 r-xp 00014000 08:01 96                         /lib64/libnsl-2.11.3.so
7fe59b2a3000-7fe59b2a4000 rwxp 00015000 08:01 96                         /lib64/libnsl-2.11.3.so
7fe59b2a4000-7fe59b2a6000 rwxp 00000000 00:00 0
7fe59b2a6000-7fe59b2ea000 r-xp 00000000 fd:01 167647                     /opt/oracle/instantclient_64/libons.so

bnoordhuis commented 9 years ago

Can you post the output of valgrind -q node app.js? Cheers.

einfallstoll commented 9 years ago

==21812== Warning: noted but unhandled ioctl 0x5451 with no size/direction hints
==21812==    This could cause spurious value errors to appear.
==21812==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==21812== Warning: noted but unhandled ioctl 0x5451 with no size/direction hints
==21812==    This could cause spurious value errors to appear.
==21812==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==21812== Warning: noted but unhandled ioctl 0x5451 with no size/direction hints
==21812==    This could cause spurious value errors to appear.
==21812==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
vex amd64->IR: unhandled instruction bytes: 0xF 0x29 0xE1 0xE9 0x1B 0xFF 0xFF 0xFF
==21812== valgrind: Unrecognised instruction at address 0x364d807b40cf.
==21812==    at 0x364D807B40CF: ???
==21812==    by 0x364D807AF0E2: ???
==21812==    by 0x364D807AEE25: ???
==21812==    by 0x364D807AEC58: ???
==21812==    by 0x364D8070D1E9: ???
==21812==    by 0x364D807AEA12: ???
==21812==    by 0x364D807AE7F6: ???
==21812==    by 0x364D80797AD2: ???
==21812==    by 0x364D807122BD: ???
==21812==    by 0x364D807640CF: ???
==21812==    by 0x364D8075F184: ???
==21812==    by 0x364D8075B2B5: ???
==21812==    by 0x364D8074A426: ???
==21812==    by 0x364D8070CCCD: ???
==21812==    by 0x364D80766116: ???
==21812==    by 0x364D80765F88: ???
==21812==    by 0x364D80794450: ???
==21812==    by 0x364D807122BD: ???
==21812==    by 0x364D807640CF: ???
==21812==    by 0x364D8075F184: ???
==21812==    by 0x364D8075B2B5: ???
==21812==    by 0x364D8074A426: ???
==21812==    by 0x364D8070CCCD: ???
==21812==    by 0x364D80766116: ???
==21812==    by 0x364D80765F88: ???
==21812==    by 0x364D80764CE5: ???
==21812==    by 0x364D807122BD: ???
==21812==    by 0x364D807640CF: ???
==21812==    by 0x364D8075F184: ???
==21812==    by 0x364D8075B2B5: ???
==21812==    by 0x364D8074A426: ???
==21812==    by 0x364D80749DEA: ???
==21812==    by 0x364D8072E238: ???
==21812==    by 0x364D8072D7BE: ???
==21812==    by 0x364D8070D506: ???
==21812==    by 0x364D80706115: ???
==21812==    by 0x73B901: v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, bool*) (in /usr/local/bin/node)
==21812==    by 0x73CD05: v8::internal::Execution::Call(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, bool*, bool) (in /usr/local/bin/node)
==21812==    by 0x6DE33F: v8::Function::Call(v8::Handle<v8::Object>, int, v8::Handle<v8::Value>*) (in /usr/local/bin/node)
==21812==    by 0x9840C9: node::Load(v8::Handle<v8::Object>) (in /usr/local/bin/node)
==21812==    by 0x98427C: node::Start(int, char**) (in /usr/local/bin/node)
==21812==    by 0x5C0DC35: (below main) (in /lib64/libc-2.11.3.so)
==21812== Your program just tried to execute an instruction that Valgrind
==21812== did not recognise.  There are two possible reasons for this.
==21812== 1. Your program has a bug and erroneously jumped to a non-code
==21812==    location.  If you are running Memcheck and you just saw a
==21812==    warning about a bad jump, it's probably your program's fault.
==21812== 2. The instruction is legitimate but Valgrind doesn't handle it,
==21812==    i.e. it's Valgrind's fault.  If you think this is the case or
==21812==    you are not sure, please let us know and we'll try to fix it.
==21812== Either way, Valgrind will now raise a SIGILL signal which will
==21812== probably kill your program.
==21812==
==21812== Process terminating with default action of signal 4 (SIGILL): dumping core
==21812==  Illegal opcode at address 0x364D807B40CF
==21812==    at 0x364D807B40CF: ???
==21812==    by 0x364D807AF0E2: ???
==21812==    by 0x364D807AEE25: ???
==21812==    by 0x364D807AEC58: ???
==21812==    by 0x364D8070D1E9: ???
==21812==    by 0x364D807AEA12: ???
==21812==    by 0x364D807AE7F6: ???
==21812==    by 0x364D80797AD2: ???
==21812==    by 0x364D807122BD: ???
==21812==    by 0x364D807640CF: ???
==21812==    by 0x364D8075F184: ???
==21812==    by 0x364D8075B2B5: ???
Killed

bnoordhuis commented 9 years ago

I infer that you're using an old distro like RHEL 5 or CentOS 5? glibc 2.11 is pretty ancient, and I fixed those unhandled ioctl warnings back in 2012, so your valgrind is probably none too recent either.

I'm not sure where to go from here. Perhaps you can obtain a more recent version of valgrind from a backports repo?

einfallstoll commented 9 years ago

Linux version and distribution:

Linux version 3.0.101-0.46-default (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) )
SUSE Linux Enterprise Server 11 SP3  (x86_64)

valgrind version:

valgrind-3.7.0

I will try to get a more recent version of valgrind and test again.

einfallstoll commented 9 years ago

Ok. Running valgrind-3.10.1 got me 15k lines of errors (see here). Something must be badly wrong, and I assume it's the Instant Client itself.

bnoordhuis commented 9 years ago

Now that you mention it... I remember looking into that way back when, thinking we probably initialized the library wrong, but even the simplest OCIEnvCreate(&env, OCI_DEFAULT, NULL, NULL, NULL, NULL, 0, NULL); would spew hundreds of warnings.

einfallstoll commented 9 years ago

What can I do now? My app will crash after a few statements with a segmentation fault and I have no idea where it comes from.

bnoordhuis commented 9 years ago

Can you run your app in gdb a few times (gdb --args node app.js, followed by run) and post the post-crash backtrace (thread apply all backtrace full)? Perhaps there is some kind of pattern to the stack traces that can point us in the right direction.
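
As a rough aid for spotting such a pattern, here is a small sketch that pulls symbol names out of glibc-style backtrace lines (the `module(symbol+0xoffset)[0xaddress]` format seen in the original report) and lists the symbols two crashes share. `parseFrames` and `commonSymbols` are hypothetical helper names; gdb's own backtrace output uses a different layout and would need a different pattern.

```javascript
// Sketch only: extracts symbol names from glibc-style backtrace lines such as
//   /lib64/libc.so.6(__libc_malloc+0x77)[0x7fe5a1f9e1b7]
// and reports which symbols two crashes have in common.

// Matches "module(symbol+0xoffset)[0xaddress]". Frames without a symbol name
// (for example "node[0x9e1470]" or "libc.so.6(+0x79088)[...]") are skipped.
const FRAME = /^(.+)\(([^+()]+)\+0x[0-9a-f]+\)\[0x[0-9a-f]+\]$/;

function parseFrames(backtrace) {
  return backtrace
    .split('\n')
    .map((line) => FRAME.exec(line.trim()))
    .filter((match) => match !== null)
    .map((match) => ({ module: match[1], symbol: match[2] }));
}

function commonSymbols(first, second) {
  const seen = new Set(parseFrames(second).map((frame) => frame.symbol));
  return parseFrames(first)
    .map((frame) => frame.symbol)
    .filter((symbol) => seen.has(symbol));
}
```

Called with the text of two crash reports, `commonSymbols` returns the symbols present in both, in the order they appear in the first. The names stay mangled; a tool such as `c++filt` can make them readable.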

einfallstoll commented 9 years ago

Ok. I got two slightly different stack traces: crash 1 and crash 2.

bnoordhuis commented 9 years ago

I'm afraid I don't see anything obviously wrong. The first stack trace at least shows a memory allocation taking place, but I don't see anything like that in the second one.

Do you happen to have a newer Linux system where you can do a quick test?

einfallstoll commented 9 years ago

Unfortunately no. Even if I tested it on some other Linux, we wouldn't be allowed to use it, because SLES 11 SP3 is our standard operating system for servers.

bnoordhuis commented 9 years ago

Aw, okay. Do you have a support contract with StrongLoop? I think the logical next step would be to set up an environment identical to yours and see if I can reproduce but I can't justify the expenditure and the hours if you're not a client.

einfallstoll commented 9 years ago

I resolved the issue by rewriting the application to use the official oracledb. It's sad that I had to do this, but it seems like it's working without issues now.
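
For anyone making the same switch, a minimal sketch of a single query with node-oracledb might look like the following. It is not the rewritten application's code. It assumes a node-oracledb release with promise support (early releases were callback-only), and `queryOnce` is a hypothetical helper that receives the driver as an argument so it can be exercised without a database.

```javascript
// Sketch only: in an application, `driver` would be require('oracledb') and
// `config` an object such as { user, password, connectString }.
async function queryOnce(driver, config, sql, binds = []) {
  const connection = await driver.getConnection(config);
  try {
    const result = await connection.execute(sql, binds);
    return result.rows;
  } finally {
    // Always release the connection, even if execute() throws.
    await connection.close();
  }
}
```

Closing the connection in `finally` keeps a failed statement from leaking it.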