nodejs / node-v0.x-archive

Moved to https://github.com/nodejs/node

0.8.0 build fails with mksnapshot segfault #3538

Closed herb closed 12 years ago

herb commented 12 years ago

centos 5.4. from v0.8.0 tag in github. building v0.6.19 worked fine. repro steps and gdb backtrace below:

$ CC=/usr/local/bin/gcc LDFLAGS=-Wl,-R/usr/local/lib ./configure --prefix /packages/encap/node-0.8.0

[...]

$ make

[...]

  CXX(target) /home/herbert/src-nodejs/out/Release/obj.target/v8_nosnapshot/gen/libraries.o
  CXX(target) /home/herbert/src-nodejs/out/Release/obj.target/v8_nosnapshot/gen/experimental-libraries.o
  CXX(target) /home/herbert/src-nodejs/out/Release/obj.target/v8_nosnapshot/deps/v8/src/snapshot-empty.o
  AR(target) /home/herbert/src-nodejs/out/Release/obj.target/deps/v8/tools/gyp/libv8_nosnapshot.a
  CXX(target) /home/herbert/src-nodejs/out/Release/obj.target/mksnapshot/deps/v8/src/mksnapshot.o
  LINK(target) /home/herbert/src-nodejs/out/Release/mksnapshot
  LINK(target) /home/herbert/src-nodejs/out/Release/mksnapshot: Finished
  ACTION v8_snapshot_run_mksnapshot /home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.cc
/bin/sh: line 1:  1142 Segmentation fault      "/home/herbert/src-nodejs/out/Release/mksnapshot" --log-snapshot-positions --logfile "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.log" "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.cc"
make[1]: *** [/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.cc] Error 139
make[1]: Leaving directory `/home/herbert/src-nodejs/out'
make: *** [node] Error 2

$ cd out ; gdb "/home/herbert/src-nodejs/out/Release/mksnapshot" --log-snapshot-positions --logfile "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.log" "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.cc"

[...]

(gdb) file /home/herbert/src-nodejs/out/Release/mksnapshot
Reading symbols from /home/herbert/src-nodejs/out/Release/mksnapshot...(no debugging symbols found)...done.
(gdb) run --log-snapshot-positions --logfile "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.log" "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.cc"
Starting program: /home/herbert/src-nodejs/out/Release/mksnapshot --log-snapshot-positions --logfile "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.log" "/home/herbert/src-nodejs/out/Release/obj.target/v8_snapshot/geni/snapshot.cc"
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x000000000065c28a in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
(gdb) bt
#0  0x000000000065c28a in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
#1  0x000000000065ca88 in v8::internal::String::ReadBlock(v8::internal::String*, v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
#2  0x000000000065cc04 in v8::internal::String::ToCString(v8::internal::AllowNullsFlag, v8::internal::RobustnessFlag, int, int, int*) ()
#3  0x000000000065d0ca in v8::internal::String::ToCString(v8::internal::AllowNullsFlag, v8::internal::RobustnessFlag, int*) ()
#4  0x0000000000624de3 in v8::internal::Isolate::DoThrow(v8::internal::Object*, v8::internal::MessageLocation*) ()
#5  0x0000000000624ee9 in v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) ()
#6  0x00000000006c981a in v8::internal::ThrowRedeclarationError(v8::internal::Isolate*, char const*, v8::internal::Handle<v8::internal::String>) ()
#7  0x00000000006e547e in v8::internal::Runtime_DeclareGlobals(v8::internal::Arguments, v8::internal::Isolate*) ()
#8  0x0000283da6d063ae in ?? ()
#9  0x0000283da6d06301 in ?? ()
#10 0x00007fffffffd720 in ?? ()
#11 0x00007fffffffd770 in ?? ()
#12 0x0000283da6d1a49f in ?? ()
#13 0x0000000200000000 in ?? ()
#14 0x00000d46c3752d39 in ?? ()
#15 0x000015481af1ca11 in ?? ()
#16 0x00000d46c3704121 in ?? ()
#17 0x00000d46c374e4f9 in ?? ()
#18 0x000015481af1ca11 in ?? ()
#19 0x00007fffffffd7a0 in ?? ()
#20 0x0000283da6d19a08 in ?? ()
#21 0x00000d46c3704121 in ?? ()
#22 0x00000d46c3704121 in ?? ()
#23 0x00000d46c374e2b9 in ?? ()
#24 0x000015481af1ca11 in ?? ()
#25 0x00007fffffffd7d8 in ?? ()
#26 0x0000283da6d0cbe7 in ?? ()
#27 0x00000d46c3707371 in ?? ()
#28 0x00000d46c374e2b9 in ?? ()
#29 0x0000283da6d0cb21 in ?? ()
#30 0x0000000600000000 in ?? ()
#31 0x0000000000000000 in ?? ()
(gdb)

bnoordhuis commented 12 years ago

Possibly related to #2912? What does g++ -v say?

bnoordhuis commented 12 years ago

By the way, if you want to use a non-default compiler, make sure that you also pass it to make: make CC=/usr/local/bin/gcc

herb commented 12 years ago

output from gcc -v:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/packages/encap/gcc-4.5.2/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.5.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /var/tmp/kyleo/gcc-4.5.2/configure --enable-languages=c,c++,fortran --enable-shared --disable-libstdcxx-pch --enable-lto --enable-libgomp --enable-__cxa_atexit --enable-tls --with-gmp=/var/tmp/gmp-4.3.2 --with-mpfr=/var/tmp/mpfr-3.0.0 --with-mpc=/var/tmp/mpc-0.9 --with-libelf=/var/tmp/libelf-0.8.13
Thread model: posix
gcc version 4.5.2 (GCC) 
$

i actually exported CC but reran make specifying CC explicitly with the same result.

i then ran configure with --without-snapshot and make install failed:

$ make install
make -C out BUILDTYPE=Release
make[1]: Entering directory `/home/herbert/src-nodejs/out'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/herbert/src-nodejs/out'
ln -fs out/Release/node node
out/Release/node tools/installer.js install
make: *** [install] Segmentation fault
$ gdb

[...]

(gdb) file out/Release/node
Reading symbols from /home/herbert/src-nodejs/out/Release/node...done.
(gdb) run tools/installer.js install
Starting program: /home/herbert/src-nodejs/out/Release/node tools/installer.js install
[Thread debugging using libthread_db enabled]
[New Thread 0x40010940 (LWP 7007)]

Program received signal SIGSEGV, Segmentation fault.
0x00000000007f330a in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
(gdb) backtrace full
#0  0x00000000007f330a in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
No symbol table info available.
#1  0x00000000007f3b08 in v8::internal::String::ReadBlock(v8::internal::String*, v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
No symbol table info available.
#2  0x00000000007f3c84 in v8::internal::String::ToCString(v8::internal::AllowNullsFlag, v8::internal::RobustnessFlag, int, int, int*) ()
No symbol table info available.
#3  0x00000000007f414a in v8::internal::String::ToCString(v8::internal::AllowNullsFlag, v8::internal::RobustnessFlag, int*) ()
No symbol table info available.
#4  0x00000000007bbe63 in v8::internal::Isolate::DoThrow(v8::internal::Object*, v8::internal::MessageLocation*) ()
No symbol table info available.
#5  0x00000000007bbf69 in v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) ()
No symbol table info available.
#6  0x000000000086089a in v8::internal::ThrowRedeclarationError(v8::internal::Isolate*, char const*, v8::internal::Handle<v8::internal::String>) ()
No symbol table info available.
#7  0x000000000087c4fe in v8::internal::Runtime_DeclareGlobals(v8::internal::Arguments, v8::internal::Isolate*) ()
No symbol table info available.
#8  0x00001393b1106362 in ?? ()
No symbol table info available.
#9  0x00001393b11062c1 in ?? ()
No symbol table info available.
#10 0x00007fffffffd910 in ?? ()
No symbol table info available.
#11 0x00007fffffffd960 in ?? ()
No symbol table info available.
#12 0x00001393b111a05c in ?? ()
No symbol table info available.
#13 0x0000000200000000 in ?? ()
No symbol table info available.
#14 0x00001e54a7f53169 in ?? ()
No symbol table info available.
#15 0x00000967a8f1ca11 in ?? ()
No symbol table info available.
#16 0x00001e54a7f04121 in ?? ()
No symbol table info available.
#17 0x00001e54a7f4e929 in ?? ()
No symbol table info available.
#18 0x00000967a8f1ca11 in ?? ()
No symbol table info available.
#19 0x00007fffffffd990 in ?? ()
No symbol table info available.
#20 0x00001393b111961f in ?? ()
No symbol table info available.
#21 0x00001e54a7f04121 in ?? ()
No symbol table info available.
#22 0x00001e54a7f04121 in ?? ()
No symbol table info available.
#23 0x00001e54a7f4e6e9 in ?? ()
No symbol table info available.
#24 0x00000967a8f1ca11 in ?? ()
No symbol table info available.
#25 0x00007fffffffd9c8 in ?? ()
No symbol table info available.
#26 0x00001393b110c8a7 in ?? ()
No symbol table info available.
#27 0x00001e54a7f07371 in ?? ()
No symbol table info available.
#28 0x00001e54a7f4e6e9 in ?? ()
No symbol table info available.
#29 0x00001393b110c7e1 in ?? ()
No symbol table info available.
#30 0x0000000600000000 in ?? ()
No symbol table info available.
#31 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb)

bnoordhuis commented 12 years ago

output from gcc -v

Is that the same gcc as /usr/local/bin/gcc?

Does a 32-bit build work? You can build one with ./configure --dest-cpu=ia32.

herb commented 12 years ago

yes. using encap to install so /usr/local/bin/gcc is just a symlink to that gcc binary:

$ l /usr/local/bin/gcc
lrwxrwxrwx 1 root root 32 Feb  8 15:58 /usr/local/bin/gcc -> ../../../encap/gcc-4.5.2/bin/gcc*

building 32-bit resulted in a different segfault. :)

$ CC=/usr/local/bin/gcc LDFLAGS=-Wl,-R/usr/local/lib ./configure --prefix /packages/encap/node-0.8.0 --without-snapshot --dest-cpu=ia32

[...]

$ make install
make -C out BUILDTYPE=Release
make[1]: Entering directory `/home/herbert/src-nodejs/out'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/herbert/src-nodejs/out'
ln -fs out/Release/node node
out/Release/node tools/installer.js install
make: *** [install] Segmentation fault
$
$ gdb

[...]

(gdb) file out/Release/node
Reading symbols from /home/herbert/src-nodejs/out/Release/node...done.
(gdb) run tools/installer.js install
Starting program: /home/herbert/src-nodejs/out/Release/node tools/installer.js install
[Thread debugging using libthread_db enabled]
[New Thread 0xf7d3fb90 (LWP 25239)]

Program received signal SIGSEGV, Segmentation fault.
0x08439278 in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
(gdb) bt
#0  0x08439278 in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
#1  0xffffc9d0 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) backtrace full
#0  0x08439278 in v8::internal::ExternalTwoByteString::ExternalTwoByteStringReadBlockIntoBuffer(v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int) ()
No symbol table info available.
#1  0xffffc9d0 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

bnoordhuis commented 12 years ago

Okay, let's try two more things:

  1. What happens if you set v8_no_strict_aliasing% to 1 in deps/v8/build/common.gypi?
  2. Does the debug build (make -C out BUILDTYPE=Debug) work? You'll find it in out/Debug/node.

herb commented 12 years ago

by setting v8_no_strict_aliasing to 1 i can make and install. unclear if it's a working installation though. make test gives a series of similar errors:

Error: accept EINVAL   
    at errnoException (net.js:781:11)
    at TCP.onconnection (net.js:1027:24)
Command: out/Release/node /home/herbert/src-nodejs/test/simple/test-http-many-keep-alive-connections.js

and then dies:

=== release test-net-binary ===
Path: simple/test-net-binary
'\377' "ÿ" "ÿ" 255
'\376' "þ" "þ" 254
'\375' "ý" "ý" 253
'\374' "ü" "ü" 252

[...]

'\104' "D" "close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
make: *** [test] Error 1
$

i'll give a debug build a whirl and post results back here.

bnoordhuis commented 12 years ago

@herb Can you post the output of the following?

herb commented 12 years ago

$ uname -a
Linux herbert2 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 GNU/Linux
$ strace -fe accept,accept4 out/Release/node test/simple/test-http-many-keep-alive-connections.js
[ Process PID=15465 runs in 32 bit mode. ]
Process 15466 attached
Process 15467 attached
[pid 15465] accept4(8, 0, NULL, 0x80800 /* SOCK_??? */) = -1 EINVAL (Invalid argument)

events.js:66
        throw arguments[1]; // Unhandled 'error' event
                       ^
Error: accept EINVAL
    at errnoException (net.js:781:11)
    at TCP.onconnection (net.js:1027:24)
$
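
The flags argument in the strace line above (0x80800) can be decoded by hand. A minimal sketch, assuming the usual Linux x86 values for the two socket flags; they are hard-coded here as assumptions rather than read from system headers, and the function name is made up for illustration:

```python
# Assumed Linux x86/x86-64 values; other architectures may differ.
SOCK_NONBLOCK = 0o4000      # 0x800
SOCK_CLOEXEC = 0o2000000    # 0x80000


def decode_accept4_flags(flags):
    """Return the names of the known flag bits set in `flags`."""
    known = [("SOCK_NONBLOCK", SOCK_NONBLOCK), ("SOCK_CLOEXEC", SOCK_CLOEXEC)]
    names = [name for name, bit in known if flags & bit]
    # Report any bits that are not one of the two known flags.
    leftover = flags & ~(SOCK_NONBLOCK | SOCK_CLOEXEC)
    if leftover:
        names.append(hex(leftover))
    return names


print(decode_accept4_flags(0x80800))  # ['SOCK_NONBLOCK', 'SOCK_CLOEXEC']
```

Both bits are valid for accept4() on kernels that implement it. accept4() was added in Linux 2.6.28, so on the 2.6.18 kernel shown above the call itself is probably unsupported rather than the flags being wrong. That is an inference, not something confirmed at this point in the thread.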

so we're on the same page, how i'm configuring this:

$ CC=/usr/local/bin/gcc LDFLAGS=-Wl,-R/usr/local/lib ./configure --prefix /packages/encap/node-0.8.0 --without-snapshot --dest-cpu=ia32

and the debug build seems to work. i was able to read a simple text file with fs.readFileSync. unclear how else to verify its operation.

bnoordhuis commented 12 years ago

I would like to narrow down the cause of that EINVAL error. Does the test pass if you apply the patch below?

diff --git a/deps/uv/src/unix/core.c b/deps/uv/src/unix/core.c
index 318eb71..f4d5419 100644
--- a/deps/uv/src/unix/core.c
+++ b/deps/uv/src/unix/core.c
@@ -425,7 +425,7 @@ int uv__accept(int sockfd) {
   assert(sockfd >= 0);

   while (1) {
-#if __linux__
+#if 0
     peerfd = uv__accept4(sockfd,
                          NULL,
                          NULL,

What about this one? (Make sure you revert the previous one.)

diff --git a/deps/uv/src/unix/core.c b/deps/uv/src/unix/core.c
index 318eb71..958b1c2 100644
--- a/deps/uv/src/unix/core.c
+++ b/deps/uv/src/unix/core.c
@@ -429,7 +429,7 @@ int uv__accept(int sockfd) {
     peerfd = uv__accept4(sockfd,
                          NULL,
                          NULL,
-                         UV__SOCK_NONBLOCK|UV__SOCK_CLOEXEC);
+                         0);

     if (peerfd != -1)
       break;

bnoordhuis commented 12 years ago

By the way, I addressed the -fstrict-aliasing thing in 07e5877.

herb commented 12 years ago

the first patch fixes it. the second patch does not.
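
When accept4() is bypassed, as the first patch does, the accepted descriptor has to be made non-blocking and close-on-exec in separate steps after a plain accept(). A minimal Python sketch of that two-step fallback, for illustration only: the helper name is hypothetical, and this is not the libuv code, which is C.

```python
import fcntl
import os


def set_nonblock_cloexec(fd):
    """Emulate SOCK_NONBLOCK | SOCK_CLOEXEC on an already-open descriptor."""
    # File status flags: add O_NONBLOCK.
    status = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, status | os.O_NONBLOCK)
    # Descriptor flags: add FD_CLOEXEC.
    fdflags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, fdflags | fcntl.FD_CLOEXEC)


# Demonstrate on a pipe end instead of a real accepted socket.
read_end, write_end = os.pipe()
set_nonblock_cloexec(read_end)
print(bool(fcntl.fcntl(read_end, fcntl.F_GETFL) & os.O_NONBLOCK))
print(bool(fcntl.fcntl(read_end, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))
os.close(read_end)
os.close(write_end)
```

Unlike accept4(), this leaves a short window between accept() and the fcntl() calls in which another thread could fork and leak the descriptor.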

though there are still four failed tests after applying the first patch.

nebjak commented 12 years ago

Same issue, but none of your patches fixes it :(

ACTION v8_snapshot_run_mksnapshot /home/nebjak/Downloads/node-v0.8.0/out/Release/obj.target/v8_snapshot/geni/snapshot.cc
/bin/sh: line 1:   893 Segmentation fault      "/home/nebjak/Downloads/node-v0.8.0/out/Release/mksnapshot" --log-snapshot-positions --logfile "/home/nebjak/Downloads/node-v0.8.0/out/Release/obj.target/v8_snapshot/geni/snapshot.log" "/home/nebjak/Downloads/node-v0.8.0/out/Release/obj.target/v8_snapshot/geni/snapshot.cc"
make[1]: *** [/home/nebjak/Downloads/node-v0.8.0/out/Release/obj.target/v8_snapshot/geni/snapshot.cc] Error 139
make[1]: Leaving directory `/home/nebjak/Downloads/node-v0.8.0/out'
make: *** [node] Error 2

uname -a

Linux electron 2.6.37.6-0.11-desktop #1 SMP PREEMPT 2011-12-19 23:39:38 +0100 x86_64 x86_64 x86_64 GNU/Linux

g++ -v

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-suse-linux/4.5/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.5 --enable-ssp --disable-libssp --disable-plugin --with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --enable-version-specific-runtime-libs --program-suffix=-4.5 --enable-linux-futex --without-system-libunwind --enable-gold --with-plugin-ld=/usr/bin/gold --with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
Thread model: posix
gcc version 4.5.1 20101208 [gcc-4_5-branch revision 167585] (SUSE Linux)

bnoordhuis commented 12 years ago

@nebjak Can you try the v0.8 or master git branch?

bnoordhuis commented 12 years ago

the first patch fixes it. the second patch does not.

@herb Do you run a stock kernel or is it one you compiled yourself (with or without custom patches)?

herb commented 12 years ago

it's stock.

$ cat /proc/version 
Linux version 2.6.18-164.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Sep 3 03:28:30 EDT 2009

bnoordhuis commented 12 years ago

@herb Is it possible for you to get me shell access (with gcc and git) on the affected machine? I need to figure out why that syscall is failing. My pubkey is here.

nebjak commented 12 years ago

@bnoordhuis both branches (master, v0.8) fail to build with the same errors. Also, v0.6.19 builds without problems. With ./configure --without-snapshot make passes, but returns Segmentation fault on make install

superbob commented 12 years ago

I'm currently experiencing the same issues (same errors in the same contexts). I've tried the default install and the install using ./configure --without-snapshot, both from the tarball at http://nodejs.org/dist/v0.8.0/node-v0.8.0.tar.gz. I didn't try building from the branches or applying the given patches. 0.6 builds fine. I'm on openSUSE 11.4 (Linux 2.6.37.6-0.11) on an x86-64 arch.

bnoordhuis commented 12 years ago

Can someone please test this patch against the v0.8 branch?

diff --git a/configure b/configure
index 8dd6884..11e21a8 100755
--- a/configure
+++ b/configure
@@ -254,7 +254,8 @@ def compiler_version():
   version = version_line.split("version")[1].strip().split()[0].split(".")
   if not version:
     return (False, False, None)
-  return ('LLVM' in version_line, 'clang' in CC, tuple(version))
+  version = tuple(map(int, version))
+  return ('LLVM' in version_line, 'clang' in CC, version)

 def configure_node(o):
   # TODO add gdb
@@ -270,12 +271,9 @@ def configure_node(o):
   # turn off strict aliasing if gcc < 4.6.0 unless it's llvm-gcc
   # see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45883
   # see http://code.google.com/p/v8/issues/detail?id=884
-  o['variables']['strict_aliasing'] = b(is_clang or cc_version >= (4,6,0))
-
-  # disable strict aliasing in V8 if we're compiling with gcc 4.5.x,
-  # it makes V8 crash in various ways
-  o['variables']['v8_no_strict_aliasing'] = b(
-    not is_clang and (4,5,0) <= cc_version < (4,6,0))
+  strict_aliasing = is_clang or cc_version >= (4,6,0)
+  o['variables']['strict_aliasing'] = b(strict_aliasing)
+  o['variables']['v8_no_strict_aliasing'] = b(not strict_aliasing)

   # clang has always supported -fvisibility=hidden, right?
   if not is_clang and cc_version < (4,0,0):
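
The `tuple(map(int, version))` line in the patch matters because tuples of strings compare character by character, not numerically. A small sketch of the difference (the helper name `version_tuple` is made up for illustration and does not appear in configure):

```python
def version_tuple(text):
    """Turn '4.5.2' into (4, 5, 2) so comparisons are numeric."""
    return tuple(int(part) for part in text.split("."))


# String components compare lexicographically: '10' sorts before '6'.
as_strings = tuple("4.10.0".split("."))
print(as_strings >= ("4", "6", "0"))                     # False

# Integer components compare numerically.
print(version_tuple("4.10.0") >= (4, 6, 0))              # True
print((4, 5, 0) <= version_tuple("4.5.2") < (4, 6, 0))   # True
```

Comparing a tuple of strings directly against a tuple of integers, as the pre-patch lines do, is worse still: Python 3 raises TypeError, and CPython 2 orders the values by type rather than by value.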
fgalan commented 12 years ago

@bnoordhuis Same problem.

Trace (just in case I'm missing something)

[root@centollo node]# git branch
  master
* v0.8
[root@centollo node]# git show-branch
! [master] Now working on 0.9.0
 * [v0.8] Added % difference for perf benchmarks in 0.8 post
--
 * [v0.8] Added % difference for perf benchmarks in 0.8 post
 * [v0.8^] configure: don't fail if compiler_version() doesn't work
 * [v0.8~2] doc: `detached` is a boolean
 * [v0.8~3] build: expand ~ in `./configure --prefix=~/a/b/c`
+  [master] Now working on 0.9.0
+  [master^] Fix #3521 Use an object as the process.env proto
+  [master~2] build: enable strict aliasing if gcc < 4.5.0
+* [v0.8~4] build: disable strict aliasing in v8 with gcc 4.5.x
[root@centollo node]# git diff
diff --git a/configure b/configure
index 338ec3c..05c6c70 100755
--- a/configure
+++ b/configure
@@ -254,7 +254,8 @@ def compiler_version():
   version = version_line.split("version")[1].strip().split()[0].split(".")
   if not version:
     return (False, False, None)
-  return ('LLVM' in version_line, 'clang' in CC, tuple(version))
+  version = tuple(map(int, version))
+  return ('LLVM' in version_line, 'clang' in CC, version)

 def configure_node(o):
   # TODO add gdb
@@ -270,12 +271,9 @@ def configure_node(o):
   # turn off strict aliasing if gcc < 4.6.0 unless it's llvm-gcc
   # see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45883
   # see http://code.google.com/p/v8/issues/detail?id=884
-  o['variables']['strict_aliasing'] = b(is_clang or cc_version >= (4,6,0))
-
-  # disable strict aliasing in V8 if we're compiling with gcc 4.5.x,
-  # it makes V8 crash in various ways
-  o['variables']['v8_no_strict_aliasing'] = b(
-    not is_clang and (4,5,0) <= cc_version < (4,6,0))
+  strict_aliasing = is_clang or cc_version >= (4,6,0)
+  o['variables']['strict_aliasing'] = b(strict_aliasing)
+  o['variables']['v8_no_strict_aliasing'] = b(not strict_aliasing)

   # clang has always supported -fvisibility=hidden, right?
   if not is_clang and cc_version < (4,0,0):
[root@centollo node]# make clean
rm -rf out/Makefile node node_g out/Release/node blog.html email.md
find out/ -name '*.o' -o -name '*.a' | xargs rm -rf
rm -rf node_modules
[root@centollo node]# ./configure
{ 'target_defaults': { 'cflags': [],
                       'default_configuration': 'Release',
                       'defines': [],
                       'include_dirs': [],
                       'libraries': []},
  'variables': { 'host_arch': 'x64',
                 'node_install_npm': 'true',
                 'node_install_waf': 'true',
                 'node_prefix': '',
                 'node_shared_openssl': 'false',
                 'node_shared_v8': 'false',
                 'node_shared_zlib': 'false',
                 'node_use_dtrace': 'false',
                 'node_use_etw': 'false',
                 'node_use_openssl': 'true',
                 'strict_aliasing': 'false',
                 'target_arch': 'x64',
                 'v8_no_strict_aliasing': 'true',
                 'v8_use_snapshot': 'true',
                 'visibility': ''}}
creating  ./config.gypi
creating  ./config.mk
[root@centollo node]# make 
[...]
  ACTION v8_snapshot_run_mksnapshot /tmp/node/out/Release/obj.target/v8_snapshot/geni/snapshot.cc
pure virtual method called
terminate called without an active exception
/bin/sh: línea 1:  5094 Abortado                (`core' generado) "/tmp/node/out/Release/mksnapshot" --log-snapshot-positions --logfile "/tmp/node/out/Release/obj.target/v8_snapshot/geni/snapshot.log" "/tmp/node/out/Release/obj.target/v8_snapshot/geni/snapshot.cc"
make[1]: *** [/tmp/node/out/Release/obj.target/v8_snapshot/geni/snapshot.cc] Error 134
make[1]: se sale del directorio `/tmp/node/out'
make: *** [node] Error 2

bnoordhuis commented 12 years ago

@fgalan Thanks for testing that. I confess I'm kind of stumped. Can you get me a shell account on that machine?

herb commented 12 years ago

sorry @bnoordhuis, the machine i'm currently dev'ing on is a VM on my mac. getting a public box with the same stack is fraught with the usual corp nonsense. i'm open to a hangout to share a screen or some hints on where to look so i can do some of my own debugging for you.

i've compiled the v0.8 branch (HEAD at 0cdeb8e) and can build and install. my existing code base seems to work ok.

tests fail at 'test-child-process-fork2'

CLIENT closed 9
CLIENT closed 10
assert.js:104
  throw new assert.AssertionError({
        ^
AssertionError: false == true  
    at process.<anonymous> (/home/herbert/src-nodejs/test/simple/test-child-process-fork2.js:71:10)
    at process.EventEmitter.emit (events.js:115:20)
Command: out/Release/node /home/herbert/src-nodejs/test/simple/test-child-process-fork2.js

which seems to be before the previous errors.

bnoordhuis commented 12 years ago

@herb Some tests are fairly timing sensitive and running them in a VM often exacerbates that. If it's only a handful of tests that fail, just ignore it (within reason).

herb commented 12 years ago

gotcha. the test failing actually prevents the other tests from running.

running the previous failing test (test/simple/test-http-many-keep-alive-connections.js) directly doesn't error as before. so it looks like this bug is closed from my POV.

i'll leave it open for @fgalan but feel free to close it when you feel appropriate.

edited: i left out something important, this was with the patch to configure attached above.

Mantic commented 12 years ago

So... not sure if my problem is related to this issue, but it seems to produce the exact same error in one of your comments above:

events.js:66
        throw arguments[1]; // Unhandled 'error' event
                       ^
Error: accept EINVAL   
    at errnoException (net.js:781:11)
    at TCP.onconnection (net.js:1027:24)

This occurs once the server is up and running and as the first connection comes in with just the standard example:

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(15915, '127.0.0.1');
console.log('Server running at http://127.0.0.1:15915/');

I'm on a shared host:

uname -a && cat /etc/*release:

Linux web187.webfaction.com 2.6.18-274.17.1.el5PAE #1 SMP Tue Jan 10 18:05:48 EST 2012 i686 i686 i386 GNU/Linux
CentOS release 5.8 (Final)

This is a fresh install of 0.8.0 from the source code link found on nodejs.org.

bnoordhuis commented 12 years ago

@Mantic can you open a new issue for that?

Mantic commented 12 years ago

Done: #3566

fgalan commented 12 years ago

@bnoordhuis I cannot give you access to the system where I got the problem, but I will try to reproduce it in a CentOS 6.2 instance at Amazon EC2 and provide you SSH access to it. Thanks!

PAStheLoD commented 12 years ago

Hello,

I've the same mksnapshot segfault issue on v0.8.1-release, and on master (ba0efd6de0628a835e77ab490ecfb5e53ab71273), it's not a VM, it's a puny old AMD machine with 2.6.38.4-grsec x86_64. (Though I just get a regular segfault not a PAX security one ([37034928.522036] mksnapshot[10989]: segfault at 20 ip 00000000007cb5c4 sp 000003ff102000c0 error 4 in mksnapshot[400000+501000]))

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.0-8' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.0 (Debian 4.7.0-8) 

Though I can build master on another Debian Squeeze VM (also 64-bit) with g++ 4.4.5. (But there the === release test-child-process-fork2 === and === release test-eio-race === tests fail: the first one with an assertion, while the second one can't kill the event loop and the test times out.)

fgalan commented 12 years ago

Hi,

When preparing the sandbox for @bnoordhuis, trying to reproduce the bug in a CentOS 6.2 instance at EC2, I have realized (to my great surprise) that in that environment the node-0.8.0.tar.gz is built without problems.

Very annoying, considering that the distribution and version are exactly the same as in my buggy system. I compared both environments and observed a difference in the config.gypi file:

buggy system:

{ 'target_defaults': { 'cflags': [],
                       'default_configuration': 'Release',
                       'defines': [],
                       'include_dirs': [],
                       'libraries': []},
  'variables': { 'host_arch': 'x64',
                 'node_install_npm': 'true',
                 'node_install_waf': 'true',
                 'node_prefix': '',
                 'node_shared_openssl': 'false',
                 'node_shared_v8': 'false',
                 'node_shared_zlib': 'false',
                 'node_use_dtrace': 'false',
                 'node_use_etw': 'false',
                 'node_use_openssl': 'true',
                 'strict_aliasing': 'false',
                 'target_arch': 'x64',
                 'v8_use_snapshot': 'true',
                 'visibility': ''}}

EC2 instance:

{ 'target_defaults': { 'cflags': [],
                       'default_configuration': 'Release',
                       'defines': [],
                       'include_dirs': [],
                       'libraries': []},
  'variables': { 'host_arch': 'x64',
                 'node_install_npm': 'true',
                 'node_install_waf': 'true',
                 'node_prefix': '',
                 'node_shared_openssl': 'false',
                 'node_shared_v8': 'false',
                 'node_shared_zlib': 'false',
                 'node_use_dtrace': 'false',
                 'node_use_etw': 'false',
                 'node_use_openssl': 'true',
                 'strict_aliasing': 'true',
                 'target_arch': 'x64',
                 'v8_no_strict_aliasing': 'false',
                 'v8_use_snapshot': 'true'}}

The main difference is the strict_aliasing flag, which is false in the buggy system (causing the compilation to fail) and true in the EC2 instance (compiling successfully).

So, looking into configure, I see that this depends on the output of "gcc -v", so comparing both:

buggy system:

Usando especificaciones internas.
Objetivo: x86_64-redhat-linux
Configurado con: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Modelo de hilos: posix
gcc versión 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) 

EC2 instance:

Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)

They are the same, except that the first one is localized in es_ES and the second one in en_GB. In particular, in the buggy system the last line reads versión instead of version, which makes the cc_version() function fail.

The following patch solves the problem:

--- a/configure 2012-06-25 16:37:20.000000000 +0200
+++ b/configure.fix     2012-06-29 23:18:00.834233335 +0200
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# -*- coding: latin-1 -*-
 import optparse
 import os
 import pprint
@@ -247,11 +248,11 @@
   lines = proc.communicate()[1].split('\n')
   version_line = None
   for i, line in enumerate(lines):
-    if 'version' in line:
+    if 'versión' in line:
       version_line = line
   if not version_line:
     return None
-  version = version_line.split("version")[1].strip().split()[0].split(".")
+  version = version_line.split("versión")[1].strip().split()[0].split(".")
   if not version:
     return None
   return ['LLVM' in version_line] + version

However, it is completely ad hoc for my particular operating system locale. I think that the definitive fix should be a more robust strategy for parsing the gcc -v output, agnostic of any locale. I'm available if you want to test this definitive solution on an es_ES localized system.
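
One locale-agnostic approach is to match the version number by position rather than by the translated word in front of it. A minimal sketch, assuming the relevant line always starts with `gcc` followed by one word and a dotted number, as in both outputs above. The function name is hypothetical, and this is not necessarily how any eventual fix in configure works:

```python
import re

# "gcc", one word in any language ("version", "versión", ...), then N.N[.N]
_GCC_VERSION_RE = re.compile(r"^gcc\s+\S+\s+(\d+)\.(\d+)(?:\.(\d+))?")


def parse_gcc_version(verbose_output):
    """Return (major, minor, patch) from `gcc -v` text, or None if absent."""
    for line in verbose_output.splitlines():
        match = _GCC_VERSION_RE.match(line)
        if match:
            # A missing patch component is treated as 0.
            return tuple(int(group or 0) for group in match.groups())
    return None


spanish = "Modelo de hilos: posix\ngcc versión 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) "
english = "Thread model: posix\ngcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)"
print(parse_gcc_version(spanish) == parse_gcc_version(english))  # True
```

Another option is to run the compiler with LC_ALL=C in its environment, so that the output is not translated in the first place.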

Thanks for the help!


Fermín

bnoordhuis commented 12 years ago

Hah, that's awesome. I wouldn't in a thousand years have considered localized versions (and I'm not a native English speaker myself). Thanks for figuring that out @fgalan, I've landed a fix in f78ce08.

lpinca commented 12 years ago

I cloned master branch but i still have the issue. Running configure with default options will lead to this: https://gist.github.com/3027786 With ./configure --without-snapshot make passes, but returns Segmentation fault on make install: https://gist.github.com/3027773

bnoordhuis commented 12 years ago

@lpinca Try the v0.8 branch, it's in better shape right now. Haven't really had time to update master yet.

lpinca commented 12 years ago

With v0.8 branch i get this:

[luigi@fedora node]$ ./configure
Traceback (most recent call last):
  File "./configure", line 398, in <module>
    configure_node(output)
  File "./configure", line 293, in configure_node
    cc_version, is_clang = compiler_version()
  File "./configure", line 278, in compiler_version
    'include the output of `%s --version`' % CC)
Exception: Unknown compiler. Please open an issue at https://github.com/joyent/node/issues and include the output of `cc --version`

[luigi@fedora node]$ cc --version
cc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

bnoordhuis commented 12 years ago

Right, that's the result of Red Hat's branding of gcc. Can you try the patch from #3601?

lpinca commented 12 years ago

Tested on a CentOS 6 and worked, i'll try on my Fedora when i get back home. Thank you.

Edit: Worked also on Fedora.

superbob commented 12 years ago

I successfully built (make) with --without-snapshot (seg fault without the option). However, "make install" fails on "out/Release/node tools/installer.js install" (seg fault). I'm using the v0.8.1 source package from the main nodejs website. I will try building from the git repo.

lpinca commented 12 years ago

@superbob Use the v0.8 branch as bnoordhuis suggested.

superbob commented 12 years ago

@lpinca I will try it as soon as I can get through my proxy ... I have some problems connecting to github ... Thanks for the advice anyway :)

superbob commented 12 years ago

It's ok for me with v0.8.2 archive. (Could not try the branch because of my corporate proxy). Thanks a lot.