JuliaInterop / Cxx.jl

The Julia C++ Interface

signal (4): Illegal instruction with #14 on Linux and Intel Celeron processor #292

Open ibadr opened 8 years ago

ibadr commented 8 years ago

Cxx v0.0.2 crashes Julia v0.5.0-rc4, and the same crash happens on current Cxx master. I first hit the crash when running Pkg.test("Cxx"), and later tracked it down to the test case for issue #14.

Here's how to reproduce this on a fresh install of Julia v0.5.0-rc4.

Pkg.init()
Pkg.update()
Pkg.clone("Cxx")
Pkg.build("Cxx")

using Cxx
# Issue #14
cxx"""
class bar14 {
public:
    double xxx() {
        return 5.0;
    };
};
"""
b = @cxxnew bar14()
@cxx b->xxx()

This is what I get

signal (4): Illegal instruction
while loading no file, in expression starting on line 0
unknown function (ip: 0x7f69ada54ee9)
unknown function (ip: 0x7f69ada54edc)
cppcall_member at /home/islam/temppkg/v0.5/Cxx/src/codegen.jl:830
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
do_call at /home/centos/buildbot/slave/package_tarball64/build/src/interpreter.c:66
eval at /home/centos/buildbot/slave/package_tarball64/build/src/interpreter.c:190
jl_toplevel_eval_flex at /home/centos/buildbot/slave/package_tarball64/build/src/toplevel.c:558 [inlined]
jl_toplevel_eval at /home/centos/buildbot/slave/package_tarball64/build/src/toplevel.c:580
jl_toplevel_eval_in_warn at /home/centos/buildbot/slave/package_tarball64/build/src/builtins.c:590
eval at ./boot.jl:234
unknown function (ip: 0x7f6bbae9b14f)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
eval_user_input at ./REPL.jl:64
unknown function (ip: 0x7f69ada0f8a6)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
macro expansion at ./REPL.jl:95 [inlined]
#3 at ./event.jl:68
unknown function (ip: 0x7f69ada06f2f)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
jl_apply at /home/centos/buildbot/slave/package_tarball64/build/src/julia.h:1392 [inlined]
start_task at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:253
unknown function (ip: 0xffffffffffffffff)
Allocations: 4826362 (Pool: 4825247; Big: 1115); GC: 6
Illegal instruction (core dumped)

I suspect this could be related to the old Intel Celeron processor I'm using. Here's my versioninfo.

julia> versioninfo()
Julia Version 0.5.0-rc4+0
Commit 9c76c3e (2016-09-09 01:43 UTC)
Platform Info:
  System: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Celeron(R) CPU B820 @ 1.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, sandybridge)

There's another thing that may be related: simply including <iostream> at the C++ > REPL prompt results in a similar crash.

julia> using Cxx

C++ > #include <iostream>
signal (4): Illegal instruction
while loading no file, in expression starting on line 0
unknown function (ip: 0x7fc64cd22876)
unknown function (ip: 0x7fc64cd2262c)
unknown function (ip: 0x7fc64cd225e2)
unknown function (ip: 0x7fc64cd225cc)
unknown function (ip: 0x7fc64cd2143c)
unknown function (ip: 0x7fc64cd2128d)
unknown function (ip: 0x7fc64cd2121c)
isExpressionComplete at /home/islam/temppkg/v0.5/Cxx/src/CxxREPL/replpane.jl:129
#4 at /home/islam/temppkg/v0.5/Cxx/src/CxxREPL/replpane.jl:194
unknown function (ip: 0x7fc64cd21012)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
on_enter at ./LineEdit.jl:1267
unknown function (ip: 0x7fc85a18428b)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
#92 at ./LineEdit.jl:1351
#13 at ./LineEdit.jl:736
unknown function (ip: 0x7fc64cd113b2)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
prompt! at ./LineEdit.jl:1605
run_interface at ./LineEdit.jl:1574
unknown function (ip: 0x7fc85a186e9f)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
run_frontend at ./REPL.jl:903
run_repl at ./REPL.jl:188
unknown function (ip: 0x7fc64ccd3532)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
_start at ./client.jl:360
unknown function (ip: 0x7fc85a1a1f58)
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
jl_apply at /home/centos/buildbot/slave/package_tarball64/build/ui/../src/julia.h:1392 [inlined]
true_main at /home/centos/buildbot/slave/package_tarball64/build/ui/repl.c:112
main at /home/centos/buildbot/slave/package_tarball64/build/ui/repl.c:232
__libc_start_main at /build/glibc-GKVZIf/glibc-2.23/csu/../csu/libc-start.c:291
unknown function (ip: 0x4013fc)
Allocations: 5681207 (Pool: 5680058; Big: 1149); GC: 7
Illegal instruction (core dumped)

Keno commented 8 years ago

Can you attach a debugger and see what the faulting instruction is?

ibadr commented 8 years ago

OK. I'm new to this, so please correct me as needed.

I ran Julia under gdb, entered the same sequence of commands as in the #14 test case above, and got the following.

julia> @cxx b->xxx()

Thread 1 "julia" received signal SIGILL, Illegal instruction.
0x00007ffde469fcca in ?? ()
(gdb) bt
#0  0x00007ffde469fcca in ?? ()
#1  0x00007ffde469fcbd in ?? ()
#2  0x00007ffde469fcb0 in ?? ()
#3  0x00007ffde469fc58 in ?? ()
#4  0x00007fffffffcf90 in ?? ()
#5  0x00007ffff7fd2190 in ?? ()
#6  0x00007fffffffcf98 in ?? ()
#7  0x00007ffdf6ec7100 in ?? ()
#8  0x00007fffffffcf70 in ?? ()
#9  0x00007ffff7811340 in jl_call_method_internal (nargs=4, 
    args=0x7fffffffcf90, meth=0x7ffdf6ec7100)
    at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:189
#10 jl_apply_generic (args=0x7fffffffcf90, nargs=<optimized out>)
    at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1929
Backtrace stopped: frame did not save the PC

Here is the disassembly at the faulting address:

(gdb) disas 0x00007ffde469fcca,0x00007ffde469fcda
Dump of assembler code from 0x7ffde469fcca to 0x7ffde469fcda:
=> 0x00007ffde469fcca:  vmovsd (%rax),%xmm0
   0x00007ffde469fcce:  retq   
   0x00007ffde469fccf:  pop    %rsp
   0x00007ffde469fcd0:  pop    %r14
   0x00007ffde469fcd2:  pop    %r15
   0x00007ffde469fcd4:  pop    %rbp
   0x00007ffde469fcd5:  retq   
   0x00007ffde469fcd6:  nopw   %cs:0x0(%rax,%rax,1)
End of assembler dump.

So, it seems vmovsd is not supported on my processor?

ibadr commented 8 years ago

And this is the other case, with the C++ > REPL:

julia> using Cxx

C++ > #include <iostream>
Thread 1 "julia" received signal SIGILL, Illegal instruction.
0x00007ffde669d2a7 in ?? ()
(gdb) disas 0x00007ffde669d2a7,0x00007ffde669d2b7
Dump of assembler code from 0x7ffde669d2a7 to 0x7ffde669d2b7:
=> 0x00007ffde669d2a7:  vxorps %xmm0,%xmm0,%xmm0
   0x00007ffde669d2ab:  vmovups %xmm0,(%rbx)
   0x00007ffde669d2af:  movabs $0x7ffde669d2e0,%r14
End of assembler dump.

ibadr commented 8 years ago

According to the Intel Intrinsics Guide (https://software.intel.com/sites/landingpage/IntrinsicsGuide/), the two instructions in question seem to belong to AVX-512, which Sandy Bridge does not support?

Yet there are two entries for vxorps that are supposed to be supported by Sandy Bridge. Is the specific form being called here not one of them?

Keno commented 8 years ago

vmovsd should be available on Sandy Bridge without any problem. Can you post the output of cat /proc/cpuinfo?

Keno commented 8 years ago

Also, try the latest master. I just pushed a fix that may be relevant to your case.

ibadr commented 8 years ago

I just checked out Cxx master and rebuilt the package from scratch. Unfortunately, the crash persists at the same instruction (still using Julia v0.5.0-rc4).

julia> using Cxx

julia> # Issue # 14
       cxx"""
        class bar14 {
          public:
          double xxx() {
             return 5.0;
          };
        };
       """
true

julia> b = @cxxnew bar14()
(class bar14 *) @0x0000000006b64470

julia> @cxx b->xxx()

Thread 1 "julia" received signal SIGILL, Illegal instruction.
0x00007ffde46a150a in ?? ()
(gdb) disas 0x00007ffde46a150a,0x00007ffde46a151a
Dump of assembler code from 0x7ffde46a150a to 0x7ffde46a151a:
=> 0x00007ffde46a150a:  vmovsd (%rax),%xmm0
   0x00007ffde46a150e:  retq   
   0x00007ffde46a150f:  pop    %rsp
   0x00007ffde46a1510:  pop    %r14
   0x00007ffde46a1512:  pop    %r15
   0x00007ffde46a1514:  pop    %rbp
   0x00007ffde46a1515:  retq   
   0x00007ffde46a1516:  nopw   %cs:0x0(%rax,%rax,1)
End of assembler dump.

cat /proc/cpuinfo gives

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 42
model name  : Intel(R) Celeron(R) CPU B820 @ 1.70GHz
stepping    : 7
microcode   : 0x29
cpu MHz     : 981.750
cache size  : 2048 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts
bugs        :
bogomips    : 3392.00
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 42
model name  : Intel(R) Celeron(R) CPU B820 @ 1.70GHz
stepping    : 7
microcode   : 0x29
cpu MHz     : 894.890
cache size  : 2048 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 2
initial apicid  : 2
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts
bugs        :
bogomips    : 3392.00
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Keno commented 8 years ago

Indeed, your CPU does not appear to have any form of AVX: there is no avx entry in the flags line above. LLVM should generally pick this up. I'll take a look at why it doesn't.