JuliaLang / IJulia.jl

Julia kernel for Jupyter
MIT License

IJulia kernel dies instantly on OS X #138

Closed cossatot closed 10 years ago

cossatot commented 10 years ago

Hi,

I just installed Julia fresh from master (Version 0.3.0-prerelease+1277, commit 89d32b2*), and added the IJulia Pkg. When I fire up the IJulia notebook, I get a message saying that the kernel has died and will automatically restart (5 times), followed by a 'Dead Kernel' message prompting for a manual restart. This just repeats the process.

This is on OS X Mountain Lion, with IPython 1.1.0 and Anaconda Python 2.7.6, all up to date. Normal IPython works fine day in, day out. Julia also works fine in the terminal, although I haven't tested it exhaustively; I just installed it.

Please let me know if there is any additional information I can provide.

jiahao commented 10 years ago

Could you please provide the output of versioninfo(true) from the Julia terminal REPL?

stevengj commented 10 years ago

Were there any errors in the output when you first did Pkg.add("IJulia")?

cossatot commented 10 years ago

Output of versioninfo(true):

julia> versioninfo(true)
Julia Version 0.3.0-prerelease+1277
Commit 89d32b2* (2014-01-28 00:10 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin12.5.0)
  CPU: Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
  WORD_SIZE: 64
  uname: Darwin 12.5.0 Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64 i386
Memory: 8.0 GB (226.48046875 MB free)
Uptime: 562363.0 sec
Load Avg:  1.15283  1.00342  2.12158
Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz:
       speed         user         nice          sys         idle          irq
#1  2300 MHz     192551 s          0 s     231441 s    2943054 s          0 s
#2  2300 MHz      26911 s          0 s       5887 s    3334233 s          0 s
#3  2300 MHz     188977 s          0 s     125834 s    3052223 s          0 s
#4  2300 MHz      25163 s          0 s       5088 s    3336780 s          0 s
#5  2300 MHz     173356 s          0 s     113192 s    3080485 s          0 s
#6  2300 MHz      23760 s          0 s       4646 s    3338624 s          0 s
#7  2300 MHz     169512 s          0 s     107778 s    3089742 s          0 s
#8  2300 MHz      22865 s          0 s       4413 s    3339752 s          0 s

  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm
Environment:
  DYLD_FALLBACK_LIBRARY_PATH = /Users/itchy/src/anaconda/lib:
  TERM = xterm-256color
  CDPATH = .:/Users/itchy:/Users/itchy/school/tibet/stress/fault_elev_stress
  PATH = /Users/itchy/gits/julia:/usr/local/bin:/Users/itchy/src/anaconda/bin:/Library/Frameworks/GDAL.framework/Programs:/Developer/NVIDIA/CUDA-5.0/bin:/Applications/MacVim.app/Contents/MacOS/Vim:/opt/local/lib/openmpi/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/usr/texbin
  HOME = /Users/itchy
  DYLD_LIBRARY_PATH = /Developer/NVIDIA/CUDA-5.0/lib:
  PYTHONPATH = /Users/itchy/src/anaconda/bin:/Users/itchy/school/halfspace:/Users/itchy/school/vtk_tools:::/usr/local/lib/python2.7/site-packages

Package Directory: /Users/itchy/.julia
3 required packages:
 - Compose                       0.1.24
 - IJulia                        0.1.1
 - PyPlot                        1.2.0
10 additional packages:
 - BinDeps                       0.2.12
 - Color                         0.2.8
 - Homebrew                      0.0.4
 - Iterators                     0.1.1
 - JSON                          0.3.3
 - Nettle                        0.1.3
 - PyCall                        0.4.1
 - REPLCompletions               0.0.0
 - URIParser                     0.0.1
 - ZMQ                           0.1.7

cossatot commented 10 years ago

No, I didn't get any errors from Pkg.add("IJulia"). I will delete the .julia directory and reinstall the packages. (I would have tried this already: I did reinstall Julia, but last night I didn't realize that the packages are kept separately from the install directory.)

cossatot commented 10 years ago

OK, I tried deleting and re-installing the packages, and am getting the same result.

stevengj commented 10 years ago

Can you try with Julia 0.2 in case there is a problem with the 0.3 prerelease snapshot you are using? (Note that you will need to run Pkg.build("IJulia") to reconfigure IJulia in order to switch to a new Julia installation.) I just tried MacOS 10.9 with a fresh Julia 0.2 installation and it worked fine.

cossatot commented 10 years ago

I tried with v0.2.0, both built from source and with the OS X binary downloaded from julialang.org. I still get the same result.

aelg commented 10 years ago

I have the exact same problem on Arch Linux. Julia is built from the AUR (julia-git); IPython is installed from Arch's repositories, version 1.1.0.

Output of versioninfo(true):

Julia Version 0.3.0-prerelease+1316
Commit d77f95a* (2014-01-30 05:22 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: AMD E-350 Processor
  WORD_SIZE: 64
  uname: Linux 3.12.9-1-ARCH #1 SMP PREEMPT Sun Jan 26 09:01:37 CET 2014 x86_64 unknown
Memory: 3.492786407470703 GB (1171.9296875 MB free)
Uptime: 90295.682538381 sec
Load Avg:  1.35059  1.38428  1.26807
AMD E-350 Processor: 
       speed         user         nice          sys         idle          irq
#1  1280 MHz     584845 s      76575 s     162906 s    4465321 s          2 s
#2  1600 MHz     531260 s      79754 s     164430 s     819352 s          0 s

  BLAS: libblas
  LAPACK: liblapack
  LIBM: libm
Environment:
  GLADE_PIXMAP_PATH = :
  TERM = screen
  GLADE_MODULE_PATH = :
  MOZ_PLUGIN_PATH = /usr/lib/mozilla/plugins
  PATH = /usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/aelg/bin:/home/aelg/bin
  JAVA_HOME = /usr/lib/jvm/java-7-openjdk
  HOME = /home/aelg
  GLADE_CATALOG_PATH = :

Package Directory: /home/aelg/.julia
3 required packages:
 - Cairo                         0.2.12
 - Gadfly                        0.1.31
 - IJulia                        0.1.1
23 additional packages:
 - ArrayViews                    0.2.5
 - BinDeps                       0.2.12
 - Blocks                        0.0.1
 - Codecs                        0.1.0
 - Color                         0.2.8
 - Compose                       0.1.24
 - DataArrays                    0.1.1
 - DataFrames                    0.5.1
 - Datetime                      0.1.2
 - Distance                      0.3.0
 - Distributions                 0.3.0
 - GZip                          0.2.7
 - Hexagons                      0.0.1
 - Iterators                     0.1.1
 - JSON                          0.3.3
 - Loess                         0.0.2
 - Nettle                        0.1.3
 - NumericExtensions             0.4.1
 - REPLCompletions               0.0.0
 - SortingAlgorithms             0.0.1
 - StatsBase                     0.3.5
 - URIParser                     0.0.1
 - ZMQ                           0.1.8

I have not tried Julia 0.2.

stevengj commented 10 years ago

Try changing verbose to true in ~/.julia/IJulia/src/IJulia.jl, and see if there is any more information printed to the terminal.

You can also try julia ~/.julia/IJulia/src/kernel.jl to see if just launching the kernel, without connecting to any IPython front-end, dies by itself.
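
A minimal sketch of that standalone check, assuming a POSIX shell. `report_exit` is a hypothetical helper, not part of IJulia; it relies only on the shell convention that a process killed by a signal is reported with an exit status of 128 plus the signal number (SIGSEGV is 11 on Linux and OS X, so a segfault would show up as 139).

```shell
#!/bin/sh
# Hypothetical helper (not part of IJulia): run a command and report how it exited.
report_exit() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        # POSIX shells usually report death-by-signal as 128 + signal number.
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
    return "$status"
}

# Usage (path taken from the comment above; left commented so the
# sketch runs even where julia is not installed):
# report_exit julia ~/.julia/IJulia/src/kernel.jl
```

An ordinary Julia error would typically give a small nonzero status instead, which helps separate a crash in native code from an exception raised in the Julia code.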

aelg commented 10 years ago

No additional info with verbose = true. I then tried running the kernel directly:

$ julia .julia/IJulia/src/kernel.jl
connect ipython with --existing /home/aelg/profile-9762.json
Segfault

Running the same thing with valgrind resulted in:

$ valgrind julia .julia/IJulia/src/kernel.jl 
==9780== Memcheck, a memory error detector
==9780== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==9780== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==9780== Command: julia .julia/IJulia/src/kernel.jl
==9780== 
==9780== Syscall param msync(start) points to unaddressable byte(s)
==9780==    at 0x64030B0: __msync_nocancel (in /usr/lib/libpthread-2.18.so)
==9780==    by 0x6DCA223: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCA743: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCCFBA: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCE261: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCE608: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCAB30: _ULx86_64_step (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x5414DB7: ??? (in /usr/lib/julia/libjulia.so)
==9780==    by 0x5414E22: rec_backtrace (in /usr/lib/julia/libjulia.so)
==9780==    by 0x541539C: jl_throw (in /usr/lib/julia/libjulia.so)
==9780==    by 0x53D48E3: ??? (in /usr/lib/julia/libjulia.so)
==9780==    by 0x53EC430: jl_load_and_lookup (in /usr/lib/julia/libjulia.so)
==9780==  Address 0xffefff000 is not stack'd, malloc'd or (recently) free'd
==9780== 
==9780== Syscall param msync(start) points to uninitialised byte(s)
==9780==    at 0x64030B0: __msync_nocancel (in /usr/lib/libpthread-2.18.so)
==9780==    by 0x6DCA223: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCD006: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCE261: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCE608: ??? (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x6DCAB30: _ULx86_64_step (in /usr/lib/libunwind.so.8.0.1)
==9780==    by 0x5414DB7: ??? (in /usr/lib/julia/libjulia.so)
==9780==    by 0x5414E22: rec_backtrace (in /usr/lib/julia/libjulia.so)
==9780==    by 0x541539C: jl_throw (in /usr/lib/julia/libjulia.so)
==9780==    by 0x53D48E3: ??? (in /usr/lib/julia/libjulia.so)
==9780==    by 0x53EC430: jl_load_and_lookup (in /usr/lib/julia/libjulia.so)
==9780==    by 0x8B23738: julia_blas_vendor1863 (in /usr/lib/julia/sys.so)
==9780==  Address 0xfff000030 is on thread 1's stack
==9780== 
connect ipython with --existing /home/aelg/profile-9780.json
==9780== 
==9780== Process terminating with default action of signal 11 (SIGSEGV)
==9780==  Bad permissions for mapped region at address 0x4170000
==9780==    at 0x4170000: ???
==9780== 
==9780== HEAP SUMMARY:
==9780==     in use at exit: 81,447,523 bytes in 66,362 blocks
==9780==   total heap usage: 1,108,692 allocs, 1,042,330 frees, 1,908,170,676 bytes allocated
==9780== 
==9780== LEAK SUMMARY:
==9780==    definitely lost: 253 bytes in 6 blocks
==9780==    indirectly lost: 192 bytes in 2 blocks
==9780==      possibly lost: 702,177 bytes in 7,511 blocks
==9780==    still reachable: 80,744,901 bytes in 58,843 blocks
==9780==         suppressed: 0 bytes in 0 blocks
==9780== Rerun with --leak-check=full to see details of leaked memory
==9780== 
==9780== For counts of detected and suppressed errors, rerun with: -v
==9780== Use --track-origins=yes to see where uninitialised values come from
==9780== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 2 from 2)
Killed

stevengj commented 10 years ago

@aelg, thanks for looking into this. Unfortunately, the valgrind output is not very helpful because I don't know what Julia code is being evaluated.

The only thing to do may be to start inserting println statements into kernel.jl and IJulia.jl so that you can find out where julia ~/.julia/IJulia/src/kernel.jl is crashing.

malmaud commented 10 years ago

One quick and dirty Julia debugging technique is just to put @show in front of a bunch of statements in kernel.jl.

dotfelixb commented 10 years ago

I had Anaconda's IPython and Julia did not work with it, so I installed IPython using easy_install ipython[all], and now IJulia works nicely. Please try easy_install ipython[all]; it may be the fix you need.

julia> versioninfo(true)
Julia Version 0.3.0-prerelease+1297
Commit 7f1c446* (2014-01-29 06:46 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin12.4.0)
  CPU: Intel(R) Core(TM) i5 CPU M 540 @ 2.53GHz
  WORD_SIZE: 64
  uname: Darwin 12.4.0 Darwin Kernel Version 12.4.0: Wed May 1 17:57:12 PDT 2013; *patched-2050.24.15~1/RELEASE_X86_64 x86_64 i386
Memory: 6.0 GB (2030.3359375 MB free)
Uptime: 5178.0 sec
Load Avg:  0.629395  0.760742  0.763672
Intel(R) Core(TM) i5 CPU M 540 @ 2.53GHz:
       speed         user         nice          sys         idle          irq
#1  2526 MHz       3599 s          0 s       1330 s      46876 s          0 s
#2  2526 MHz       1106 s          0 s        370 s      50323 s          0 s
#3  2526 MHz       3706 s          0 s       1447 s      46647 s          0 s
#4  2526 MHz       1078 s          0 s        362 s      50359 s          0 s

  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm

aelg commented 10 years ago

Seems to be a problem with the heartbeat thread in heartbeat.jl. Everything seems to work if it is never started (I commented out start_heartbeat(heartbeat) on line 59 in IJulia.jl). The segfault happens immediately after that call. No idea what's causing it. Shouldn't the heartbeat thread return void*?

stevengj commented 10 years ago

I just tried Anaconda and IJulia and PyPlot on a fresh MacOS 10.9 machine, and everything went without a hitch. (Make sure you installed 64-bit Anaconda and 64-bit Julia.)

aelg commented 10 years ago

Could it be some version mismatch with ZMQ? I'm on version 4.0, and the call to zmq_device in the heartbeat thread seems to be at least deprecated in that version. I'm not using Anaconda; I installed IPython (version 1.1) via Arch's package manager.
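
One way to check which libzmq the system provides, sketched under the assumption that pkg-config is available and that libzmq was installed with its `libzmq.pc` file. `zmq_system_version` is a hypothetical helper name. It reports the system library only; the ZMQ Julia package may load a different copy, so the result is a hint rather than proof of a mismatch.

```shell
#!/bin/sh
# Hypothetical helper: print the system libzmq version if pkg-config knows it.
# Assumption: libzmq was installed together with its pkg-config file (libzmq.pc).
zmq_system_version() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libzmq; then
        pkg-config --modversion libzmq
    else
        # pkg-config is missing or has no record of libzmq.
        echo "unknown"
    fi
}

zmq_system_version
```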

aelg commented 10 years ago

Pull request https://github.com/JuliaLang/IJulia.jl/pull/142 by @cdsousa fixes the issue for me.