JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Current master does not start on Mac OS 10.10 #9380

Closed dpsanders closed 9 years ago

dpsanders commented 9 years ago

Julia 0.4 was previously working. A git pull and recompile today gives the error

10:16 $ julia-dev
Abort trap: 6

A fresh git clone of master also gives this.

Running in lldb gives

10:14 $ lldb julia
(lldb) target create "julia"
Current executable set to 'julia' (x86_64).
(lldb) run
Process 79877 launched: '/Users/dsanders/development/julia-dev/julia/julia' (x86_64)
Process 79877 stopped
* thread #1: tid = 0x178f5e, 0x00007fff8e544282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00007fff8e544282 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill + 10:
-> 0x7fff8e544282:  jae    0x7fff8e54428c            ; __pthread_kill + 20
   0x7fff8e544284:  movq   %rax, %rdi
   0x7fff8e544287:  jmp    0x7fff8e53fca3            ; cerror_nocancel
   0x7fff8e54428c:  retq   

System info from Julia 0.3:

julia> versioninfo()
Julia Version 0.3.4-pre+47
Commit f3c3551* (2014-12-10 05:25 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.0.0)
  CPU: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

jakebolewski commented 9 years ago

Perhaps a make distclean is in order? I just built the latest master from a fresh pull on the same OS and hardware, and it worked fine. Feel free to re-open if you still encounter problems.

ivarne commented 9 years ago

> A fresh git clone of master also gives this.

I think that would be even better than a distclean. It works on OSX for me though, so there is probably something funny going on. cc: @staticfloat

(Also, non-collaborators can't reopen issues when a collaborator closed them, so we should be careful and rather suggest that they comment, and we will reopen.)

tkelman commented 9 years ago

@dpsanders do you know how to do git bisect?

staticfloat commented 9 years ago

If you could build a debug build (make debug) and then get a backtrace from within lldb (that is, once you've crashed Julia, run bt from within lldb and post the output), that would be very helpful. You'll need to run the julia-debug executable from within lldb.

dpsanders commented 9 years ago

The debug build works perfectly...! There is a difference in that I built it with only 1 thread. I will try a normal build with a single thread to see if that helps.

I am not too familiar with git bisect, and since, in any case, there are several reports from other people who do not see the same problem, I'll look for other solutions first.

dpsanders commented 9 years ago

The single-thread standard build gives the same Abort trap: 6 error... Looks like git bisect is on the cards...

staticfloat commented 9 years ago

A backtrace from the non-debug build would also be helpful.

skumagai commented 9 years ago

I am in the same boat: make debug generates a working julia, but make does not. I ran make distcleanall beforehand but didn't do a fresh clone of the repository.

Backtrace from broken master gives:

➜  julia git:(master) git log --oneline -n 1 | cat
59b6080 Merge pull request #9346 from sbromberger/v4-constructors
➜  julia git:(master) lldb ./julia                                                                                                          
(lldb) target create "./julia"
Current executable set to './julia' (x86_64).
(lldb) run
Process 98311 launched: './julia' (x86_64)
Process 98311 stopped
* thread #1: tid = 0x3c99e0, 0x00007fff8831a282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00007fff8831a282 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill + 10:
-> 0x7fff8831a282:  jae    0x7fff8831a28c            ; __pthread_kill + 20
   0x7fff8831a284:  movq   %rax, %rdi
   0x7fff8831a287:  jmp    0x7fff88315ca3            ; cerror_nocancel
   0x7fff8831a28c:  retq   
(lldb) bt
* thread #1: tid = 0x3c99e0, 0x00007fff8831a282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff8831a282 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff92e2c4c3 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fff8ed6ac13 libsystem_c.dylib`__abort + 145
    frame #3: 0x00007fff8ed6b4f1 libsystem_c.dylib`__stack_chk_fail + 197
    frame #4: 0x000000010006d58a libjulia.dylib`jl_get_system_image_cpu_target(fname=<unavailable>) + 154 at dump.c:1376
    frame #5: 0x0000000100067abf libjulia.dylib`_julia_init(rel=<unavailable>) + 399 at init.c:884
    frame #6: 0x0000000100068a6d libjulia.dylib`julia_init(rel=<unavailable>) + 13 at task.c:275
    frame #7: 0x0000000100001f55 julia`main(argc=0, argv=0x00007fff5fbff738) + 69 at repl.c:355
    frame #8: 0x00000001000018f4 julia`start + 52

Commit dab8a3503abf4f75b621936df5f0083ad878d316 introduced this breakage (bisected).

My machine is a fairly old MBP from mid-2010. System info from the last working commit:

julia> versioninfo()
Julia Version 0.4.0-dev+2141
Commit f167d25 (2014-12-14 23:50 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.0.0)
  CPU: Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Penryn)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

staticfloat commented 9 years ago

This is officially my fault then. :) This is due to my build sysimg work.

staticfloat commented 9 years ago

@dpsanders is your mac older as well? Perhaps there's something about older hardware that's causing an issue. I did change how Julia executables interact with differing cpu instruction sets, so that might be what's causing the issue here.

ViralBShah commented 9 years ago

10.10 won't run on Macs that are too old.

dpsanders commented 9 years ago

I have a mid-2012 retina MacBook Pro. I'm not sure where to get information about the chipset?

Here's the backtrace:

09:45 $ lldb ./julia 
(lldb) target create "./julia"
Current executable set to './julia' (x86_64).
(lldb) run
Process 33023 launched: './julia' (x86_64)
Process 33023 stopped
* thread #1: tid = 0xc3361, 0x00007fff8cd48282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00007fff8cd48282 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill + 10:
-> 0x7fff8cd48282:  jae    0x7fff8cd4828c            ; __pthread_kill + 20
   0x7fff8cd48284:  movq   %rax, %rdi
   0x7fff8cd48287:  jmp    0x7fff8cd43ca3            ; cerror_nocancel
   0x7fff8cd4828c:  retq   
(lldb) bt
* thread #1: tid = 0xc3361, 0x00007fff8cd48282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff8cd48282 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff8754f4c3 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fff8e3fdc13 libsystem_c.dylib`__abort + 145
    frame #3: 0x00007fff8e3fe4f1 libsystem_c.dylib`__stack_chk_fail + 197
    frame #4: 0x000000010006cd8a libjulia.dylib`jl_get_system_image_cpu_target(fname=<unavailable>) + 154 at dump.c:1376
    frame #5: 0x00000001000672bf libjulia.dylib`_julia_init(rel=<unavailable>) + 399 at init.c:884
    frame #6: 0x000000010006826d libjulia.dylib`julia_init(rel=<unavailable>) + 13 at task.c:275
    frame #7: 0x0000000100001f55 julia`main(argc=0, argv=0x00007fff5fbffb58) + 69 at repl.c:355
    frame #8: 0x00000001000018f4 julia`start + 52

dpsanders commented 9 years ago

Oh, sorry, didn't see that the backtrace had already been posted.

skumagai commented 9 years ago

Well... both machines (mid-2010 and mid-2012) with this problem are recent enough to run 10.10. Lifting from Apple's support page, 10.10 supports:

iMac (Mid-2007 or newer)
MacBook (Late 2008 Aluminum, or Early 2009 or newer)
MacBook Pro (Mid/Late 2007 or newer)
MacBook Air (Late 2008 or newer)
Mac mini (Early 2009 or newer)
Mac Pro (Early 2008 or newer)
Xserve (Early 2009)

staticfloat commented 9 years ago

Right, we've already established that this is running on 10.10. ;)

Could you guys please apply this patch, rebuild, and post the output? To apply the patch, you can just run the following from within the main Julia directory:

$ curl https://gist.githubusercontent.com/staticfloat/9d5e3caeba8cb4d66e14/raw/995b71cc427f080374f41fec8d2d84d20a0df46a/dump.c.diff | patch -p1

skumagai commented 9 years ago

Sorry for being late. Here is the requested output after make clean && make:

➜  julia git:(master) ✗ ./usr/bin/julia 
Entering jl_get_system_image_cpu_target()...
  strlen(fname): 48
  fname: /Users/sk130/Projects/julia/usr/lib/julia/sys.ji
  fname_shlib: /Users/sk130/Projects/julia/usr/lib/julia/sys
  jl_sysimg_cpu_target: native
[2]    27044 abort      ./usr/bin/julia

and after make clean && make debug:

➜  julia git:(master) ✗ ./usr/bin/julia-debug   
Entering jl_get_system_image_cpu_target()...
  strlen(fname): 48
  fname: /Users/sk130/Projects/julia/usr/lib/julia/sys.ji
  fname_shlib: /Users/sk130/Projects/julia/usr/lib/julia/sys
  jl_sysimg_cpu_target: native
(snip)
julia>

Seems identical.
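
A note on reading the traces: the backtraces earlier in the thread end in __stack_chk_fail, which the compiler's stack protector calls when it finds its canary overwritten, that is, when something has written past the end of a stack buffer, here inside jl_get_system_image_cpu_target. The output above shows that function deriving fname_shlib from fname by dropping the extension. The thread does not show the offending code, so the following is only a sketch of a bounds-safe way to do that step. strip_extension is a hypothetical helper, not Julia's actual function, and the choice of a heap buffer is an assumption made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Julia's actual code): return a heap-allocated
 * copy of fname with its final extension removed, for example
 * ".../sys.ji" becomes ".../sys". The caller frees the result.
 * Returns NULL if the allocation fails. */
static char *strip_extension(const char *fname)
{
    assert(fname != NULL);

    size_t len = strlen(fname);

    /* len + 1 leaves room for the terminating NUL. A buffer of only len
     * bytes would be overrun by one byte when the string is copied in,
     * which is the kind of write a stack protector can detect. */
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, fname, len + 1);

    /* Cut at the last '.' only if it lies in the final path component,
     * so a dot in a directory name is left alone. */
    char *dot = strrchr(out, '.');
    char *slash = strrchr(out, '/');
    if (dot != NULL && (slash == NULL || dot > slash))
        *dot = '\0';

    return out;
}
```

Called with the 48-character fname printed above, this returns the 45-character fname_shlib value shown in the same output. Whether a missing byte for the NUL was the actual cause here is not confirmed anywhere in the thread.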

staticfloat commented 9 years ago

What make vars are you guys using? Nothing? Just straight make?

skumagai commented 9 years ago

I'm using make without any argument. Should I set something?

skumagai commented 9 years ago

Commit 605c36323 seems to fix the problem. From that commit onward, make clean && make generates a working julia.

tkelman commented 9 years ago

@dpsanders can you confirm that?

lindahua commented 9 years ago

I just built this on my Mac OS X 10.10.1 from a fresh clone, and it works.

tkelman commented 9 years ago

@lindahua thanks, but unless you were having issues between dab8a3503abf4f75b621936df5f0083ad878d316 and 605c36323486581f4a3016b8a165d6e28e9157f4, that's not too helpful, as this appears to happen only on specific hardware. I would really like to hear from @dpsanders, though, since so far our sample size is 2 and we've only heard that it's confirmed fixed for 1 of those. This is important to get right for a PR we intend to backport for the imminent 0.3.4, #9376. It would be awesome if @skumagai and any others who had seen this problem on master could also test that PR.

dpsanders commented 9 years ago

Sorry for the delay in replying -- for some reason I didn't see the latest messages by email.

I confirm that a fresh clone now compiles and runs correctly -- many thanks!

tkelman commented 9 years ago

Excellent. Closing the issue then.