Closed sebastiang closed 8 years ago
Strange that this should be caught in jl_throw, since the latest fix should prevent julia from catching signals in 0.4; are you using the latest from master or the latest release?
The latest release. I will try latest from master.
I've found the difference in provisioning. There was a script I wasn't running on the putative 'production' machines which did another incantation of apt-get update after adding some sources to the list. It was there to get a build of Chrome for dev machines, but I imagine it must have subtly changed which libraries were being linked. I'll work to get to the bottom of it.
Running with gdb and getting a trace on the error suggests a stack overflow. I bet the error is something on my side -- something not deployed correctly to my target machine. But whatever the problem is isn't shown to me, because the calls that would report the error are themselves infinitely recurring.
#0 0x00007ffff6b9dc87 in _IO_vfprintf_internal (s=s@entry=0x7ffff4d8c6d0, format=<optimized out>,
format@entry=0x7ffff620919c "could not open file %s", ap=ap@entry=0x7ffff4d8c858) at vfprintf.c:1777
#1 0x00007ffff6bc42a3 in _IO_vasprintf (result_ptr=result_ptr@entry=0x7ffff4d8c800, format=format@entry=0x7ffff620919c "could not open file %s",
args=args@entry=0x7ffff4d8c858) at vasprintf.c:62
#2 0x00007ffff588368b in jl_vexceptionf (exception_type=0x7ffded9218d0, fmt=fmt@entry=0x7ffff620919c "could not open file %s",
args=args@entry=0x7ffff4d8c858) at builtins.c:56
#3 0x00007ffff5883b98 in jl_errorf (fmt=fmt@entry=0x7ffff620919c "could not open file %s") at builtins.c:73
#4 0x00007ffff58e4e26 in jl_load (fname=0x7ffdef581cb0 "/MyApp/node_modules/node-julia/lib/nj.jl", len=45) at toplevel.c:612
#5 0x00007fffee6dce80 in julia_include_680 () at boot.jl:261
#6 0x00007ffff587c16b in jl_apply (nargs=1, args=0x7ffff4d8ca90, f=<optimized out>) at julia.h:1328
#7 jl_apply_generic (F=0x7ffdef37d370, args=0x7ffff4d8ca90, nargs=<optimized out>) at gf.c:1684
#8 0x00007ffff58e7520 in jl_apply (nargs=1, args=0x7ffff4d8ca90, f=<optimized out>) at julia.h:1328
#9 jl_call1 (f=0x7ffdef37d370, a=0x7ffdee54e760) at jlapi.c:155
#10 0x00007ffff690349f in nj::Kernel::load() () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#11 0x00007ffff6904585 in nj::Kernel::invoke(std::string const&, _jl_value_t*, _jl_value_t*) ()
from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#12 0x00007ffff69046b0 in nj::Kernel::getError(_jl_value_t*, _jl_value_t*) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#13 0x00007ffff690f01f in nj::genJuliaError(_jl_value_t*) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#14 0x00007ffff6911f2e in nj::getJuliaException(_jl_value_t*) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#15 0x00007ffff6903608 in nj::Kernel::load() () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#16 0x00007ffff6904585 in nj::Kernel::invoke(std::string const&, _jl_value_t*, _jl_value_t*) ()
from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#17 0x00007ffff69046b0 in nj::Kernel::getError(_jl_value_t*, _jl_value_t*) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#18 0x00007ffff690f01f in nj::genJuliaError(_jl_value_t*) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#19 0x00007ffff6911f2e in nj::getJuliaException(_jl_value_t*) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#20 0x00007ffff6903608 in nj::Kernel::load() () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#21 0x00007ffff6904585 in nj::Kernel::invoke(std::string const&, _jl_value_t*, _jl_value_t*) ()
from /MyApp/build/node_modules/node-julia/build/Release/nj.node
The cycle (e.g. frames #17 to #21) repeats for as long as I'm willing to page through the results of backtrace.
(gdb) thread apply all bt 8
Thread 3 (Thread 0x7ffff4d8a700 (LWP 8421)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007ffff7704cdc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007ffff68fca1d in JMain::asyncQueueGet() () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#3 0x00007ffff690815f in Trampoline::operator()() () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#4 0x00007ffff7709e40 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff6f1f182 in start_thread (arg=0x7ffff4d8a700) at pthread_create.c:312
#6 0x00007ffff6c4c47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 2 (Thread 0x7ffff558b700 (LWP 8420)):
#0 0x00007ffff6b9dc87 in _IO_vfprintf_internal (s=s@entry=0x7ffff4d8c6d0, format=<optimized out>,
format@entry=0x7ffff620919c "could not open file %s", ap=ap@entry=0x7ffff4d8c858) at vfprintf.c:1777
#1 0x00007ffff6bc42a3 in _IO_vasprintf (result_ptr=result_ptr@entry=0x7ffff4d8c800, format=format@entry=0x7ffff620919c "could not open file %s",
args=args@entry=0x7ffff4d8c858) at vasprintf.c:62
#2 0x00007ffff588368b in jl_vexceptionf (exception_type=0x7ffded9218d0, fmt=fmt@entry=0x7ffff620919c "could not open file %s",
args=args@entry=0x7ffff4d8c858) at builtins.c:56
#3 0x00007ffff5883b98 in jl_errorf (fmt=fmt@entry=0x7ffff620919c "could not open file %s") at builtins.c:73
#4 0x00007ffff58e4e26 in jl_load (fname=0x7ffdef581cb0 "/MyApp/node_modules/node-julia/lib/nj.jl", len=45) at toplevel.c:612
#5 0x00007fffee6dce80 in julia_include_680 () at boot.jl:261
#6 0x00007ffff587c16b in jl_apply (nargs=1, args=0x7ffff4d8ca90, f=<optimized out>) at julia.h:1328
#7 jl_apply_generic (F=0x7ffdef37d370, args=0x7ffff4d8ca90, nargs=<optimized out>) at gf.c:1684
(More stack frames follow...)
Thread 1 (Thread 0x7ffff7fea780 (LWP 8416)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007ffff7704cdc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007ffff68fbe0d in JMain::syncQueueGet() () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#3 0x00007ffff693ab99 in doImport(v8::FunctionCallbackInfo<v8::Value> const&) () from /MyApp/build/node_modules/node-julia/build/Release/nj.node
#4 0x00000000007b8e62 in v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) ()
#5 0x00000000007d9af1 in ?? ()
#6 0x00002486aa5060a2 in ?? ()
#7 0x00002486aa506001 in ?? ()
(More stack frames follow...)
It would appear I have to build the latest source with JL_OPTIONS_HANDLE_SIGNALS_OFF defined?
That value is defined in julia.h but only in version 0.4+, thus the ifdef.
The infinite recursion usually stems from some error when loading nj.jl, since that file acts as a generic error processor: if there's an error in nj.jl, then processing that error loads nj.jl, which causes an error, and so on. Threads 1 and 3 each appear to be blocked in cond_wait, doing nothing except waiting on a result; thread 2 is where all the action is. If npm install succeeded, then lib/nj.jl should exist, but like you said, maybe something about this install is messed up?
Is the apt-get issue resolved? I've seen lots of problems with Ubuntu's apt-get install node because it puts node in /usr/bin/node and it's way too old; then both n (and probably nvm) put it in /usr/local/bin, and there are weird circumstances where both versions of node end up getting used simultaneously (especially by node-gyp).
It's a little hard to figure out what's going on, but it appears that on a tightened-up environment I get UVError exceptions thrown when calling import through node-julia if any pre-compilation needs to be carried out. If I arrange for all precompilation to happen out of band, then my app starts up properly.
I assume this is because of a failure to spawn a child julia instance to do the compilation, but just why, or why the error isn't cleanly conveyed, is somewhat beyond me.
I also get a hang when a binary dependency is missing (a bad deployment on my part), but obviously triggering the stack overflow is problematic; we can't see the underlying error.
I'll close this as I have a workaround and can't contribute enough context to reliably reproduce.
I am trying to figure out why an import of a Julia module works on one machine but hangs on another. While I hunt down what differences might exist between the machine images -- each provisioned the same way -- I thought I'd post the stack trace to see if it inspires any ideas. Node 0.12.7, julia v0.4-rc2. I start the app, it hangs on an import statement, and after a while I CTRL+C. If I do it quickly, I just get a plain segfault. If I wait a while, I get something like this.