nodejs / llnode

An lldb plugin for Node.js and V8, which enables inspection of JavaScript states for insights into Node.js processes and their core dumps.

README asks to install a seemingly incompatible lldb version #61

Closed scottinet closed 7 years ago

scottinet commented 7 years ago

Hi!

The current llnode README documentation states that the lldb version to install is 3.8. Well... I copy/pasted the README installation instructions, but all I got when trying to load a coredump is another coredump from lldb.

The coredump I try to load is generated by a process running either under node 4.6.1 or 6.9.1 (I tried both):

$ lldb-3.8 node -c core
(lldb) target create "node" --core "core"
Segmentation fault (core dumped)

Same thing happens with:

Note that gdb has no problem loading my coredump files. Of course I can't do much with gdb. :-)

It's when I stumbled upon this PR that I realized that you yourselves use lldb 3.6. So I tried that version, and now everything works fine.

This issue is more about the README documentation than anything but, if needs be, I will be happy to provide any additional information you need to pinpoint where the problem is.

hhellyer commented 7 years ago

Personally I use lldb-3.9 if I'm using a pre-built version otherwise I usually use the head of the lldb development stream built locally. I've tested with several versions of lldb and so long as the headers match up llnode should work. I'm not sure what happens if you run make install when you have multiple versions of lldb installed - IIRC Ubuntu has packages for lldb 3.6, 3.7, 3.8 and 3.9 available at least and a plugin built for one of those is unlikely to work with any of the others.

If lldb is crashing even without the llnode plugin installed that sounds like an lldb issue. If it's crashing with the plugin installed then there's likely to be a mismatch between the headers for lldb and the installed version of lldb. Could you post the output of lldb -v and the branch name you are using to download the lldb headers? (The command line you use to run gyp_llnode would be useful too.)

If installation is causing a problem it might be worth trying an npm install via: > npm install hhellyer/llnode#npm

I've currently got a PR open for integrating it into the main llnode project and it tries to auto-detect the version of headers to download to match your version of lldb. This would be an interesting test of how well that works!

rnchamberlain commented 7 years ago

Agree the Linux build instructions in the README need improvement - so the header install and the configure commands are more general, not pointing at 3.8 specifically.

We want folks to get to LLDB 3.9 if possible so that the findjsobjects command works without the need to use LLNODE_RANGESFILE and the scripts.

rnchamberlain commented 7 years ago

Also the sudo make install-linux step might be a problem, if there are multiple copies of LLDB installed.

scottinet commented 7 years ago

That's definitely an lldb problem, not an llnode one. As I said, I just wanted to bring to your attention the fact that I was unable to load coredumps with lldb 3.8 and up, for some reason.

I'm currently running on a Debian Jessie docker image, and installed the following packages from lldb nightlies directly (http://apt.llvm.org/):

The result of lldb -v:

$ lldb-3.8 -v
lldb version 3.8.1 ( revision )

Note that I performed the same tests from outside docker, on a Ubuntu Xenial system, with the exact same procedures, and the exact same results.

I'll try lldb 3.9 right away and let you know if that changes anything.

@hhellyer > tried installing your node module, I got this error: Unable to locate lldb binary. llnode installation failed. Issue: when installing from apt on debian or ubuntu, the lldb binary is always named lldb-<version>. I had to create a symlink named lldb to make your branch install work.

hhellyer commented 7 years ago

For Ubuntu 16.10 there are multiple lldb levels (up to 3.9) available in the default Ubuntu repositories. I think only some of these actually link lldb to the lldb-3.x binary. (3.9 doesn't seem to, lldb-3.8 does.) I think it might be worth updating the npm installer to search for the most recent lldb available on Linux. My main issue there is I'm not sure what should win, the default "lldb" version or the highest lldb-3.x version. I might have to add an env var to let the user force it one way or the other if they don't like the default.
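
One way the selection described above could look is sketched here. This is not the llnode installer's code: `findLldbBinaries`, `pickLldb` and the `prefer` value are hypothetical names, and the rule "highest versioned binary wins unless the user asks for the default" is just one possible answer to the open question.

```javascript
// Sketch only: findLldbBinaries, pickLldb and the "prefer" setting are
// hypothetical, not part of the llnode installer.
const fs = require('fs');
const path = require('path');

// List names matching "lldb" or "lldb-<major>.<minor>" found on a PATH string.
function findLldbBinaries(pathVar) {
  const found = new Set();
  for (const dir of pathVar.split(path.delimiter).filter(Boolean)) {
    let entries;
    try {
      entries = fs.readdirSync(dir);
    } catch (err) {
      continue; // missing or unreadable PATH entry
    }
    for (const name of entries) {
      if (/^lldb(-\d+\.\d+)?$/.test(name)) found.add(name);
    }
  }
  return [...found];
}

// prefer === 'default' picks plain "lldb" when it exists; any other value
// picks the highest versioned binary and falls back to plain "lldb".
function pickLldb(names, prefer) {
  const versioned = names
    .filter((n) => /^lldb-\d+\.\d+$/.test(n))
    .sort((a, b) => {
      const [aMaj, aMin] = a.slice(5).split('.').map(Number);
      const [bMaj, bMin] = b.slice(5).split('.').map(Number);
      return bMaj - aMaj || bMin - aMin; // newest first
    });
  const hasDefault = names.includes('lldb');
  if (prefer === 'default' && hasDefault) return 'lldb';
  if (versioned.length > 0) return versioned[0];
  return hasDefault ? 'lldb' : null;
}
```

An installer could call `pickLldb(findLldbBinaries(process.env.PATH || ''), someEnvSetting)` and report a clear error when the result is `null`, where `someEnvSetting` stands for whatever override variable is eventually chosen.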

scottinet commented 7 years ago

Trying to isolate the problem, I created a fresh Debian Jessie server and reran tests:

=> lldb still crashes with a segfault when trying to load a node-generated coredump.

From where I stand, I cannot debug node coredumps with lldb 3.8 and up. It does work fine with lldb 3.6 though. And llnode is tremendously useful, thanks a lot for this project. :-)

Once again, this does not come from llnode, so I'll try to find if there are lldb issues related to this problem. If I find anything, are you interested to know? If not, I'll stop polluting this project with my issues. :-)

rnchamberlain commented 7 years ago

Works OK for me on Ubuntu Xenial and lldb 3.9, using a dump from node --abort-on-uncaught-exception

$ ulimit -c unlimited
$ node --abort-on-uncaught-exception foo
...
Illegal instruction (core dumped)
$ lldb -version
lldb version 3.9.1 ( revision )
$ lldb node -c core
(lldb) target create "node" --core "core"
Core file '/home/rchamberlain/core' (x86_64) was loaded.

I had a look on the lldb bugzilla here https://llvm.org/bugs/ but nothing immediately stands out. It could be something specific in the dump that lldb is failing on. Agree it's an lldb issue not llnode, but yes we would be interested to follow up. Using lldb or gdb to get the native stack trace from the second dump might be a useful next step.

hhellyer commented 7 years ago

We are interested, we've actually put in a few fixes to lldb but I think the latest of those are only in the lldb 4.0 stream, which hasn't been released yet. Although it's not an llnode issue it's something other users may see and I think you might be hitting a similar issue to the one reported in the comments here (also reported today): https://developer.ibm.com/node/2016/08/15/exploring-node-js-core-dumps-using-the-llnode-plugin-for-lldb/

scottinet commented 7 years ago

@rnchamberlain > your example works perfectly fine for me, too.

The problem comes from the way I'm generating core dumps. I need to get them programmatically without killing my project, so I use gcore, which itself uses google coredumper. It seems that, starting from 3.8, lldb cannot load these coredumps without crashing.

Sorry, I should have checked that first.

I'll fix that on my end either by updating google-coredumper in our own modules or by using another way to generate coredumps.

rnchamberlain commented 7 years ago

Ah, we have seen some issues with gcore generated dumps. Though we want those to work too! Also - the gcore npm/google coredumper might produce something a bit different from Linux gcore.

@hhellyer the one reported on your blog has the same stack trace as this bugzilla https://llvm.org/bugs/show_bug.cgi?id=25106

rnchamberlain commented 7 years ago

Reproduced the crash on lldb 3.9 using a core dump generated by the gcore npm module, as above. Some lldb logging shows that it is finding threads ok but then hits an assert in Plugins/Process/elf-core/ThreadElfCore.cpp:

(lldb) log enable lldb process thread
(lldb) target create "node" --core "core.1234"
....
elf-core::CreateRegisterContextForFrame:: Architecture(27) or OS(0) not supported
Segmentation fault (core dumped)

The google coredumper in the node-gcore project is dated Mar 2015, so reasonably recent. We do know that the lldb code was changed after 3.6 to be more picky about the elf core headers.

hhellyer commented 7 years ago

Unfortunately I've tried the gcore npm on two Linux VMs (Ubuntu and Fedora) and it doesn't work on either. You could try generating a core using the gcore program (which comes with gdb) with something like: child_process.execFileSync('/usr/bin/gcore', [process.pid]).toString(); instead. lldb 4.0 should be able to open that.
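
A slightly fuller sketch of the gcore suggestion above, for anyone who wants to copy it. The helper names `gcoreArgs` and `dumpCoreWithGcore` and the default prefix are hypothetical; the gcore path is the one from the comment and may differ on your system. It passes the pid as a string and uses gcore's `-o` option, which writes the dump to `<prefix>.<pid>`.

```javascript
// Sketch only: gcoreArgs and dumpCoreWithGcore are hypothetical helpers.
const { execFileSync } = require('child_process');

// Build the argument list for gdb's gcore: "gcore -o <prefix> <pid>".
function gcoreArgs(pid, prefix) {
  if (!Number.isInteger(pid) || pid <= 0) {
    throw new TypeError('pid must be a positive integer');
  }
  return ['-o', prefix, String(pid)];
}

// Run gcore against a process (by default the current one) and return the
// expected dump file name together with gcore's own output.
function dumpCoreWithGcore(pid = process.pid, prefix = 'core', gcorePath = '/usr/bin/gcore') {
  const output = execFileSync(gcorePath, gcoreArgs(pid, prefix)).toString();
  return { file: `${prefix}.${pid}`, output };
}
```

Calling `dumpCoreWithGcore()` blocks until gcore has finished writing, and the process keeps running afterwards. gcore attaches through ptrace, so on systems that restrict ptrace the call may fail until that restriction is relaxed.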

scottinet commented 7 years ago

I agree, and since Google coredumper doesn't seem to have been maintained since 2008, problems will only get worse. I've done some research, and I couldn't find a more portable way to get a coredump programmatically than using gcore either, so that's what I'll do.

I'll also raise an issue on the npm/gcore project to prevent other people using it if they plan to use llnode.

Thanks a lot for your help.

hhellyer commented 7 years ago

@scottinet - Did you manage to get a core dump you could open in lldb by exec'ing gcore? Are you able to investigate your original problem now?

(Linux doesn't have a core dump API so we do end up using workarounds like running gcore.)

With regards to the original problem would it make sense to update the README when we put the npm up? Hopefully that'll become the standard install process and we'll need to change the docs then anyway.

scottinet commented 7 years ago

Well... no. :-(

I was able to get a coredump just fine, after deactivating ptrace protection.

But when I try to load it, lldb hangs indefinitely (3.8 and 3.9), and lldb-4.0 segfaults:

$ lldb-4.0 $(which lldb-4.0) -c core
(lldb) target create "/usr/bin/lldb-4.0" --core "core"
Core file '/home/scottinet/git/kuzzle/dump/20161209-1317-cli/core' (i386) was loaded.
(lldb) bt
* thread #1, name = 'lldb-4.0', stop reason = signal SIGSEGV
  * frame #0: 0xf59a1cf4 liblldb-4.0.so.1`___lldb_unnamed_symbol5938$$liblldb-4.0.so.1 + 36
    frame #1: 0xf5e307cf liblldb-4.0.so.1`___lldb_unnamed_symbol18166$$liblldb-4.0.so.1 + 863
    frame #2: 0xf5e330cc liblldb-4.0.so.1`___lldb_unnamed_symbol18180$$liblldb-4.0.so.1 + 348

On the other hand, lldb-3.6 works perfectly with a gdb/gcore generated coredump.

scottinet commented 7 years ago

About the README file, I don't have an opinion except, maybe, making it clearer what LLDB version is used (for instance using env variables). Using NPM is fine by me too. Either way, I found it easy and straightforward to install llnode.

hhellyer commented 7 years ago

The hang on 3.8 and 3.9 was this: https://llvm.org/bugs/show_bug.cgi?id=26322

I've not hit a crash with 4.0 though I've been using builds out of my local workspace. Are you running the 32 bit version of lldb? I've no idea if that would make a difference but I've never tried that and it's possible there might be an issue if your original core dump from node is from a 64 bit process.
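
A quick way to check the 32 bit versus 64 bit question for both the lldb binary and the core file is to read the ELF class byte (offset 4 of the file: 1 means 32 bit, 2 means 64 bit). The helper names below are hypothetical; this is only a sketch and looks at nothing beyond the first five bytes.

```javascript
// Sketch only: elfBits and elfBitsOfFile are hypothetical helpers.
const fs = require('fs');

// Return 32 or 64 from the ELF identification bytes, or null if not ELF.
function elfBits(header) {
  const isElf = header.length >= 5 &&
    header[0] === 0x7f && header[1] === 0x45 && // 0x7f, 'E'
    header[2] === 0x4c && header[3] === 0x46;   // 'L', 'F'
  if (!isElf) return null;
  if (header[4] === 1) return 32; // ELFCLASS32
  if (header[4] === 2) return 64; // ELFCLASS64
  return null;
}

// Read the first five bytes of a file and classify it.
function elfBitsOfFile(file) {
  const fd = fs.openSync(file, 'r');
  try {
    const header = Buffer.alloc(5);
    const bytesRead = fs.readSync(fd, header, 0, 5, 0);
    return elfBits(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

Comparing `elfBitsOfFile('/usr/bin/lldb-4.0')` with `elfBitsOfFile('core')` shows whether the debugger and the dump have the same word size.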

scottinet commented 7 years ago

Oh god, didn't think of that. And you're right: llvm 4.0 toolchain seems to be only available as a package for i386 right now. I'll try to compile it for amd64 manually but I won't spend much time on this, and I'll probably keep an eye and grab it again once it's available in 64 bits.

indutny commented 7 years ago

Closing this, please re-open if relevant.

scottinet commented 7 years ago

Just a follow-up: everything works fine using lldb 4.0 x64

bioball commented 7 years ago

I'm trying to do the same thing. I have a core dump generated using gcore, but I'm unable to open it in lldb-3.9; it just hangs.

Scott, how did you install lldb-4.0? Seems like it's not available for Ubuntu 14.04. And is there a way to generate a core file that's compatible for lldb-3.9?

hhellyer commented 7 years ago

LLDB was unable to open cores generated via gcore. This was fixed (see above) but I think the fix went into lldb-4.0 not lldb-3.9. A core file generated by a crash (SEGV etc) is still readable. If you don't mind terminating the running application you might want to try killing the process using a signal that will generate a core dump.

I tend to use kill -24 <pid> (SIGXCPU) just because it's unlikely anyone is handling it. Make sure that the ulimit for core dumps is set to something large enough to allow you to create a core dump. If you want to trigger the core dump from within your node program just do: child_process.exec(`kill -24 ${process.pid}`);
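
The same idea can be sketched without spawning kill, using Node's own `process.kill`, together with a check of the core size limit first. `coreLimitAllowsDump` and `abortWithCoreDump` are hypothetical names, and the check only verifies that the limit is non-zero, not that it is large enough for your process.

```javascript
// Sketch only: coreLimitAllowsDump and abortWithCoreDump are hypothetical.
const { execSync } = require('child_process');

// Interpret the output of `ulimit -c`: "unlimited" or a block count.
function coreLimitAllowsDump(ulimitOutput) {
  const value = ulimitOutput.trim();
  if (value === 'unlimited') return true;
  const blocks = Number.parseInt(value, 10);
  return Number.isFinite(blocks) && blocks > 0;
}

// Terminate the current process with SIGXCPU so the kernel writes a core
// dump, assuming nothing in the program handles that signal.
function abortWithCoreDump() {
  // The child shell inherits this process's resource limits.
  const limit = execSync('ulimit -c', { shell: '/bin/sh' }).toString();
  if (!coreLimitAllowsDump(limit)) {
    throw new Error(`core size limit is ${limit.trim()}; run "ulimit -c unlimited" before starting node`);
  }
  process.kill(process.pid, 'SIGXCPU');
}
```

Unlike the gcore approach, calling `abortWithCoreDump()` ends the process, so it only fits cases where terminating the application is acceptable.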

rnchamberlain commented 7 years ago

@bioball It should be possible to get lldb-4.0 for Ubuntu 14.04 ('trusty') - it is listed on http://apt.llvm.org/. I put some lldb install instructions here: https://developer.ibm.com/node/2016/09/27/advances-in-core-dump-debugging-for-node-js/

You should just need to change the apt-add-repository line to this: sudo apt-add-repository "deb http://apt.llvm.org/trusty/ llvm-toolchain-trusty-4.0 main"

The hang was fixed by @hhellyer last year, see https://reviews.llvm.org/D26676, but as he says, maybe that fix did not get backported to lldb 3.9.

bnoordhuis commented 7 years ago

Didn't realize it was Howard who fixed that. Nice work, @hhellyer!

bioball commented 7 years ago

@rnchamberlain Ah! Turns out what was happening is that I had a messed up sources.list file. When trying to install lldb-4.0, I had added the wrong debian repository.

Just installed lldb-4.0, and everything works great now. Thanks!