hyperboria / bugs

Peer-to-peer IPv6 networking, secure and near-zero-conf.

How to make it easy to get core dumps? #51

Open viric opened 9 years ago

viric commented 9 years ago

I think that in the current state of cjdns, core dumps are very valuable. We hit assertion failures in weird cases that we would like to fix. A core dump keeps a perfect picture of the failing situation with no overhead other than writing a file to disk. Once we have a core dump, the cjdroute user can open it in gdb.

Core dumps contain private information and are only meaningful to those who have the cjdroute binary and the source code it was built from. That usually means that you should not send them to anyone (because of the private information) and that they will be useless to anyone but you (other people don't have your cjdroute binary or the exact source code you built it from). So the main use of core dumps is for the cjdns user to open them in gdb, and then report upstream only backtraces and variable contents.

To get core dumps from cjdroute (on GNU/Linux) we need to consider a few things:

  1. On a typical GNU system, core dumps are disabled by default.
  2. After setuid(), processes usually lose the capability to dump cores.
  3. With chroot/chdir to a root-owned directory, a cjdroute running as nobody will not be able to create a file in that directory.

What I use to overcome all that is:

  1. Before calling cjdroute, from the same shell that will call it, I allow core dumps of any size: ulimit -c unlimited
  2. I enable core dumps after setuid system-wide: echo 1 > /proc/sys/fs/suid_dumpable
  3. Create a directory reachable only by root, but writable by nobody. As root: mkdir ~/cjdns; chown nobody ~/cjdns
  4. Make cjdroute chroot to that instead of the default /var/run. Add { "chroot": "/root/cjdns" } in the "security" cjdroute.conf list.
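
The four steps above could be collected into one script. This is only a minimal sketch, not something shipped with cjdns: the names run, prepare_core_dumps, DUMP_DIR and DRY_RUN are made up for illustration, and sysctl -w fs.suid_dumpable=1 is used in place of the echo into /proc. By default it only prints the commands; it has to run as root with DRY_RUN=0 to actually apply them.

```shell
#!/bin/sh
# Sketch only: DUMP_DIR, DRY_RUN, run and prepare_core_dumps are
# illustrative names and not part of cjdns.
DUMP_DIR="${DUMP_DIR:-/root/cjdns}"   # directory cjdroute will chroot to
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands, 0 = execute them

run() {
    # Show the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

prepare_core_dumps() {
    # Step 1: allow core files of any size in this shell and its children.
    run ulimit -c unlimited
    # Step 2: keep processes dumpable after setuid()
    # (same effect as: echo 1 > /proc/sys/fs/suid_dumpable).
    run sysctl -w fs.suid_dumpable=1
    # Step 3: a directory that "nobody" can write to.
    run mkdir -p "$DUMP_DIR"
    run chown nobody "$DUMP_DIR"
    # Step 4 is a config change: add { "chroot": "<DUMP_DIR>" } to the
    # "security" list in cjdroute.conf, then start cjdroute from this shell.
}

prepare_core_dumps
```

Because ulimit only affects the shell that sets it and that shell's children, the script would have to be sourced (. ./script) or start cjdroute itself for step 1 to reach cjdroute.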

With that, the next time cjdroute crashes, you will see a core file there. You can open it in gdb by running the following (the cjdroute binary must not have changed since the core was dumped):

gdb cjdroute /root/cjdns/core
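
Since only backtraces and variable contents should go upstream, gdb can also be run non-interactively to produce that text. A minimal sketch, assuming gdb is installed; core_backtrace is a made-up helper name, while -batch and -ex are standard gdb options:

```shell
#!/bin/sh
# Sketch only: core_backtrace is an illustrative helper, not part of cjdns.
core_backtrace() {
    binary="$1"   # the exact cjdroute binary that crashed
    core="$2"     # the core file it left behind

    if [ ! -r "$core" ]; then
        echo "core_backtrace: cannot read core file: $core" >&2
        return 1
    fi
    # -batch: exit after running the commands; -ex: run one gdb command.
    # "bt full" prints the backtrace together with local variables.
    gdb -batch -ex "bt full" "$binary" "$core"
}
```

It could be used as: core_backtrace ./cjdroute /root/cjdns/core > backtrace.txt. Local variables may still hold private data, so the output is worth reading through before posting it.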

To check that cores are dumped, kill cjdroute with SIGABRT and see whether the core file is created:

kill -ABRT `pgrep cjdroute`

If it is not, check, for example, that cjdroute is running in the chroot path you wrote in cjdroute.conf:

$ readlink /proc/`pgrep cjdroute`/cwd
/root/cjdns
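
A few more things can be inspected when no core file appears. The sketch below only reads from /proc; core_dump_status is a hypothetical helper name, and it assumes the usual Linux /proc layout. Note that if core_pattern starts with a pipe character, the kernel hands the dump to a program instead of writing a file into the working directory.

```shell
#!/bin/sh
# Sketch only: core_dump_status is an illustrative helper, not part of cjdns.
core_dump_status() {
    pid="$1"
    # Directory where a plain "core" file would be written.
    echo "cwd: $(readlink "/proc/$pid/cwd" 2>/dev/null || echo unavailable)"
    # Effective core size limit of the running process (set via ulimit -c).
    echo "limit: $(grep 'Max core file size' "/proc/$pid/limits" 2>/dev/null || echo unavailable)"
    # 1 means processes stay dumpable after setuid().
    echo "suid_dumpable: $(cat /proc/sys/fs/suid_dumpable 2>/dev/null || echo unavailable)"
    # A value starting with "|" pipes the dump to a program instead of a file.
    echo "core_pattern: $(cat /proc/sys/kernel/core_pattern 2>/dev/null || echo unavailable)"
}
```

For example: core_dump_status "$(pgrep cjdroute)", assuming pgrep finds a single cjdroute process.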

If you restart cjdroute automatically on crash, each new core dump will overwrite the previous one. If that matters to you, you can enable core.PID filenames: echo 1 > /proc/sys/kernel/core_uses_pid
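
With core.PID filenames, several dumps can pile up in the chroot directory. A small sketch for picking the most recent one to pass to gdb; latest_core is a made-up helper name, and it assumes the files keep the default core or core.PID naming:

```shell
#!/bin/sh
# Sketch only: latest_core is an illustrative helper, not part of cjdns.
latest_core() {
    dir="$1"
    # ls -t sorts by modification time, newest first; -d lists the entries
    # themselves; head keeps only the first line.
    # Prints nothing when the directory holds no core files.
    ls -td "$dir"/core* 2>/dev/null | head -n 1
}
```

For example: gdb cjdroute "$(latest_core /root/cjdns)"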

This information about core dumps could be added to the /doc directory of cjdns.

Shnatsel commented 9 years ago

This has been previously implemented for Ubuntu in a rather generic manner. The tool that harvests core dumps is called Apport. It does so in an external process so cjdns itself doesn't have to do anything. It gets you the core dump with a bunch of info on top of it and you can define custom files to attach to the report. The reports are by default private to a small team of developers you choose, since they can contain sensitive information.

The downsides are: it requires us to either use https://launchpad.net/ bug tracker to get core dumps, or to write a custom submission component; and as-is it only works on packages, so you'll get only reports from my repo (https://github.com/Shnatsel/cjdns-ubuntu-pubkey); it is currently Ubuntu- and Debian-specific.

I have prior experience with deploying this tool in production for https://elementaryos.org/. More info on Apport: https://wiki.ubuntu.com/Apport