Arbitrary signal values when a submission program commits suicide

Bosco89 commented 11 years ago

When the sandbox is invoked with system call filtering (-f), if the child process tries to call raise(), kill() or tgkill() to kill itself, the sandbox reports the signal argument ("Committed suicide by signal #") and dies, regardless of the value of the signal (which is a 32-bit integer). The signal is then displayed in the submission outcome. A contestant can exploit this issue to reliably get pieces of the input data by using tokens, with little effort.

giomasce commented 11 years ago

This behavior may be automatically fixed with the new sandbox, but I'm not sure.

bblackham commented 11 years ago

The new sandbox does not really help here. Well, in practice it kind of does: it removes a few bits of entropy. For various Linux-internal reasons, the candidate executable cannot deliver any signals to itself (it runs as PID 1 in its own container, which makes it immune to non-kernel-generated signals). But it can still die in at least two distinct ways (SIGSEGV, division by zero), giving one bit of data each time.

If you were determined, you could probably get more entropy by spinning for a certain amount of time, based on the input data.

The limited number of tokens mitigates this on non-public test cases, so I don't think it's a major issue.
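A minimal sketch of the one-bit channel described above, meant only to show why the kind of crash is itself information. It is an illustration under stated assumptions, not a reproduction of the sandbox: the child is a hypothetical stand-in for a submission, it runs outside any container (so it can signal itself with `os.kill()`, which the new sandbox would prevent), and the parent plays the role of whatever reports the outcome.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a submission: it dies by one of two signals
# depending on a single bit of its input. Outside a sandbox, os.kill() on
# itself is the simplest way to reproduce the two crash types.
CHILD = r"""
import os, signal, sys
bit = int(sys.stdin.read().strip())
os.kill(os.getpid(), signal.SIGSEGV if bit else signal.SIGFPE)
"""


def leak_one_bit(secret_bit: int) -> int:
    """Run the child and recover the bit purely from how it died."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=str(secret_bit),
        text=True,
        capture_output=True,
    )
    # subprocess reports death by signal N as returncode -N.
    if proc.returncode == -signal.SIGSEGV:
        return 1
    if proc.returncode == -signal.SIGFPE:
        return 0
    raise RuntimeError(f"unexpected return code {proc.returncode}")


if __name__ == "__main__":
    print([leak_one_bit(b) for b in (0, 1, 1, 0)])
```

Each run costs one evaluation, so the number of bits a contestant can extract is bounded by how many outcomes they are allowed to see.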

lw commented 11 years ago

I haven't looked at the new sandbox yet, nor have I used it in any way, so I'm asking whether it provides a way to reliably detect out-of-memory failures. In my experience with the old sandbox, these usually manifested themselves as a wide range of possible outcomes and signals, depending on how glibc was trying to get that memory: signal 6, signal 11 and some others I can't remember.

If that's the case, then we can do as it's done in the ACM ICPC (I think...), that is, reduce the outcome of the evaluation to just a few states: accepted, wrong answer, time limit exceeded, memory limit exceeded and a generic runtime error for all other errors.
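A minimal sketch of that reduction, to make the idea concrete. All names here (`Outcome`, `classify`, and its parameters) are hypothetical and not taken from CMS or the sandbox; the sketch assumes the sandbox can already tell the caller whether the time or memory limit was hit.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPTED = "accepted"
    WRONG_ANSWER = "wrong answer"
    TIME_LIMIT = "time limit exceeded"
    MEMORY_LIMIT = "memory limit exceeded"
    RUNTIME_ERROR = "runtime error"


def classify(
    timed_out: bool,
    memory_exceeded: bool,
    term_signal: Optional[int],
    exit_code: int,
    output_correct: bool,
) -> Outcome:
    """Collapse raw execution details into one of five coarse states."""
    if timed_out:
        return Outcome.TIME_LIMIT
    if memory_exceeded:
        return Outcome.MEMORY_LIMIT
    # Any signal or non-zero exit code maps to the same state, so the
    # signal number itself never reaches the contestant.
    if term_signal is not None or exit_code != 0:
        return Outcome.RUNTIME_ERROR
    return Outcome.ACCEPTED if output_correct else Outcome.WRONG_ANSWER


if __name__ == "__main__":
    print(classify(False, False, 11, 0, False).value)
    print(classify(False, False, None, 0, True).value)
```

With this shape, signal 6 and signal 11 are indistinguishable in the reported outcome, which is the point of the reduction.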

Sure, we could do it even if we had no way to detect out-of-memory failures, but I'm against it: given the relatively low experience of high school contestants (the "average" users of CMS, for now), we should let them distinguish between two of the most common errors (out-of-memory and segmentation fault), and lumping them both into "runtime error" doesn't help.

In any case it would still be possible to get information about the input data based on time and memory consumption of the program, but I think that's unavoidable.

bblackham commented 11 years ago

In the ptrace-based sandboxes, you could inspect the results of system calls and, for example, see brk() or mmap() returning ENOMEM. If the program subsequently crashed, you could treat this as an out-of-memory condition. mo-box didn't do this, but our Australian one did.
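A rough sketch of that heuristic, written over an already-collected trace rather than a live ptrace session. The `SyscallEvent` record and `looks_like_oom` function are hypothetical, and the sketch assumes the tracer has already translated each result into an errno-style code. That translation is not trivial: on Linux, for instance, the raw brk system call signals failure by returning the unchanged break rather than a negative value, so a real tracer would have to detect that case itself.

```python
import errno
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SyscallEvent:
    name: str                 # e.g. "brk", "mmap", "write"
    error: Optional[int]      # errno-style code, or None on success


# Calls whose failure suggests the process hit its memory limit.
MEMORY_SYSCALLS = {"brk", "mmap"}


def looks_like_oom(trace: Iterable[SyscallEvent], crashed: bool) -> bool:
    """True if a memory call failed with ENOMEM and the program then crashed."""
    saw_enomem = any(
        ev.name in MEMORY_SYSCALLS and ev.error == errno.ENOMEM
        for ev in trace
    )
    return crashed and saw_enomem


if __name__ == "__main__":
    trace = [
        SyscallEvent("mmap", None),
        SyscallEvent("brk", errno.ENOMEM),
    ]
    print(looks_like_oom(trace, crashed=True))
```

A program that checks for allocation failure and exits cleanly would not be flagged, since the heuristic requires a crash after the failed call.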

The new sandbox doesn't have this ability as it doesn't use ptrace. Also, as you mention, there are a number of different ways you can run out of memory. Here are the ways that I have witnessed:

static heap is too big (e.g. massive globals) => kernel's ELF loader makes execve return -ENOMEM
static heap nearly too big => ELF loads, but dynamic linker cannot initialize due to lack of memory => program dies on sig 11 immediately
static heap nearly too big v2 => ELF loads, dynamic linker loads, first printf crashes because it cannot allocate a buffer => sig 11
run out of memory from malloc() => NULL pointer which is normally unchecked and dereferenced by student => sig 11
run out of memory from new (C++) => std::bad_alloc thrown, resulting in signal 6
run out of memory from new (Pascal) => runtime error, printed to stderr, program exits with an interesting exit code
run out of stack memory (Pascal) => runtime error, printed to stderr, program exits with an interesting exit code
run out of stack memory (C/C++) => sig 11, but the only evidence is that your stack pointer no longer points past the top of the stack

You could implement hacks ("strategies") to catch many of these cases (most involve hooking signal handlers, LD_PRELOADing malloc/free, parsing stderr for Pascal, etc.); however, there is a risk of introducing other weird bugs.
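As an illustration of the stderr-parsing strategy for Pascal, here is a minimal sketch. It assumes the Free Pascal compiler, whose runtime prints a line of the form `Runtime error NNN at $ADDRESS` and uses code 202 for stack overflow and 203 for heap overflow; the thread does not say which compiler or codes were involved, so treat both as assumptions. The function name `pascal_memory_error` is hypothetical.

```python
import re
from typing import Optional

# Assumed Free Pascal runtime error codes related to memory exhaustion.
MEMORY_ERROR_CODES = {
    202: "stack",
    203: "heap",
}

RUNTIME_ERROR_RE = re.compile(r"Runtime error (\d+)")


def pascal_memory_error(stderr_text: str) -> Optional[str]:
    """Return "stack" or "heap" if stderr reports a memory-related
    runtime error, otherwise None."""
    match = RUNTIME_ERROR_RE.search(stderr_text)
    if match is None:
        return None
    return MEMORY_ERROR_CODES.get(int(match.group(1)))


if __name__ == "__main__":
    print(pascal_memory_error("Runtime error 203 at $0000000000401234\n"))
    print(pascal_memory_error("Runtime error 200 at $0000000000401234\n"))
```

Since the submission controls its own stderr, a program could print such a line itself, which is one example of the weird behaviour these strategies can introduce.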

bblackham commented 11 years ago

Closing, as the new sandbox has resolved the reported issue, but has potentially gone to the other extreme (see #141).
