sosy-lab / benchexec

BenchExec: A Framework for Reliable Benchmarking and Resource Measurement

Feature Request - Allow Memory Oversubscription #50

Open. MontyCarter opened this issue 8 years ago.

MontyCarter commented 8 years ago

It would be really nice to be able to oversubscribe memory. I realize this causes results to be unreliable; however, in preparation for svcomp it would be nice to utilize our servers more fully.

For example, we have some machines with 32 cores (64 with HT) and 128 GB of RAM. To simulate svcomp, we need each concurrent benchmark to have 15 GB of RAM, which means we can only run 8 benchmarks concurrently. However, when we visually monitor total system memory usage, we rarely see more than around 32 GB in use at a time.

It would be nice to have a mode for benchexec where memory can be oversubscribed. Since it is unlikely that all concurrent benchmarks will peak their memory consumption at the same time, oversubscription would allow us to run more benchmarks concurrently. For example, on this system it would allow 16 concurrent benchmarks to run, each with 15 GB of dedicated memory, despite there being only 128 GB of physical memory.
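
To illustrate the arithmetic behind this (a minimal sketch, not part of BenchExec; the function name and the factor of 2 are just example choices):

```python
def max_parallel_runs(physical_mem_gb, mem_per_run_gb, oversubscription_factor=1.0):
    """Concurrent runs possible if the per-run memory limits may add up
    to oversubscription_factor times the physical memory."""
    return int(physical_mem_gb * oversubscription_factor // mem_per_run_gb)

# The machine from this issue: 128 GB RAM, 15 GB per run.
print(max_parallel_runs(128, 15))        # 8 (no oversubscription)
print(max_parallel_runs(128, 15, 2.0))   # 17 (this issue proposes 16, i.e. doubling the 8)
```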

If it does happen that all physical memory gets used, it would be fine for my purposes if benchexec just aborted. (This is probably preferable to allowing things to continue while utilizing swap space, since this could introduce timeout results that wouldn't be there without memory oversubscription).
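
A minimal sketch of that abort-on-exhaustion idea, assuming a Linux host where /proc/meminfo exposes MemAvailable; the 2 GiB safety margin and the abort callback are hypothetical choices, not BenchExec behavior:

```python
import time

def mem_available_bytes():
    """Read MemAvailable from /proc/meminfo (Linux 3.14+), in bytes."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024  # the value is given in kB
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

def watch_and_abort(abort_callback, min_free_bytes=2 * 1024**3, interval_s=5):
    """Poll available memory and call abort_callback once it drops below
    min_free_bytes (here 2 GiB, an arbitrary safety margin)."""
    while True:
        if mem_available_bytes() < min_free_bytes:
            abort_callback()  # e.g. terminate all runs and exit
            return
        time.sleep(interval_s)
```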

Perhaps this isn't feasible with cgroups, but if the enforcement of the total memory budget were done in benchexec rather than in cgroups, it would be nice to be able to utilize the hardware more fully.

To give an idea of my imagined workflow: we would run lots of benchmarks concurrently as we fix bugs in our tool. Each bug fix warrants re-running the benchmarks so we can ensure it has resolved the issue (and hasn't introduced new bugs). During this time, performance (runtime, memory consumption) isn't particularly important. This is when oversubscription would be useful.

After we freeze bug fixes to the tool, we would want to rerun everything while simulating svcomp more faithfully. At that point performance would be relevant, so we wouldn't want to oversubscribe memory.

MontyCarter commented 8 years ago

By the way, I tried starting two calls to benchexec, each with 8 concurrent benchmarks and 15 GB of memory, but it seems that one execution from each call to benchexec ends up in each cgroup. (That is, 8 cgroup containers, each with 2 jobs, one from each call to benchexec.)

I'm guessing this is because the cgroups are not uniquely named between calls to benchexec (for example, if both calls to benchexec use 8 cgroups named "benchgroup1", "benchgroup2", ..., "benchgroup8", their runs would end up in the same containers)?

If this is the case, then using unique cgroup names between benchexec calls would provide the functionality I'm looking for, as opposed to creating an explicit switch to allow oversubscription.
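
For what it's worth, one way to check this (a sketch independent of BenchExec) is to read /proc/<pid>/cgroup for a process started by each benchexec call and compare the cgroup paths; any path appearing for both processes would mean they share a cgroup:

```python
def cgroups_of(pid):
    """Return the cgroup paths of a process, one per hierarchy,
    as listed in /proc/<pid>/cgroup."""
    paths = []
    with open(f"/proc/{pid}/cgroup") as f:
        for line in f:
            # Each line has the form: hierarchy-ID:controller-list:cgroup-path
            paths.append(line.rstrip("\n").split(":", 2)[2])
    return paths

# Hypothetical PIDs of two tool processes, one per benchexec call:
# print(set(cgroups_of(1234)) & set(cgroups_of(5678)))
```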

PhilippWendler commented 8 years ago

BenchExec does create unique cgroup names and would never put different runs into the same cgroup. If you can reproduce other behavior, please file a separate bug.

For the original proposal, I think it is technically possible to implement with cgroups. I am not convinced that it makes sense, though, if we really abort completely on OOM. This would mean that basically the whole benchmark was wasted.

Another strategy we have implemented in a different system is to start each run with half of its memory limit first, and if it hits the limit, restart this single run with its full limit. This could also be done in BenchExec and has the advantage that it produces exactly the same results as starting all runs with the full limit, but it would significantly complicate the resource assignment and task scheduling in BenchExec.
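
Roughly, that strategy could look like this (a sketch only; execute_run and the out_of_memory field are hypothetical helpers, not existing BenchExec API):

```python
def execute_with_staged_limit(run, full_limit_bytes, execute_run):
    """Start the run with half of its memory limit; if it runs out of
    memory, repeat it once with the full limit. The final result is the
    same as if the full limit had been used from the start."""
    result = execute_run(run, full_limit_bytes // 2)
    if result.out_of_memory:  # hypothetical attribute of the run result
        result = execute_run(run, full_limit_bytes)
    return result
```

The scheduling complication is that the memory for the second attempt has to be reserved again, which would need to be coordinated with all the other runs currently executing.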

I won't have time to implement any of these possibilities for this year's SV-Comp.

Note that on your machine, the number of cores also limits you to 8 parallel runs if you want to use the SV-Comp resource settings (4 cores plus hyper-threading per run). So if you want more parallel runs, you need to reduce the number of cores per run anyway, and you could also consider simply reducing the memory limit (and maybe manually re-execute the OOM results, or just ignore them).
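
To make the arithmetic explicit (a small sketch using the numbers from this thread):

```python
def parallel_runs(total_cores, cores_per_run, total_mem_gb, mem_per_run_gb):
    """Parallel runs are bounded by both the core budget and the memory budget."""
    return min(total_cores // cores_per_run, int(total_mem_gb // mem_per_run_gb))

# 32 physical cores, 4 per run; 128 GB RAM, 15 GB per run -> min(8, 8) = 8
print(parallel_runs(32, 4, 128, 15))
# Reducing only the memory limit does not help, the cores still cap it at 8:
print(parallel_runs(32, 4, 128, 7))
# Reducing both limits allows more parallel runs:
print(parallel_runs(32, 2, 128, 7))   # 16
```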

Running two BenchExec instances in parallel would mean that there is absolutely no way to prevent severely skewed results if you do overbooking. The kernel could start swapping any of the runs, or (if you do not have enough swap) even start killing runs before they reach their memory limit.