cc-tweaked / Cobalt

A re-entrant fork of LuaJ

Memory limits via allocation sampling #66

Open SquidDev opened 1 year ago

SquidDev commented 1 year ago

One of Cobalt's weaker points is that it does not impose any limits on the amount of memory the VM can use. Ideally CC: Tweaked would switch over to a more native-style VM which does support this (see https://github.com/cc-tweaked/CC-Tweaked/issues/769), but I think that's a long way away.

Unfortunately, it is impractical to track every single allocation - this would make the implementation significantly more complex, and incur a massive overhead.

One alternative idea, inspired by this OCaml package (though perhaps obvious in retrospect), is to monitor a small sample of our allocations and estimate actual memory usage from those. To simplify things further, I propose we only track array allocations: actual memory usage will be higher than our estimate, but it should still be bounded by some constant factor (2-3x?).
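A minimal sketch of what such a sampling estimator could look like (all names here are hypothetical, not Cobalt's actual API): record only one in every N array allocations, then scale the sampled byte count back up.

```java
// Hypothetical sketch of sampling-based memory estimation: record only
// one in every SAMPLE_RATE array allocations, then scale the sampled
// byte count back up to estimate the total. Names are illustrative,
// not Cobalt's actual API.
final class AllocationSampler {
    private static final int SAMPLE_RATE = 64;

    private long sampledBytes; // bytes seen in sampled allocations
    private long counter;      // allocations since the last sample

    /** Called on every array allocation with its approximate size in bytes. */
    void onAllocate(long bytes) {
        if (++counter >= SAMPLE_RATE) {
            counter = 0;
            sampledBytes += bytes; // only this 1-in-64 path pays any cost
        }
    }

    /** Scale the sample back up to an estimate of total allocated bytes. */
    long estimatedBytes() {
        return sampledBytes * SAMPLE_RATE;
    }
}
```

With uniform allocation sizes this counting scheme is exact; in practice one would likely sample proportionally to allocation size (as the OCaml approach does), so that a single huge allocation cannot slip through unsampled.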

Implementation

Concerns

The main concern here is that this is heavily tied to Java's GC. It's possible for the Lua VM to no longer hold a reference to a large object, but for the GC not to have collected it yet, so `currentMemory` remains large.

It might be safer to set the max memory to something arbitrarily high (1GiB?) and expose the memory usage via a metric. This way we can get a better idea of the current behaviour before doing anything more drastic.
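That "generous cap plus metric" approach might look something like the following sketch (the names, the 1 GiB figure, and the exception type are all illustrative stand-ins, not Cobalt's actual API):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the "arbitrarily high limit + metric" approach:
// track the estimated usage, expose it for monitoring, and only fail
// allocations once a very generous cap is exceeded.
final class MemoryMeter {
    private static final long MAX_MEMORY = 1L << 30; // 1 GiB, deliberately generous

    private final AtomicLong currentMemory = new AtomicLong();

    /** Record an allocation estimate; fails once the cap is exceeded. */
    void record(long estimatedBytes) {
        long now = currentMemory.addAndGet(estimatedBytes);
        if (now > MAX_MEMORY) {
            currentMemory.addAndGet(-estimatedBytes); // roll back the failed allocation
            throw new IllegalStateException("out of memory"); // stand-in for a Lua error
        }
    }

    /** Record a release, e.g. when the GC reports a sampled object dead. */
    void release(long estimatedBytes) {
        currentMemory.addAndGet(-estimatedBytes);
    }

    /** Current estimate, suitable for exporting as a metric gauge. */
    long current() {
        return currentMemory.get();
    }
}
```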

SquidDev commented 1 year ago

An alternative approach would be to use JVMTI's allocation sampling to handle this for us. I think this is probably a bad idea (it requires native code, and even if we only enable it for CC's threads, I'm not sure what performance impact it has on the rest of the VM[^1]), but it is worth mentioning at least.

[^1]: The JEP mentions a 1% performance overhead with an empty handler, rising to 3% for a handler which tracks each allocated object. That isn't massive, but across the whole of Minecraft it would be unacceptable!

SquidDev commented 1 year ago

One issue which has only just occurred to me is that, while allocation sampling is fine for monitoring "normal" computers, it is not safe against adversarial attacks.

The main issue is that not all allocations are tracked (intentionally, as tracking every allocation would come with massive overhead!). If you have an oracle to detect whether a given allocation was tracked, you can retain references to untracked objects and drop references to tracked ones: your tracked memory remains constant, but actual memory usage continues to grow!

There are some obvious oracles (`collectgarbage("count")`) which we could block, but, more problematically, receiving an "out of memory" error is itself an indicator that an allocation was probably tracked. This means we can write a program like the following:

```lua
local tbl = {}

-- Fill tbl with 0s until we OOM.
local i = 1
while pcall(function() tbl[i] = 0 end) do i = i + 1 end

-- Now fill tbl with untracked strings.
local i = 1
while true do
  local ok, res = pcall(string.rep, " ", 1024)
  if ok then
    -- Allocation wasn't tracked: keep a reference to the string.
    tbl[i] = res
    i = i + 1
  else
    -- Out of memory: the allocation was tracked, so drop it and try again.
  end
end
```
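The effect of this attack can be reproduced outside the VM with a toy simulation (using the same hypothetical 1-in-N counting sampler as above): the attacker drops every object whose allocation was sampled, so the tracker's estimate never grows, while the memory actually retained grows without bound.

```java
// Toy simulation of the oracle attack against a hypothetical 1-in-64
// allocation sampler. The attacker immediately drops any object whose
// allocation was sampled (detected via the OOM oracle), so the tracked
// estimate stays at zero while actually retained memory grows linearly.
final class SamplingAttackDemo {
    static final int SAMPLE_RATE = 64;

    long trackedBytes;  // what the sampler believes is live
    long retainedBytes; // what the attacker actually retains
    private long counter;

    void allocate(long bytes) {
        if (++counter >= SAMPLE_RATE) {
            counter = 0;
            // Sampled: the attacker detects this and drops the reference,
            // so the sampler's count goes straight back down.
            trackedBytes += bytes;
            trackedBytes -= bytes;
        } else {
            // Untracked: keep the reference. The sampler never sees it.
            retainedBytes += bytes;
        }
    }
}
```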

I honestly don't know if there's a good solution to this :/.

SquidDev commented 1 year ago

I think the best solution for now is probably not to impose any memory limits at all (thus removing the oracle) and just provide monitoring tools on the CC:T side.