sebastienros / jint

Javascript Interpreter for .NET
BSD 2-Clause "Simplified" License

Memory limits bypass #887

Open · sebastienros opened 3 years ago

sebastienros commented 3 years ago
var arr = Array(1000000000).fill(new Array(1000000000));

I think it's fine not to check the actual memory usage of each array or string allocation, though. These could use an option with a max length instead, even a very big one, and throw a specific exception when it's exceeded. Concatenations would then check against this limit instead of the thread memory.

The fill function, however, might need to check the memory limits, and there might be other functions that similarly loop over the same value and need a check. Looping over a function evaluation is safe, though, since thread memory is ensured.
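
A minimal sketch of what such a length-capped operation could look like; the StringTooLongException type, MaxStringLength option, and ConcatChecked helper are all hypothetical, not Jint's actual API:

    using System;

    // Hypothetical sketch of the proposed max-length option; none of these
    // names come from Jint itself.
    public sealed class StringTooLongException : Exception
    {
        public StringTooLongException(int limit)
            : base($"Resulting string would exceed the configured limit of {limit} characters.") { }
    }

    public static class LengthGuards
    {
        // In the proposal this would be an engine option, possibly very big.
        public static int MaxStringLength = 1_000_000;

        public static string ConcatChecked(string left, string right)
        {
            // Check the would-be length against the option instead of
            // measuring actual thread memory usage.
            if ((long)left.Length + right.Length > MaxStringLength)
                throw new StringTooLongException(MaxStringLength);
            return left + right;
        }
    }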

lahma commented 3 years ago

MaxArraySize constraint in #923 should address this.

Ghostbird commented 2 years ago

This issue isn't fixed at all! This very simple line still fills the system memory in an instant:

JSON.stringify(new Array(99999).fill(0).map(i => new Array(99999).fill(0).map(j => new Array(99999).fill(0))));

Note: the array size limit here is 99999, but even a much lower limit wouldn't protect against this kind of code, since nesting multiplies the allocations: three nested levels of size n amount to roughly n³ elements, so a per-array limit doesn't bound total memory.

lahma commented 2 years ago

Would you like to submit a PR to improve the protections?

Ghostbird commented 2 years ago

I have zero experience with the Jint code. Do you have a suggestion for what kind of solution would be appropriate? I'm guessing the reason this isn't working properly is that a really fundamental solution (e.g. checking memory usage at every instantiation of a JS value/object) would cause an unacceptable performance hit.

lahma commented 2 years ago

Start by creating a failing test case where the engine is configured with both the array size constraint and a memory usage constraint, e.g. options.LimitMemory(4_000_000);. Jint only checks constraints when exiting native functions like ArrayPrototype.map, so as long as you only create array sizes that are allowed, you need the memory usage constraint as a backstop and have to hope that native functions won't be able to exhaust memory inside the calls.
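
For reference, a sketch of such a failing test, assuming the public Engine/Execute API and the LimitMemory option quoted above; exactly where the #923 MaxArraySize constraint is configured is an assumption here:

    using Jint;

    // Sketch of a failing test: both constraints configured, yet the nested
    // map() below can still exhaust memory inside native calls.
    var engine = new Engine(options =>
    {
        options.LimitMemory(4_000_000);              // ~4 MB, as suggested above
        options.Constraints.MaxArraySize = 100_000;  // assumed shape of the #923 constraint
    });

    // Every individual array stays under MaxArraySize, but the nested map()
    // multiplies allocations far past the memory limit, so this should throw
    // a constraint exception once the protections work.
    engine.Execute(
        "JSON.stringify(new Array(99999).fill(0)" +
        ".map(i => new Array(99999).fill(0)))");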

sebastienros commented 2 years ago

What about checking for the constraint inside native functions when they represent a risk? We could set a flag, and the callers could also check. For instance, map() doesn't do the check by default because it's a native function, whereas a for loop does because it's based on statements. map() itself is not an issue, but because the called function is allocating, it might need to check the limit on each invocation. For Array it could check only if the size is above some threshold, like 1KB, but always set a flag so that the caller, map() in this case, knows it should check.

But maybe a better idea would be to have an internal field that tracks an estimate of the memory allocated in native functions, a hint, and does a real constraint check above a threshold. So every Array() call would increment it by the estimated allocation size, and map() could just check that counter, which is cheap, and eventually do a real thread memory check. Same for strings, and we could even do this cheap local memory tracking for standard instances like numbers. It could be an API on the engine. It might sound like too much, but I think it's a huge benefit. Thoughts?
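
Something like the following hypothetical sketch (the names are not Jint internals):

    using System;

    // Hypothetical allocation-hint counter; not Jint internals.
    public sealed class AllocationHint
    {
        // Re-run the expensive check once the estimate grows past this.
        private const long CheckThreshold = 1_000_000;
        private long _estimatedBytes;

        // Allocating native code, e.g. the Array constructor or string
        // concatenation, reports a rough size estimate.
        public void Add(long estimatedBytes) => _estimatedBytes += estimatedBytes;

        // Looping native functions such as map() call this per iteration:
        // a cheap counter comparison, and only occasionally the real
        // (expensive) thread-memory constraint check.
        public void CheckIfNeeded(Action realMemoryCheck)
        {
            if (_estimatedBytes < CheckThreshold)
                return;
            _estimatedBytes = 0;
            realMemoryCheck();
        }
    }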

lahma commented 2 years ago

I had a vision where native functions would get a CancellationToken as part of the call signature, which would be cheap to check.

Constraints could hook onto that, for example via a linked token source, and could even run a background task (think thread), a watchdog, that signals the token when things start to look bad.

Just an idea.
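
A rough sketch of that watchdog idea; the wiring into Jint is hypothetical, and GC.GetTotalMemory is a coarse process-wide stand-in for the per-thread accounting Jint actually uses:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical watchdog: a background task polls memory usage and
    // signals a CancellationToken that native functions could check cheaply.
    public sealed class MemoryWatchdog : IDisposable
    {
        private readonly CancellationTokenSource _cts = new();

        public CancellationToken Token => _cts.Token;

        public MemoryWatchdog(long memoryLimitBytes)
        {
            Task.Run(async () =>
            {
                while (!_cts.IsCancellationRequested)
                {
                    // Coarse stand-in for real per-thread accounting.
                    if (GC.GetTotalMemory(forceFullCollection: false) > memoryLimitBytes)
                    {
                        _cts.Cancel(); // things look bad: signal the token
                        return;
                    }
                    await Task.Delay(50);
                }
            });
        }

        public void Dispose() => _cts.Cancel();
    }

A native function (or the engine between statements) would then only check Token.IsCancellationRequested, which is a cheap read.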

sebastienros commented 2 years ago

In the Engine then, it doesn't need to be in the signature.

sebastienros commented 2 years ago

But it's orthogonal to my idea. What about it?

lahma commented 2 years ago

Yes, the signature was more about aligning with standards, but it could be a field too.

sebastienros commented 2 years ago

Also wondering if we should track the size of each environment record, so that when one is removed we can decrease the allocation count. Imagine a long-running script that just updates the same entry with a 1MB buffer: it allocates a lot over time, but only retains 1MB in total. So there would be two checks: that what we allocate doesn't go over the limit, and counting only retained memory when adding up the allocated buffers.
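
A sketch of the two counters that idea implies; all names are hypothetical, not Jint internals:

    using System;

    // Hypothetical two-check tracker: one check on what is being allocated,
    // one running total that only counts retained memory.
    public sealed class RetainedMemoryTracker
    {
        // Bytes currently held by live environment records.
        public long RetainedBytes { get; private set; }

        public long MemoryLimit { get; init; } = 4_000_000;

        // Called when an environment record entry is written.
        public void OnAllocate(long bytes)
        {
            // Check 1: the allocation itself must not blow the limit.
            if (bytes > MemoryLimit)
                throw new InvalidOperationException("Allocation exceeds memory limit.");

            // Check 2: only retained memory counts toward the running total.
            RetainedBytes += bytes;
            if (RetainedBytes > MemoryLimit)
                throw new InvalidOperationException("Retained memory limit exceeded.");
        }

        // Called when an entry is removed or overwritten, e.g. the same slot
        // being replaced with a fresh 1MB buffer: the old buffer no longer
        // counts toward retained memory.
        public void OnRelease(long bytes) => RetainedBytes -= bytes;
    }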