
9↑↑5 Will Always Crash (So Let's Talk Practical Security) #103

MostAwesomeDude commented 8 years ago

(Yes, that is a big stack of labels.)

9↑↑5, or in valid Monte:

9 ** 9 ** 9 ** 9 ** 9

Currently, on my laptop, this slowly but reliably pegs a CPU and chews up all available memory while computing partial results of the exponentiation tower. This is because 9↑↑5 is (relatively) well known to be not merely too large to store directly in a machine, but generally too large to calculate facts about, so pretty much every arithmetic implementation is going to choke on it.
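
For a sense of scale, here's a minimal Python sketch that sizes up the tower with logarithms instead of materializing the integers (the figures are back-of-the-envelope, not exact):

import math

# Sizing up the tower 9↑↑5 = 9**9**9**9**9 without ever building it.
d2 = 9 ** 9                       # 387420489
digits_d3 = d2 * math.log10(9)    # decimal digit count of 9**9**9
print(f"9**9 = {d2}")
print(f"9**9**9 has about {digits_d3:,.0f} digits")   # ~370 million
# One level up, 9**9**9**9 has (9**9**9) * log10(9) digits: the digit count
# itself is a ~370-million-digit number, so that level can never be stored.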

(Current memory used: 2.2%.)

I want to stress here: Knuth's up-arrows are extensible. 2↑↑↑3 (which is 2↑↑4 = 65536) exists, as does Ackermann(4, 2), which is 2↑↑5 - 3. (Conway's chained arrows are also a thing.) There are many numbers much bigger than this that are legally representable. Any safe eval() implementation needs to be able to handle this input somehow; even if we don't implement it, somebody else will implement it (and usurp the safe eval() market), so it's good to think about it in the meantime.

(Memory used: 7.7%.)

One's first thought might be that we could detect this sort of chicanery. We could try to look for large exponentiations and then throw an exception preëmptively. This seems like it would work, and it would block the worst cases we've found in primitive operations. Unfortunately, it doesn't! Why not? Because the attacker can simply request an exponentiation so large that merely trying to guess the size of the result can overflow our memory bounds! This sounds ludicrous, but it works on paper, and I don't see any mitigation for it in RPython's bigint implementation.
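
To make that concrete, here's a hypothetical guard in Python (the names and the budget are illustrative, not Typhon's API), along with why it fails:

MAX_BITS = 1 << 20  # arbitrary 128 KiB result budget; purely an assumption

def guarded_pow(base: int, exp: int) -> int:
    # base**exp needs roughly exp * bit_length(base) bits.
    if exp * base.bit_length() > MAX_BITS:
        raise MemoryError("result would be too large")
    return base ** exp

# The guard handles one oversized pow just fine:
#     guarded_pow(9, 9 ** 9)   # raises MemoryError immediately
# But in 9**9**9**9**9 the exponent fed to the outermost pow is itself a
# partial result of the tower, and computing that partial result exhausts
# memory before the guard ever runs.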

(Memory: 12.8%.)

We could require a modulus for all exponentiation. After all, pretty much all exponentiation in computing is done relative to either a large smooth/prime/product-of-two-primes-totient-etc. modulus for cryptographic computation, or relative to a power of (a power of) two, for various serialization and ABI work. Since the modulus itself must fit in memory, the attacker would be limited, but not by much; they would simply create the biggest possible modulus, which would be about half the system memory, and then perform their exponentiation modulo that. The resulting pressure on the garbage collector would be immense, and much thrashing would happen. This isn't what we want.
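
For illustration in plain Python, with three-argument pow standing in for the hypothetical mandatory-modulus primitive:

# With a mandatory modulus, every intermediate stays bounded by the modulus,
# so this returns instantly despite the large exponent:
print(pow(9, 9 ** 9, 2 ** 64))

# But the attacker picks the modulus. Something like the line below keeps
# every temporary around half a gigabyte, hammering the GC:
#     pow(9, 9 ** 9, 2 ** (4 * 10 ** 9))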

(14.3%. At this point, the GC is accumulating pressure, which is causing allocations to slow as the GC frantically tries to dispose of all the temporary pieces of bigint computation. I wish that I had had the forethought to graph the allocations over time, because I bet that it looks awesome.)

We could have a vat policy that indicates the maximum size of ints. In this scenario, all ints are constrained to be in the ring modulo a certain large number (hopefully, a power of (a power of) two), and all computation implicitly enforces that. With this setup, even bigints could be given size limits, like 4KiB/bigint or whatever the vat wants. This would be a tricky thing to explain to users, because I've never heard of anything like it before. It would also be a bit difficult to implement, although we could basically do it by giving Typhon's BigInt class a vat lookup on creation; it would then call something like vat.policy.bigintSize() to determine how big it may be.
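
Roughly, the semantics would look like this (a minimal Python sketch; RingInt and the policy hookup are hypothetical, not Typhon's actual classes):

class RingInt:
    """An int implicitly confined to the ring modulo 2**bits."""

    def __init__(self, value: int, bits: int = 128):
        # In Typhon the width would come from something like
        # vat.policy.bigintSize(); here it's just a constructor argument.
        self.bits = bits
        self.value = value % (1 << bits)

    def __pow__(self, other: "RingInt") -> "RingInt":
        # Exponentiation reduces through the modulus, so no intermediate
        # result ever exceeds the policy size.
        return RingInt(pow(self.value, other.value, 1 << self.bits), self.bits)

nine = RingInt(9)
print((nine ** (nine ** nine)).value)  # every level capped at 128 bits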

This is tractable; I've just tested running the following code in a REPL:

▲> def tetrate(a, n) {return if (n == 0) {1} else {a ** tetrate(a, n - 1) % 7}}
Result: <tetrate>
▲> tetrate(9, 5)
Result: 2

That took about 0.581 seconds of user CPU, and negligible memory.

(17.8%, and 44 CPU minutes and counting. 2.18G of memory used and the system's started clearing out space for the troublemaker, slowing down Firefox.)

>>> def tetrate(a, n):
...  return 1 if n == 0 else pow(a, tetrate(a, n - 1), 7)
... 
>>> tetrate(9, 5)
2

Ah, Python gives us the same answer too, and in 0.213 CPU seconds. "But Corbin," you say, "I was hoping for more practical bigints." Okay, let's do this again, in the land of 128-bit unsigned long long long longs (or whatever we're calling them these days):

>>> def tetrate(a, n):
...  return 1 if n == 0 else pow(a, tetrate(a, n - 1), (2 ** 128) - 1)
... 
>>> tetrate(9, 5)
243654990753895559717132964806008173894L

Bug: This example doesn't work on Typhon. The fix shouldn't be hard.

(29.3%. Firefox yielded some ground and Typhon took it quickly.)

Problems with this approach: Well, whoever calls into the vat hopefully knows about the policy! The vat should have some way to express its policies to folks looking to call into it. I could imagine some sweet-spot integer size becoming standard, with callers always performing their computations modulo some well-known modulus in order to stay compatible with vats that have a bigger modulus.

To expand on that a bit, suppose two vats have 64-bit and 128-bit ints, respectively. A computation designed for the 64-bit vat could run on the 128-bit vat, without modification, provided it knows about the mismatch in advance and performs its arithmetic modulo 2**64 explicitly rather than relying on the vat's implicit wraparound.
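
In Python terms (purely illustrative):

M64 = 1 << 64

def wrapping_mul(a: int, b: int) -> int:
    # Explicit reduction: the same answer on a 64-bit vat (where the
    # wraparound is implicit) and on a 128-bit vat (where it is not).
    return (a * b) % M64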

Downsides: Well, you have to pick signed or unsigned. In practice, this means that everybody will pick signed (because nobody expects that -1 == 340282366920938463463374607431768211455), so doing your (mod 2**128) computations with unsigned semantics will require a 129-bit vat. Alternatively, vat policy will indicate both signedness and size. Suddenly everything feels really cruddy. And tastes like C. I'm not so fond of this anymore!
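
The quoted value is just -1 reduced into the unsigned ring:

# Unsigned semantics modulo 2**128: -1 normalizes to the all-ones value.
print((-1) % (2 ** 128))   # 340282366920938463463374607431768211455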

(29.9%.)

I have proposed a library in the past, tentatively called "montepool", that provides the traditional UNIX answer to this problem: process pooling. Safe eval() is provided via a pool of Typhon subprocesses, each limited by the operating system's built-in resource management. This library is going to be required anyway, for practically dealing with Turing-complete but constant-space nemeses; we might as well rely on it for the technically larger complexity classes.
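
A minimal sketch of that idea in Python (the worker command line is hypothetical; the rlimits are standard POSIX):

import resource
import subprocess

def limited(max_mem_bytes: int, max_cpu_secs: int):
    def preexec():
        # Runs in the child just before exec; the kernel enforces these
        # limits regardless of what the evaluated code tries to do.
        resource.setrlimit(resource.RLIMIT_AS, (max_mem_bytes, max_mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (max_cpu_secs, max_cpu_secs))
    return preexec

# A pooled worker asked to evaluate 9**9**9**9**9 is killed by the kernel
# instead of dragging the whole machine down with it.
worker = subprocess.Popen(["typhon", "eval-worker.mt"],  # illustrative args
                          preexec_fn=limited(512 * 1024 * 1024, 10))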

My main objection to this used to be that a relatively complex library would have to be implemented before we got the desired safety guarantees. But now, looking at the rest of this bug, I think that the policy concerns apply here, too: vats would have to advertise policies about how much CPU time and memory they have to spend. I know that this is extremely compatible with stuff like Zcash, but it's also still an additional complication that bears consideration.

(29.1%. The computation must have passed some hump and gotten a bunch of space free. At this point, with over an hour of CPU time consumed and nothing to show for it, I think that I've made my point.)

MostAwesomeDude commented 8 years ago

Upon reflection, the Ethereum VM has a 256-bit word size. Their stance on signedness is rather antiquated, unfortunately. (It's also a really unfun-looking compilation target, but I'm sure that somebody will do it at some point.)

MostAwesomeDude commented 8 years ago

Another mitigation, in the realm of vat policy, is to have the runtime's exponentiation primitives perform a vat checkpoint for every multiplication they do. (The number of multiplications can be estimated in advance with a popcount, or just a length check, on the bigint holding the exponent.)
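
A sketch of what that primitive could look like, in Python (checkpoint here stands in for whatever hook the vat actually exposes):

def checkpointed_pow(base: int, exp: int, checkpoint) -> int:
    # Square-and-multiply performs one squaring per exponent bit plus one
    # multiplication per set bit, so a length check and popcount bound the
    # number of checkpoints before any work starts.
    checkpoint(exp.bit_length() + bin(exp).count("1"))
    result = 1
    while exp:
        checkpoint(1)  # the vat can abort here once its budget runs out
        if exp & 1:
            result = result * base
        base = base * base
        exp >>= 1
    return result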