near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0

near-vm: Ideas around removing memory checks (without signals!) #8952

Open nagisa opened 2 years ago

nagisa commented 2 years ago

In https://github.com/near/nearcore/issues/8954 one of the regressing changes identified was the removal of signal-based memory access traps. This issue describes an (admittedly silly) idea that would allow removing the memory access checks without re-introducing the signal handlers and the associated complexity.

The WebAssembly specification allows indexing memory with a 32-bit offset, and any accesses outside of the allocated memory region must trap.

The idea revolves around providing every WebAssembly module with 0x1_0000_0000 bytes (4GiB) of memory to work with, thus making all accesses in-bounds of the memory region. A naive solution would be to simply supply a memory region of 0x1_0000_0000 bytes. That said, in order to ensure contract execution is reproducible, and that a contract doesn't have access to state from other contract runs, we must ensure that each contract gets a zeroed-out memory region to work with. Alas, a system with 8 channels of DDR4 memory will have somewhere around 300GB/s of memory bandwidth in practice. This means that even if our workload consisted entirely of zeroing out contract memory, we'd be able to do it at most about 75 times every second. That seems like quite a harsh limitation on the number of function calls we can execute.
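To make the arithmetic above concrete, here is a rough sketch of the bound. The function name is made up for illustration, and the 300GB/s figure is just the estimate from this comment, not a measurement. Note that the result is 75 when the region is rounded to a decimal 4GB, and slightly lower (69) for the exact 0x1_0000_0000 bytes.

```rust
/// Size of the full 32-bit WebAssembly address space: 0x1_0000_0000 bytes (4GiB).
const ADDRESS_SPACE_BYTES: u64 = 0x1_0000_0000;

/// Upper bound on how many times per second a region of `region_bytes` can be
/// zeroed, assuming (hypothetically) that the entire memory bandwidth is spent
/// on zeroing and nothing else.
fn max_zeroings_per_second(bandwidth_bytes_per_sec: u64, region_bytes: u64) -> u64 {
    bandwidth_bytes_per_sec / region_bytes
}
```

For example, `max_zeroings_per_second(300_000_000_000, ADDRESS_SPACE_BYTES)` is the number of contract memories that could be cleared per second under these assumptions.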

An alternative approach would be to provide the WebAssembly modules with a 4GiB virtual memory region, all of which is accessible, but where certain address ranges map to the same area in physical memory. For example, suppose we want to provide contracts with 32MiB of actual memory to work with. We'd construct a virtual memory region where addresses 0..32*1024*1024 would map to a 32MiB block in physical memory, and then 32*1024*1024..64*1024*1024 would map to exactly the same physical memory region again, and so on for the rest of the address space. That way clearing just 32MiB of memory would ensure that all 4GiB of virtual memory visible to the contract is zeroed out. The downside is that, from the perspective of the contract author, there's less hand-holding from the runtime. There'd be nothing stopping them from overrunning their buffer(s) and hitting other memory internal to the contract.
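The aliasing behaviour can be illustrated with a small software model. This is only a sketch: `MirroredMemory` is a hypothetical type, and it emulates the mirroring with a modulo on every access, whereas the actual proposal would set the aliases up once in the page tables so that no per-access work is needed.

```rust
/// Software model of a mirrored address space: every 32-bit address aliases
/// into one small backing buffer. A real implementation would rely on
/// virtual memory mappings instead of computing a modulo on each access.
struct MirroredMemory {
    backing: Vec<u8>,
}

impl MirroredMemory {
    /// Creates a zeroed backing buffer of `backing_len` bytes (must be non-zero).
    fn new(backing_len: usize) -> Self {
        assert!(backing_len > 0, "backing buffer must not be empty");
        Self { backing: vec![0; backing_len] }
    }

    /// Maps a contract-visible address to an index in the backing buffer.
    fn physical_index(&self, addr: u32) -> usize {
        addr as usize % self.backing.len()
    }

    fn read(&self, addr: u32) -> u8 {
        self.backing[self.physical_index(addr)]
    }

    fn write(&mut self, addr: u32, value: u8) {
        let index = self.physical_index(addr);
        self.backing[index] = value;
    }

    /// Zeroing the backing buffer clears every alias at once.
    fn clear(&mut self) {
        self.backing.fill(0);
    }
}
```

With a 32MiB backing buffer, a write to address 5 is also visible at address 5 + 32*1024*1024, and a single `clear()` resets the whole contract-visible address space. The same property is what lets a buffer overrun silently land in other contract memory.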

(There have been some other ideas on how exactly the virtual memory could map to physical memory to make it more palatable, etc. but ultimately they don't really change much in the grand scheme of things)

This idea is largely inspired by micro-controllers, which in my experience almost universally apply the same idea. I imagine it's done because it simplifies the memory controller design, and also because it solves the problem of how to deal with modern programming languages declaring address 0 invalid.

matklad commented 2 years ago

Do I understand correctly that these approaches won't allow us to implement the spec? That is, the spec requires trapping on out-of-bounds accesses, and the approaches above won't trap?

nagisa commented 2 years ago

The idea is that all accesses would become in-bounds, since the modules would have access to a memory instance that's 4GB large. The memory itself wouldn't be linear anymore, though, which possibly violates another part of the spec.

nagisa commented 2 years ago

I have also considered approaches where we'd zero out pages on demand, as they are faulted in (much like Linux does). However, in the worst case a contract could just touch all the pages in order, so each memory access instruction would need to charge enough gas to zero out 4KiB of memory.
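A minimal sketch of that worst case, assuming 4KiB pages and ignoring accesses that straddle a page boundary. `LazyZeroTracker` is a hypothetical helper that only does the bookkeeping; it does not fault in any real memory.

```rust
use std::collections::HashSet;

/// Page size assumed for this sketch.
const PAGE_SIZE: u64 = 4096;

/// Tracks which pages have been touched and how many bytes had to be zeroed
/// as a result of first-touch faults.
struct LazyZeroTracker {
    touched_pages: HashSet<u64>,
    bytes_zeroed: u64,
}

impl LazyZeroTracker {
    fn new() -> Self {
        Self { touched_pages: HashSet::new(), bytes_zeroed: 0 }
    }

    /// Records an access to `addr`. Returns true if this was the first touch
    /// of the containing page, i.e. a whole page had to be zeroed.
    fn access(&mut self, addr: u32) -> bool {
        let page = u64::from(addr) / PAGE_SIZE;
        let first_touch = self.touched_pages.insert(page);
        if first_touch {
            self.bytes_zeroed += PAGE_SIZE;
        }
        first_touch
    }
}
```

A contract that strides through memory one page at a time makes every single access a first touch, which is why the per-access gas cost would have to cover a full page of zeroing.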

matklad commented 2 years ago

If we want to make memory non-linear, a middle ground option would be to mask addresses.

nagisa commented 2 years ago

This is what this idea would effectively implement, yes. It is probably more efficient to use the MMU than to go through the execution units for masking, especially since the MMU is going to be used regardless. The MMU-based solution also benefits from the fact that the memory provided to the contract doesn't strictly need to be power-of-two sized.
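For comparison, a sketch of what explicit masking would compute in software. Both function names are illustrative. A plain bit mask only wraps correctly when the size is a power of two; other sizes need something like a modulo, which is the flexibility the MMU mapping would provide for free.

```rust
/// Wraps `addr` into a region of `size` bytes with a single AND.
/// Returns None when `size` is not a power of two, because the mask
/// `size - 1` would then not correspond to a wrap-around.
fn mask_address(addr: u32, size: u32) -> Option<u32> {
    if size.is_power_of_two() {
        Some(addr & (size - 1))
    } else {
        None
    }
}

/// Wraps `addr` into a region of any non-zero `size` using a modulo,
/// which is more expensive than a mask on most hardware.
fn wrap_address(addr: u32, size: u32) -> u32 {
    addr % size
}
```

With a 32MiB region (0x0200_0000 bytes) the address 0x0200_0010 masks down to 0x10, while a 48MiB region cannot be handled by `mask_address` at all and has to go through `wrap_address`.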