latticexyz / mud

MUD is a framework for building autonomous worlds
https://mud.dev
MIT License
714 stars 178 forks

Exploration: `VIRTUALIZED` File type #457

Open ludns opened 1 year ago

ludns commented 1 year ago

Introduction

Context: #456 (Filesystem extension). This proposal motivates the introduction of a new type of executable data in the World's filesystem beyond SYSTEM, along with a Hypervisor. It is an exploration: if core devs are interested in this, we should put together a PoC to verify our assumptions.

What is a hypervisor?

(from HyVM) According to VMware:

A hypervisor, also known as a virtual machine monitor or VMM, is software that creates and runs virtual machines (VMs). A hypervisor allows one host computer to support multiple guest VMs by virtually sharing its resources, such as memory and processing.

Why do we need a hypervisor?

Currently, the World framework is similar to an OS: it implements "Syscalls" by mediating writes from Systems to Tables. It also implements a rudimentary filesystem (whose proposed future behavior and look is described in #456). In order to sandbox system calls, the World executes Systems in a different call frame using a CALL. Writes to storage are executed in the context of the World using the code-generated Table libraries and StoreSwitch, which calls Syscalls on the World (currently the Syscalls are: setRecord, setField, deleteRecord).
This back and forth requires 2x CALL per write and read (one to check whether the System was DELEGATECALLed or not, and one to make the write / receive the data in a read). Additionally, useful extensions to the EVM like EIP-1153, which could drastically simplify System-to-System call context and make it cheaper, first need to be implemented in all core clients and tools before they are usable. With a Hypervisor, we can choose to support these use-cases immediately by implementing them in "our" VM. These new op-codes can also be supported in Solidity using literalbytecode, along with a possibly more efficient form of sandboxing that doesn't require new call frames and 2x CALL per write / read.
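To get a rough feel for the 2x CALL overhead, here is a back-of-the-envelope sketch. The two gas constants are the account-access costs from EIP-2929; the function name, its parameters, and the idea of counting only the base cost of the CALL op-code are assumptions made for illustration. It ignores calldata encoding, memory expansion, and the work done inside the World, so it is only a lower bound.

```python
# Account-access costs from EIP-2929. Everything else in this sketch is an
# illustrative assumption, not a measurement of the World contract.
WARM_ACCOUNT_ACCESS_GAS = 100    # CALL to an address already touched in this tx
COLD_ACCOUNT_ACCESS_GAS = 2600   # first CALL to an address in this tx


def call_overhead(accesses: int, calls_per_access: int = 2,
                  world_is_warm: bool = True) -> int:
    """Lower bound on gas spent on the base cost of the CALL op-code alone.

    accesses         -- number of table reads / writes made by the System
    calls_per_access -- 2 in the current design (DELEGATECALL check + Syscall)
    world_is_warm    -- whether the World address was already accessed
    """
    total_calls = accesses * calls_per_access
    if total_calls == 0:
        return 0
    if world_is_warm:
        return total_calls * WARM_ACCOUNT_ACCESS_GAS
    # Only the very first CALL pays the cold price; the rest are warm.
    return COLD_ACCOUNT_ACCESS_GAS + (total_calls - 1) * WARM_ACCOUNT_ACCESS_GAS


# A System that touches tables 10 times pays for 20 CALLs.
print(call_overhead(10))
```

The real cost per access is higher than this, which is why the PoC should measure it rather than rely on an estimate like this one.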

Proposal

I propose executing certain Systems in a Hypervisor by first using extcodecopy to copy their bytecode into memory, then running it in an EVM-in-EVM hypervisor (e.g. HyVM). Hypervisors for the EVM are quite gas-efficient, and they allow us to execute hooks on dangerous op-codes instead of using call frames for sandboxing. In our case, CALL / DELEGATECALL / SSTORE / CREATE2 would need to be inspected.
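To make the "hooks on dangerous op-codes" idea concrete, here is a toy interpreter loop. It is only a sketch of the dispatch pattern, not HyVM's design: it supports four op-codes, has no gas metering or memory, and the names `run`, `HookRevert`, and the hook signature are made up for illustration. Only the op-code byte values are taken from the EVM.

```python
from typing import Callable, Dict, List

# EVM op-code byte values; the interpreter itself is a toy.
STOP, ADD, SSTORE, PUSH1 = 0x00, 0x01, 0x55, 0x60
CALL, DELEGATECALL, CREATE2 = 0xF1, 0xF4, 0xF5


class HookRevert(Exception):
    """Raised by a hook to abort execution (hypothetical name)."""


# A hook sees the stack *before* the op-code executes and may raise.
Hook = Callable[[List[int]], None]


def run(code: bytes, hooks: Dict[int, Hook]) -> Dict[int, int]:
    """Interpret `code` in a single frame, calling hooks on selected op-codes."""
    stack: List[int] = []
    storage: Dict[int, int] = {}
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op in hooks:
            hooks[op](stack)          # inspection happens here, no new call frame
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(code[pc])    # immediate byte is data, never dispatched
            pc += 1
        elif op == ADD:
            stack.append((stack.pop() + stack.pop()) % 2**256)
        elif op == SSTORE:
            slot, value = stack.pop(), stack.pop()  # top of stack is the slot
            storage[slot] = value
        else:
            raise NotImplementedError(hex(op))
    return storage


def block_delegatecall(_stack: List[int]) -> None:
    raise HookRevert("DELEGATECALL is blocked")


def only_low_slots(stack: List[int]) -> None:
    # Placeholder policy: a real hook would map the slot to a table id.
    if stack[-1] >= 0x10:
        raise HookRevert("slot not allowed")


hooks = {DELEGATECALL: block_delegatecall, SSTORE: only_low_slots}
program = bytes([PUSH1, 0x2A, PUSH1, 0x01, SSTORE, STOP])  # storage[1] = 42
print(run(program, hooks))
```

The point of the pattern is that the policy check runs inline in the interpreter loop, so sandboxing does not depend on entering a new call frame.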

  • CALL: when a CALL happens, check whether the address called by the current System is a System, and if it is, make sure the caller is allowed to call it. Then load the code of the called System and continue execution.
  • DELEGATECALL: block (could be TBD)
  • SSTORE: inspect the storage slot and revert if it belongs to a table the System doesn't have access to. This is tricky because all keys are hashed. One possibility is to mask all keys in the low-level Store implementation so that the first X bits of the key can be turned into a table id. This will require some changes to the low-level implementation of the Store.
  • CREATE2: TBD
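The key-masking idea from the SSTORE bullet could look roughly like the sketch below. Everything here is an assumption for illustration: the value of X (32 bits), the function names, and the slot layout are not how the Store works today. Note also that `hashlib.sha3_256` is used as a stand-in and is NOT the EVM's keccak256 (the padding differs).

```python
import hashlib
from typing import Set

TABLE_ID_BITS = 32                    # the "X" from the bullet; value is assumed
HASH_BITS = 256 - TABLE_ID_BITS
HASH_MASK = (1 << HASH_BITS) - 1


def _hash(data: bytes) -> int:
    # Stand-in hash: SHA3-256 is not keccak256, it only plays the same role here.
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big")


def masked_slot(table_id: int, key: bytes) -> int:
    """Storage slot whose top TABLE_ID_BITS bits are the table id."""
    if not 0 <= table_id < (1 << TABLE_ID_BITS):
        raise ValueError("table id does not fit in TABLE_ID_BITS")
    return (table_id << HASH_BITS) | (_hash(key) & HASH_MASK)


def table_of(slot: int) -> int:
    """Recover the table id from a slot without knowing the original key."""
    return slot >> HASH_BITS


def check_sstore(slot: int, allowed_tables: Set[int]) -> None:
    """What an SSTORE hook could do: reject slots of tables the System can't write."""
    if table_of(slot) not in allowed_tables:
        raise PermissionError(f"no write access to table {table_of(slot)}")


slot = masked_slot(7, b"player-1")
check_sstore(slot, {7})               # passes silently
print(table_of(slot))
```

The trade-off is that only 256 - X bits of the hash remain to separate keys within a table, so X should be kept as small as the number of tables allows.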

This new type of system will be implemented as a new file type (see #456 for context): VIRTUALIZED:

alvrs commented 1 year ago

Very interesting idea! Since this can be an extension and is not blocked by low-level decisions (if we allow for more file types, as discussed in #456), I propose verifying the assumptions on feasibility and gas cost after the foundation of v2 is done (and to keep focusing on the v2 foundation for now).