ewasm / design

Ewasm Design Overview and Specification

Linear storage design #188

Open axic opened 5 years ago

axic commented 5 years ago

Memory-mapped linear storage in Ethereum was proposed IIRC by @AlexeyAkhunov. Here we give a potential implementation of that in ewasm.

The proposal is that instead of fixed-length/fixed-key storage values, storage is linear, extendable and accessed like memory. We can take a design cue from POSIX: mmap.

API: storageMap(memoryOffset: i32, storageOffset: i32, storageLength: i32)

In the naive implementation this would:

An optimised implementation with control over the VM could, while still requiring storageMap, load bytes / words / etc. on demand the first time an area is accessed, and keep track of changed areas.
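To make the on-demand idea concrete, here is a rough sketch of what the host side could look like. Everything in it is an assumption for illustration: the class name `LazyMappedStorage`, the 32-byte tracking granularity and the explicit `flush` step are not part of the proposal, and a real VM would hook memory accesses rather than expose `read`/`write` methods.

```python
class LazyMappedStorage:
    """Hypothetical host-side view of one region mapped via storageMap."""

    CHUNK = 32  # assumed tracking granularity in bytes (not from the proposal)

    def __init__(self, backing: bytearray, storage_offset: int, length: int):
        self.backing = backing            # stands in for the contract's linear storage
        self.storage_offset = storage_offset
        self.length = length
        self.cache = {}                   # chunk index -> bytearray, filled on first access
        self.dirty = set()                # chunk indices changed since mapping

    def _check(self, offset: int) -> None:
        if not 0 <= offset < self.length:
            raise IndexError("access outside the mapped region")

    def _chunk(self, index: int) -> bytearray:
        # Load the chunk from storage only the first time it is touched.
        if index not in self.cache:
            start = self.storage_offset + index * self.CHUNK
            end = min(start + self.CHUNK, self.storage_offset + self.length)
            self.cache[index] = bytearray(self.backing[start:end])
        return self.cache[index]

    def read(self, offset: int) -> int:
        self._check(offset)
        return self._chunk(offset // self.CHUNK)[offset % self.CHUNK]

    def write(self, offset: int, value: int) -> None:
        self._check(offset)
        index = offset // self.CHUNK
        self._chunk(index)[offset % self.CHUNK] = value
        self.dirty.add(index)             # remember which area changed

    def flush(self) -> None:
        # Write back only the chunks that were actually modified.
        for index in sorted(self.dirty):
            start = self.storage_offset + index * self.CHUNK
            chunk = self.cache[index]
            self.backing[start:start + len(chunk)] = chunk
        self.dirty.clear()
```

With this shape, a contract that maps a large region but touches only a few words pays for loading and writing back just those chunks.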

In the future, when Wasm supports multiple memory segments, storage could be mapped 1-to-1 into a dedicated segment instead of into the "main" working memory, without the need for a host function.

cdetrio commented 5 years ago

If the storage isn't written until execution terminates, what will happen on re-entry (the contract instance calls another contract, which calls back)?

axic commented 5 years ago

I think that is the question where we'll find all the subtle details and complications.

I'd say the practical decision right now is that storage is synchronised at the point of a call, i.e. the storage is written out prior to executing the call and is loaded again after the call finishes.
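A toy model of that rule, to check it against the re-entrancy case. The `Contract` and `Instance` classes and the `demo` function are hypothetical names, and for simplicity each instance maps the whole storage on entry, which is an assumption and not something the proposal requires.

```python
class Contract:
    """Holds the persistent linear storage of one contract (toy model)."""

    def __init__(self, storage_size: int):
        self.storage = bytearray(storage_size)


class Instance:
    """One running call frame with its own working copy of the storage."""

    def __init__(self, contract: Contract):
        self.contract = contract
        self.mapped = bytearray(contract.storage)   # loaded on entry

    def call(self, callee):
        self.contract.storage[:] = self.mapped      # written out prior to the call
        result = callee()
        self.mapped[:] = self.contract.storage      # loaded again after the call
        return result

    def finish(self) -> None:
        self.contract.storage[:] = self.mapped      # written out at termination


def demo():
    contract = Contract(4)
    outer = Instance(contract)
    outer.mapped[0] = 1

    def reenter():
        # Another contract calls back: a fresh instance of the same contract.
        inner = Instance(contract)
        seen = inner.mapped[0]
        inner.mapped[0] = 2
        inner.finish()
        return seen

    seen_by_inner = outer.call(reenter)
    return seen_by_inner, outer.mapped[0]
```

In this model `demo()` returns `(1, 2)`: the re-entrant instance sees the caller's pending write, and the caller sees the re-entrant instance's write once the call returns.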

jakelang commented 5 years ago

Proposal: Storage Modules

In order to handle the case of re-entrancy, we need the storage mappings (or at least info about revisions to storage) to persist across calls.

With storage modules, we have a scheme where storage is mapped into the memory of a different module than that of the executing contract.

The storage module exists from instantiation to the end of the transaction (top-level call).

Depending on the feasibility of dynamic Wasm module linking, storage modules are instantiated on first call to storageMap or on first instantiation of the corresponding contract.

A "storage module" consists of one or more linear memories, which are exported (depending on which option is used below). A running contract which maps storage to memory will import an exported memory from this module.

Options

Here are a few possible variants of the storage module scheme.

API

The API will remain roughly the same. However, with the last option, I/O methods that can trap on invalid memory access will be exposed to the contract.

Questions

jakelang commented 5 years ago

Proposal 2: Storage Paging (ROUGH(!!) DRAFT)

This is an alternative memory mapping proposal which avoids the complexities of persistent wasm modules as storage while retaining the benefits of memory-mapped access.

In this scheme, the VM around the executing module (called "host" here) maintains a data structure which persists for the lifetime of the transaction, and keeps track of mapped storage units. In operating systems parlance, this is known as the frame table.

Furthermore, each instance of a Wasm module (representing a contract call) is associated with its own table of mappings from memory offsets within the module to elements of the frame table. Such a structure is better known as a page table.

This scheme can be thought of as a rudimentary paging system intended solely for mmap-style operations, where data is mapped directly from storage into a virtual memory space. It is also adaptable to the capabilities model if we do not allow multiple virtual pages for the same frame.

The Frame Table

The frame table keeps track of mapped "frames", which are fixed-size contiguous regions of linear storage.

It is maintained by the host and persists until the end of the transaction.

Because each contract has its own linear storage, we also include the address of the contract in the frame.

The index of the frame in the frame table can then be referenced by elements of a module instance's associated page table.

FrameTable ::= vec(address: bytes20, pagenum: u64)

The Page Table

Every instance of a Wasm module has its own page table. Each page corresponds to an element of the frame table.

The page table is initialized on first call to storageMap and lives until the module returns.

PageTable ::= vec(offset: u32, frame: usize)
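A minimal sketch of the two tables as plain data structures, mostly to pin down the types. The method names (`lookup_or_add`, `map`, `frame_for`) are made up for illustration, and reusing an existing frame entry for the same (address, pagenum) pair is an assumption about how the host would behave.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Frame:
    address: bytes   # bytes20: contract that owns the storage
    pagenum: int     # u64: index of the fixed-size region in linear storage


@dataclass
class FrameTable:
    """Host-owned, lives until the end of the transaction."""
    frames: List[Frame] = field(default_factory=list)

    def lookup_or_add(self, address: bytes, pagenum: int) -> int:
        if len(address) != 20:
            raise ValueError("address must be 20 bytes")
        frame = Frame(address, pagenum)
        if frame in self.frames:              # assumed: one entry per (address, pagenum)
            return self.frames.index(frame)
        self.frames.append(frame)
        return len(self.frames) - 1


@dataclass
class PageTable:
    """Per module instance, lives until the module returns."""
    entries: List[Tuple[int, int]] = field(default_factory=list)  # (offset: u32, frame: usize)

    def map(self, offset: int, frame_index: int) -> None:
        self.entries.append((offset, frame_index))

    def frame_for(self, offset: int) -> Optional[int]:
        for page_offset, frame_index in self.entries:
            if page_offset == offset:
                return frame_index
        return None
```

Two instances mapping the same storage page would then end up with page table entries pointing at the same frame index, which is what lets revisions persist across re-entrant calls.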

Memory mappings

When a contract calls storageMap(storage_offset: u64, num: u32, dest: u32), the VM will interrupt and do the following:

When a Wasm module returns or calls another Wasm contract, the changes to memory are checked. If the page was mutated, write the page to the corresponding frame. If we encounter a page fault, then trap.
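A rough sketch of the write-back check, under several assumptions that the draft does not fix: a 64-byte page size, mutation detected by comparing against a snapshot taken at mapping time, and a "page fault" meaning a page table entry that points at a missing frame or outside the instance's memory. `write_back` and `Trap` are hypothetical names.

```python
PAGE_SIZE = 64  # assumed page size in bytes, chosen only for the example


class Trap(Exception):
    """Stands in for a Wasm trap raised by the host."""


def write_back(memory, snapshot, page_table, frame_table, storage):
    """Write mutated pages to their frames; return the frame indices written.

    memory      -- bytearray, the instance's linear memory at return/call time
    snapshot    -- bytes, copy of that memory taken when the pages were mapped
    page_table  -- list of (offset, frame_index)
    frame_table -- list of (address, pagenum)
    storage     -- dict mapping address -> bytearray of linear storage
    """
    written = []
    for offset, frame_index in page_table:
        in_memory = 0 <= offset and offset + PAGE_SIZE <= len(memory)
        has_frame = 0 <= frame_index < len(frame_table)
        if not (in_memory and has_frame):
            raise Trap("page fault")
        page = memory[offset:offset + PAGE_SIZE]
        if page == snapshot[offset:offset + PAGE_SIZE]:
            continue                                  # unchanged page, nothing to write
        address, pagenum = frame_table[frame_index]
        store = storage[address]
        start = pagenum * PAGE_SIZE
        if len(store) < start + PAGE_SIZE:            # storage is extendable
            store.extend(bytes(start + PAGE_SIZE - len(store)))
        store[start:start + PAGE_SIZE] = page
        written.append(frame_index)
    return written
```

A real host would more likely track dirty pages directly than diff against a snapshot; the comparison is used here only because it keeps the example self-contained.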

Questions