WebAssembly / memory-control

A proposal to introduce finer-grained control of WebAssembly memory.

Suggestion: mappable memories #4

Open dead-claudia opened 1 year ago

dead-claudia commented 1 year ago

Here's the idea: instead of doing everything dynamically or just everything statically, it could be done somewhere in the middle via a special type of memory called a mappable memory. This is very close to what OSs actually offer, while still being very easily checked and secured.

Why i32 for page IDs? 2^32 page IDs * 65536-byte pages yields 2^48 addressable bytes, and at the time of writing, AArch64 can only use up to 48 bits total for virtual addresses, offering up to 256 TiB of accessible memory. I expect this to be sufficient for the foreseeable future.

dtig commented 9 months ago

Thanks for posting this, and apologies for the delay in replying. To make sure I understand,

Here's the idea: instead of doing everything dynamically or just everything statically, it could be done somewhere in the middle via a special type of memory called a mappable memory. This is very close to what OSs actually offer, while still being very easily checked and secured.

  • Mappable memories consist of a set of page IDs, and these can be managed in bulk using start:i32 + count:i32 ranges operating at page resolution. And several mappable memories can be contained within a single linear memory space?

  • Instead of a full pair of limits like with linear memory, a single (optional) max page count is specified for the memory and a list of pages to pre-map can be additionally provided. Once this limit is reached, no more mappings can be added until at least one page is unmapped.

How are pre-map pages provided? Is the max page count accounting for mapped pages across different mappable memories?

  • Loads and stores to mappable memories need not be aligned any more than is required for linear memories, but they do trap if any part is not mapped.
  • memory.protect and memory.discard can do as they need.
  • These mappable memories can also be pre-filled with data, just like unmappable memories, and they can also have ranges pre-set to memory.protect masks to further accelerate loading.
  • The JS API could just feature a method that retrieves byte ranges that must be mapped on retrieval and cannot be unmapped for as long as the reference persists. (I'll leave it open for discussion how a JS caller would conceptually "free" such a reference.)

I guess the safest way to do this would be to expose slices of Wasm memory to be different JSArrayBuffers, though this means that the JSArrayBuffers and Wasm memories would need to be decoupled more than they are right now. That way, it should be straightforward to detach any ArrayBuffers that are backed by memory that has been unmapped.

Why i32 for page IDs? 2^32 page IDs * 65536-byte pages yields 2^48 addressable bytes, and at the time of writing, AArch64 can only use up to 48 bits total for virtual addresses, offering up to 256 TiB of accessible memory. I expect this to be sufficient for the foreseeable future.

dead-claudia commented 9 months ago

Thanks for posting this, and apologies for the delay in replying. To make sure I understand,

  • Instead of a full pair of limits like with linear memory, a single (optional) max page count is specified for the memory and a list of pages to pre-map can be additionally provided. Once this limit is reached, no more mappings can be added until at least one page is unmapped.

How are pre-map pages provided?

I could imagine a system similar to what's currently done for linear memory.

I did not have a concrete answer to that question at the time of writing (and still don't) - I was trying to explain the structure at a high level to draw and gauge interest with my initial comment, not to offer a complete competing proposal.

Is the max page count accounting for mapped pages across different mappable memories?

The limit would be per-memory. A global max could be added, but shouldn't be required.

I guess the safest way to do this would be to expose slices of Wasm memory to be different JSArrayBuffers, though this means that the JSArrayBuffers and Wasm memories would need to be decoupled more than they are right now. That way, it should be straightforward to detach any ArrayBuffers that are backed by memory that has been unmapped.

To be clear, I said "could" not "should", and there may be better ways of doing it.

akirilov-arm commented 9 months ago

... and at the time of writing, AArch64 can only use up to 48 bits total for virtual addresses, offering up to 256 TiB of accessible memory.

Technically AArch64 has supported 52 bits for a while now, as provided by the FEAT_LVA feature of the Armv8.2 architecture extension.

dead-claudia commented 9 months ago

... and at the time of writing, AArch64 can only use up to 48 bits total for virtual addresses, offering up to 256 TiB of accessible memory.

Technically AArch64 has supported 52 bits for a while now, as provided by the FEAT_LVA feature of the Armv8.2 architecture extension.

Is that all user-addressable? Or is it only kernel-addressable like in x86?

Edit: Also, x86 has that same 2^48 user-mode limit.

akirilov-arm commented 9 months ago

Is that all user-addressable?

Yes, it is. Here is how it works on Linux, for instance.

dead-claudia commented 9 months ago

Is that all user-addressable?

Yes, it is. Here is how it works on Linux, for instance.

@akirilov-arm Seems only for some devices, given it's explicitly optional. Makes this feel unportable.

Also, 2^48 bytes = 256 TiB. Something worth keeping in context - it's not a small number. And worst case, if that does prove too low a limit, it'd be easy enough to extend to fill out the 64-bit space fully. (I doubt this will happen anytime soon, but history has a knack for laughing at those of us who dare to make such predictions.)

akirilov-arm commented 9 months ago

@akirilov-arm Seems only for some devices, given it's explicitly optional.

Yes, it is an optional feature. However, I am not sure that I understand the portability concerns, given that a similar situation exists even in the Wasm MVP - there is no guarantee that the Wasm runtime would be able to provide the whole 4 GiB memory range that is covered by a 32-bit index.

Anyway, my point is not to argue whether 48-bit memory spaces are big enough or not, but merely to point out that one of the assumptions in your suggestion is not entirely correct.