dtig opened this issue 2 years ago
I really like this proposal and I am glad it is happening now! 👍
One additional use case I can think of is to implement the equivalent of `.rodata` sections in ELF, which hold read-only data. We could consider an extension to active memory segments and an extension to memory declarations to declare read-only ranges, and segments to be loaded into those read-only ranges prior to the start function, so that memory is not ever observably uninitialized.
How does this work in the presence of multiple threads?
In applications that use multiple threads, what calls are guaranteed to be atomic? On the JS side, what guarantees can we provide for typed array views?
AFAIK these operations (if implemented via POSIX) can't be guaranteed to be atomic unless we're willing to do something like pause/interrupt every other thread (which can access the memory) while carrying them out. My understanding is that the POSIX spec just says that races here have undefined behaviour.
If stopping the world isn't acceptable, we might be able to get away with something similar to our current `memory.grow` semantics in the case of a race, where a thread's individual memory accesses may each non-deterministically observe/not observe any concurrent (racy) `mmap`/`mprotect`, unless there is some other synchronisation (e.g. through a paired atomic read-write) which fixes whether the operation is visible or not. This is beyond what the POSIX spec guarantees, but might be satisfied by real OS behaviours (and is probably good enough for real user programs). AFAIU this is a rather underexplored area semantically.
> I really like this proposal and I am glad it is happening now! 👍
>
> One additional use case I can think of is to implement the equivalent of `.rodata` sections in ELF, which hold read-only data. We could consider an extension to active memory segments and an extension to memory declarations to declare read-only ranges, and segments to be loaded into those read-only ranges prior to the start function, so that memory is not ever observably uninitialized.
Thanks @titzer! Interesting use case.
> If stopping the world isn't acceptable, we might be able to get away with something similar to our current `memory.grow` semantics in the case of a race, where a thread's individual memory accesses may each non-deterministically observe/not observe any concurrent (racy) `mmap`/`mprotect`, unless there is some other synchronisation (e.g. through a paired atomic read-write) which fixes whether the operation is visible or not. This is beyond what the POSIX spec guarantees, but might be satisfied by real OS behaviours (and is probably good enough for real user programs). AFAIU this is a rather underexplored area semantically.
I'm hoping that we will be able to get away with the current `memory.grow` semantics. AFAIK, we haven't encountered issues in the wild with racy grow calls, though I expect that it is more observable with an `mmap` call.
Exciting!
> `memory.map`: Provide the functionality of `mmap(addr, length, PROT_READ|PROT_WRITE, MAP_FIXED, fd)` on POSIX, and `MapViewOfFile` on Windows with access `FILE_MAP_READ`/`FILE_MAP_WRITE`.
What are you imagining the operands to the `memory.map` instruction would be? Core Wasm doesn't have file descriptors or handles, but WASI and Web APIs do have analogous concepts, so this file-mapping functionality seems more appropriate for WASI and/or Web APIs than for core Wasm instructions in my mind.
> `memory.discard`: Provide the functionality of `madvise(MADV_DONTNEED)`, and `VirtualFree(MEM_DECOMMIT); VirtualAlloc(MEM_COMMIT)` on Windows.
To be clear, the intended semantics for `memory.discard` is to zero the given memory region, correct?

I ask only because accessing pages after `madvise(MADV_DONTNEED)` doesn't always give zero pages: if the memory region is a shared mapping to an underlying file, then subsequent accesses will repopulate pages from that underlying file instead of using zero pages. It isn't 100% clear to me whether it is intended for `memory.discard` to have this behavior as well.
Is the expectation that Wasm engines running in environments without virtual memory will simply not implement or disable this proposal?
Just double checking: these instructions would all require that the memory regions they operate upon be page aligned and multiple-of-page-size sized, right? I suppose they could take their operands in units of pages, rather than bytes, to enforce this, similar to `memory.grow`.
Overall:

- I like the idea of exposing virtual memory and protection functionality via `memory.protect` and `memory.discard` instructions in core Wasm.
- I think that the file mapping functionality should probably be built on top of the new core Wasm virtual memory functionality in either the WASI and/or Web API layers (instead of inside core Wasm; i.e. there should not be `memory.{map,unmap}` core Wasm instructions).
- I like "option 1" of having static memory immediates for the new instructions, rather than introducing memory references. It is easier to start with, and we can always introduce memory references and indirect versions of these new instructions (and loads/stores/`memory.copy`/etc) that operate on memory references at a later time, if needed, which would be analogous to how we have both `call` and `call_indirect` instructions.
I generally agree with @fitzgen here. I think we should put first-class memories into their own proposal; we'll have to design a way to allow all memory operations that currently have static memory indexes to take a first class memory, and that mechanism should probably be uniform.
I also agree that file mapping is probably best handled at a different layer, so I think it may be out of scope here too.
> Exciting!
>
> > `memory.map`: Provide the functionality of `mmap(addr, length, PROT_READ|PROT_WRITE, MAP_FIXED, fd)` on POSIX, and `MapViewOfFile` on Windows with access `FILE_MAP_READ`/`FILE_MAP_WRITE`.
>
> What are you imagining the operands to the `memory.map` instruction would be? Core Wasm doesn't have file descriptors or handles, but WASI and Web APIs do have analogous concepts, so this file-mapping functionality seems more appropriate for WASI and/or Web APIs than for core Wasm instructions in my mind.
Thanks @fitzgen for the detailed feedback.
I'll start with an example here to clearly scope the problem that I'd like to tackle. Let's say WebGPU maps a GPUBuffer that produces an ArrayBuffer, or an ArrayBuffer is populated as the result of using file handling APIs like `Blob.arrayBuffer()`/`FileReader.readAsArrayBuffer()`; the contents of this ArrayBuffer then need to be directly accessible to a Wasm module to avoid copying in/out of the Wasm linear memory.
While I also agree that file descriptors are out of place here, I don't necessarily agree that a `map` instruction is out of place as a core Wasm instruction. In my mental model, I expect that if it is possible for a Wasm module to have additional memory to operate on, that action should be explicit in the form of a core instruction, i.e. a file mapping API at a different layer would still need a core Wasm instruction that would be called. I'm having trouble thinking through how this would work if the functionality was only provided at a different layer. The linear memory still needs to be defined in a module, or imported into a module; how would this be accessible inside Wasm?
I think the operands to `memory.map` (just in the context of Option 2) would be as follows:

- `index`: which specifies the memory index
- `pages`: length in number of pages
- `prot`: bit field for write protections; this may be extended to include whether a memory can be grown
- `addr`: pointer to the backing store of an ArrayBuffer for the above example (not the best name because it confuses `mmap` arguments; I'm also unfamiliar with what would work for WASI in this case, but happy to look into it more to generalize this better)
> > `memory.discard`: Provide the functionality of `madvise(MADV_DONTNEED)`, and `VirtualFree(MEM_DECOMMIT); VirtualAlloc(MEM_COMMIT)` on Windows.
>
> To be clear, the intended semantics for `memory.discard` is to zero the given memory region, correct?
>
> I ask only because accessing pages after `madvise(MADV_DONTNEED)` doesn't always give zero pages: if the memory region is a shared mapping to an underlying file, then subsequent accesses will repopulate pages from that underlying file instead of using zero pages. It isn't 100% clear to me whether it is intended for `memory.discard` to have this behavior as well.
The intended behavior is to zero the memory pages, I'll look into potential options some more.
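To make the distinction @fitzgen raised concrete, here is a minimal POSIX sketch, assuming Linux `madvise` semantics and a hypothetical pre-existing `data.bin` file of at least one page (error handling omitted): anonymous private pages read back as zeroes after `MADV_DONTNEED`, while pages of a `MAP_SHARED` file mapping are repopulated from the file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    size_t len = 4096;

    /* Anonymous private mapping: after MADV_DONTNEED, reads see zero pages. */
    unsigned char *anon = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    anon[0] = 42;
    madvise(anon, len, MADV_DONTNEED);
    printf("anon after DONTNEED: %d\n", anon[0]);     /* prints 0 */

    /* Shared file mapping: discarded pages are refilled from the file,
       so the "discard" is not observable as zeroing. */
    int fd = open("data.bin", O_RDWR);                /* hypothetical file */
    unsigned char *shared = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_SHARED, fd, 0);
    madvise(shared, len, MADV_DONTNEED);
    printf("shared after DONTNEED: %d\n", shared[0]); /* file byte, not 0 */
    return 0;
}
```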
> Is the expectation that Wasm engines running in environments without virtual memory will simply not implement or disable this proposal?
Yes, though I expect that it would be possible to polyfill if needed. I'm not sure that that would be particularly useful.
> Just double checking: these instructions would all require that the memory regions they operate upon be page aligned and multiple-of-page-size sized, right? I suppose they could take their operands in units of pages, rather than bytes, to enforce this, similar to `memory.grow`.
Yes, my expectation is that all operands are in units of pages, consistent with the `memory.grow` operation.
> Overall:
>
> - I like the idea of exposing virtual memory and protection functionality via `memory.protect` and `memory.discard` instructions in core Wasm.
> - I think that the file mapping functionality should probably be built on top of the new core Wasm virtual memory functionality in either the WASI and/or Web API layers (instead of inside core Wasm; i.e. there should not be `memory.{map,unmap}` core Wasm instructions).
> - I like "option 1" of having static memory immediates for the new instructions, rather than introducing memory references. It is easier to start with, and we can always introduce memory references and indirect versions of these new instructions (and loads/stores/`memory.copy`/etc) that operate on memory references at a later time, if needed, which would be analogous to how we have both `call` and `call_indirect` instructions.
> I'll start with an example here to clearly scope the problem that I'd like to tackle. Let's say WebGPU maps a GPUBuffer that produces an ArrayBuffer, or an ArrayBuffer is populated as the result of using file handling APIs like `Blob.arrayBuffer()`/`FileReader.readAsArrayBuffer()`; the contents of this ArrayBuffer then need to be directly accessible to a Wasm module to avoid copying in/out of the Wasm linear memory.
Agreed that this use case is very motivating.
> I think the operands to `memory.map` (just in the context of Option 2) would be as follows:
>
> - `index`: which specifies the memory index
> - `pages`: length in number of pages
> - `prot`: bit field for write protections; this may be extended to include whether a memory can be grown
> - `addr`: pointer to the backing store of an ArrayBuffer for the above example (not the best name because it confuses `mmap` arguments; I'm also unfamiliar with what would work for WASI in this case, but happy to look into it more to generalize this better)
What is the representation of a pointer to the backing store of an `ArrayBuffer` here? Is it an `externref` (or some other kind of reference) that JS passes in? Is it an integer indexing into some table maintained on the JS side of things? How does core Wasm get/create one?
It seems to me like this API/functionality fundamentally involves communicating with, and making assumptions about, the host. Therefore this belongs in WASI/Web APIs, not core Wasm, in my mind.
> I'm having trouble thinking through how this would work if the functionality was only provided at a different layer. The linear memory still needs to be defined in a module, or imported into a module; how would this be accessible inside Wasm?
What I am imagining is that there would be a JS API basically identical to what you've described for the `memory.map` instruction, but because it is a JS API it can just take an `ArrayBuffer` as its `addr`/file descriptor/handle argument directly and side-step the questions raised above.
Something like this:
```js
// Grab the Wasm memory.
let memory = myWasmMemory();
// Grab the array buffer we want to share with Wasm.
let buffer = myArrayBuffer();
// Length of the buffer, in Wasm pages.
let page_len = Math.ceil(buffer.byteLength / 65536);
// The memory protections.
let prot = WebAssembly.Memory.PROT_READ | WebAssembly.Memory.PROT_WRITE;
// Map the array buffer into this memory!
memory.map(page_len, prot, buffer);
```
Then, if you wanted to create a new mapping from inside Wasm, you would import a function that allowed you to have your own scheme for identifying which array buffer you wanted to map (maybe coming up with your own "file descriptor" concept, since you can safely make assumptions about the host and include your own JS glue to maintain the fd-to-`ArrayBuffer` mapping on the JS side).
The linear memory would still be defined inside Wasm, as if it were just another memory. And it would be just another memory, until the JS API was called on it and the array buffer got mapped in.
There could be an analogous API for WASI. (Although, at the risk of going into the weeds a little bit here, one of WASI's goals is for all APIs to be virtualizable, and this API wouldn't be. Making it virtualizable would require a `memory.map` instruction where you could overlay views of an existing memory onto another memory. That is a bit more powerful than anything we've been talking about in this thread thus far.)
> This was also previously discussed as an addition to the MVP, and more recently as an option for better memory management.
The 2 links you used there are the same URL. Did you mean for the latter one to be https://github.com/WebAssembly/design/issues/1397 ?
> > I'll start with an example here to clearly scope the problem that I'd like to tackle. Let's say WebGPU maps a GPUBuffer that produces an ArrayBuffer, or an ArrayBuffer is populated as the result of using file handling APIs like `Blob.arrayBuffer()`/`FileReader.readAsArrayBuffer()`; the contents of this ArrayBuffer then need to be directly accessible to a Wasm module to avoid copying in/out of the Wasm linear memory.
>
> Agreed that this use case is very motivating.
>
> > I think the operands to `memory.map` (just in the context of Option 2) would be as follows:
> >
> > - `index`: which specifies the memory index
> > - `pages`: length in number of pages
> > - `prot`: bit field for write protections; this may be extended to include whether a memory can be grown
> > - `addr`: pointer to the backing store of an ArrayBuffer for the above example (not the best name because it confuses `mmap` arguments; I'm also unfamiliar with what would work for WASI in this case, but happy to look into it more to generalize this better)
>
> What is the representation of a pointer to the backing store of an `ArrayBuffer` here? Is it an `externref` (or some other kind of reference) that JS passes in? Is it an integer indexing into some table maintained on the JS side of things? How does core Wasm get/create one?
>
> It seems to me like this API/functionality fundamentally involves communicating with, and making assumptions about, the host. Therefore this belongs in WASI/Web APIs, not core Wasm, in my mind.
For Option 1, I expect this to be an `externref`. For Option 2 this is more flexible, i.e. if we did introduce the concept of a generic `memoryref`, then I expect that there would be a table on the Wasm side, and we would need additional instructions to manipulate memory references.
> I'm having trouble thinking through how this would work if the functionality was only provided at a different layer. The linear memory still needs to be defined in a module, or imported into a module; how would this be accessible inside Wasm?
> What I am imagining is that there would be a JS API basically identical to what you've described for the `memory.map` instruction, but because it is a JS API it can just take an `ArrayBuffer` as its `addr`/file descriptor/handle argument directly and side-step the questions raised above.
>
> Something like this:
>
> ```js
> // Grab the Wasm memory.
> let memory = myWasmMemory();
> // Grab the array buffer we want to share with Wasm.
> let buffer = myArrayBuffer();
> // Length of the buffer, in Wasm pages.
> let page_len = Math.ceil(buffer.byteLength / 65536);
> // The memory protections.
> let prot = WebAssembly.Memory.PROT_READ | WebAssembly.Memory.PROT_WRITE;
> // Map the array buffer into this memory!
> memory.map(page_len, prot, buffer);
> ```
> Then, if you wanted to create a new mapping from inside Wasm, you would import a function that allowed you to have your own scheme for identifying which array buffer you wanted to map (maybe coming up with your own "file descriptor" concept, since you can safely make assumptions about the host and include your own JS glue to maintain the fd-to-`ArrayBuffer` mapping on the JS side).
>
> The linear memory would still be defined inside Wasm, as if it were just another memory. And it would be just another memory, until the JS API was called on it and the array buffer got mapped in.
> There could be an analogous API for WASI. (Although, at the risk of going into the weeds a little bit here, one of WASI's goals is for all APIs to be virtualizable, and this API wouldn't be. Making it virtualizable would require a `memory.map` instruction where you could overlay views of an existing memory onto another memory. That is a bit more powerful than anything we've been talking about in this thread thus far.)
Ah, I see what you mean. My intent with proposing core Wasm instructions for `map`/`unmap` was to see if there's a way to make the module having access to additional memory more explicit, instead of implicit through the API. But I do agree with you that trying to do so does make assumptions about the host environment. If the current use is limited to the JS/Web use case of being able to map ArrayBuffers in, I would not be opposed to starting with an API-only function, and revisiting the addition of core Wasm instructions if needed (I touch on this in the 3rd bullet point of Option 1, but on re-reading I realize that doesn't provide sufficient detail).
> This was also previously discussed as an addition to the MVP, and more recently as an option for better memory management.
The 2 links you used there are the same URL. Did you mean for the latter one to be #1397 ?
I did! Thanks for catching, I've updated the OP.
> Ah, I see what you mean. My intent with proposing core Wasm instructions for `map`/`unmap` was to see if there's a way to make the module having access to additional memory more explicit, instead of implicit through the API. But I do agree with you that trying to do so does make assumptions about the host environment. If the current use is limited to the JS/Web use case of being able to map ArrayBuffers in, I would not be opposed to starting with an API-only function, and revisiting the addition of core Wasm instructions if needed.
There could be some analogy here to the way we currently think about thread creation. The core Wasm spec could describe how instructions interact with a "mapped memory" (cf. "shared memory"), without specifying core Wasm instructions for creating/altering the mapping (at least as an MVP). Web environments would want a host API to create mappings to `ArrayBuffer`, while non-Web environments might want a host API that creates mappings based on (e.g.) WASI file handles. So even if the current use-cases aren't restricted to just JS/the Web, an API-first approach could be viable.
> There could be some analogy here to the way we currently think about thread creation. The core Wasm spec could describe how instructions interact with a "mapped memory", without specifying core Wasm instructions for creating/altering the mapping (at least as an MVP). Web environments would want a host API to create mappings to `ArrayBuffer`, while non-Web environments might want a host API that creates mappings based on (e.g.) WASI file handles. So even if the current use-cases aren't restricted to just JS/the Web, an API-first approach could be viable.
Yes, exactly. Thank you for stating this so succinctly!
Somewhat related: discussion on address space related features in Memory64: https://github.com/WebAssembly/memory64/issues/4
I almost certainly do not understand the limitations that browsers are subject to w.r.t. memories that make it necessary to implement mmap functionality in terms of multi-memory (as opposed to being addressable by a single linear memory pointer), but I do feel this is unfortunate. I foresee lots of use cases in programming languages and other systems that would not work without a single address space, or without languages like C/C++ being able to use regular pointers to address all of it.
And if languages like C/C++ can't natively write to it but would need intrinsics/library functions to emulate access to a secondary memory (which would not allow reuse of buffer creation code in those languages), then there would be no use implementing it with multi-memory underneath. Likely, code in those languages would need to copy things anyway, in which case a `memcpy` with an extra memory argument would suffice.
> Generalizing what this would need to look like, we need to store granular page level details for the memory, which complicates the engine implementations

Why would that be required? To me, the biggest issue with the features indicated in the above discussion would be what happens if the system is unable to commit a page (assuming pages were reserved without guaranteeing physical/page file availability). But assuming that can be solved, actual access should be possible with existing load/store ops without further information?
> To me, the biggest issue with the features indicated in the above discussion would be what happens if the system is unable to commit a page (assuming pages were reserved without guaranteeing physical/page file availability).
AFAICT this can already happen with a large `memory.grow` operation on most engines, which typically reserve (32-bit) memories and change protections upon grow. The underlying OS demand-pages these mappings and technically could go OOM on any memory access, even on pages that were previously mapped, if it has swapped them to disk and memory is no longer available.
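As a simplified sketch of that engine strategy, assuming POSIX and a 64-bit host (names like `memory_grow` are illustrative, not any particular engine's code): the full 32-bit index space plus a guard region is reserved once with `PROT_NONE`, and `memory.grow` only flips page protections, so the OS commits physical pages lazily and an out-of-memory condition can surface much later, on first touch.

```c
#include <stdint.h>
#include <sys/mman.h>

#define WASM_PAGE 65536ULL
#define INDEX_SPACE (1ULL << 32)   /* full 32-bit index space */
#define GUARD (1ULL << 31)         /* guard region for offset folding */

static uint8_t *base;
static uint64_t committed_pages;

/* Reserve address space once; no physical memory is committed yet. */
int memory_init(void) {
    base = mmap(NULL, INDEX_SPACE + GUARD, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return base == MAP_FAILED ? -1 : 0;
}

/* memory.grow: just change protections; the OS demand-pages the rest.
   (Max-size/overflow checks omitted for brevity.) */
int64_t memory_grow(uint64_t delta_pages) {
    uint64_t old = committed_pages;
    if (mprotect(base + old * WASM_PAGE, delta_pages * WASM_PAGE,
                 PROT_READ | PROT_WRITE) != 0)
        return -1; /* grow fails here, but OOM can still hit later accesses */
    committed_pages = old + delta_pages;
    return (int64_t)old;
}
```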
@aardappel:

> I almost certainly do not understand the limitations that browsers are subject to w.r.t. memories that make it necessary to implement mmap functionality in terms of multi-memory (as opposed to being addressable by a single linear memory pointer),
Although I do see that there are some implementation challenges, I basically agree with this, and I think we should explore the design and implementation spaces for the VM functions in the context of memory 0 before assuming that it is absolutely necessary to go multi-memory.
Multi-memory has uncertain utility in unified address space languages, and the present proposal seems even more oriented toward the classical languages that are the most tied to unified address spaces than is the multi-memory proposal itself. For the present proposal there is therefore a heavy burden on the champions to demonstrate that tools that people will want to use can be applied effectively in a multi-memory setting.
IIUC the proposal to forbid these operations on the default memory was motivated by a desire to avoid impacting the performance of modules not using these features. Could this instead be accomplished by making a type-level distinction between "mappable" and "non-mappable" memories (again, akin to the current distinction between "shared" and "non-shared")?
In this case, there would be no issue with declaring the default memory of newly-generated modules as "mappable" if required, although there might be some compositionality issues with previously-generated modules.
Certainly an attribute could be made to work to control the code emitted. (It would be nice to avoid it if we can, though, and that comes back to my point about exploring the implementation space after pinning down in some detail the use cases and usage patterns.)
> Why would that be required?

Currently the linear space is homogeneous, but if we were to allow mapping/protection changes in linear memory, that would no longer be the case. If we did spec memory operations for the default memory, I would expect them to operate on page boundaries. This means that pages that were once ordinary adjacent pages can now be mapped or read-only pages. There is possibly a design space where we could declare upfront that some section of memory is 'mappable', and then we wouldn't need to work at page granularity, but would that be sufficiently useful?
> AFAICT this can already happen with a large memory.grow operation on most engines, which typically reserve (32-bit) memories and change protections upon grow. The underlying OS demand-pages these mappings and technically could go OOM on any memory access, even on pages that were previously mapped if it's swapped them to disk and memory is no longer available.

This is true, but there is a clear signal for when to expect OOM on memory accesses, i.e. when a grow fails. The `map` + `unmap` case is different, though, in that memory accesses that previously were successful would fail after an `unmap`; and if we were to allow mapping anywhere in the linear address space, the fact that there can be an inaccessible chunk of memory in the middle of a JS ArrayBuffer seems too low-level a detail to expose.
Aside from this, some other practical challenges would be:

- The `memory.buffer` getter, which would need to expose a different API that surfaces slices of the memory as different ArrayBuffers. Using multiple memories makes this somewhat easier, if we could change protections at memory granularity for example.
- `map`/`unmap` calls with buffers that are backed by linear memory.

In general, I'm not convinced that reserving a large chunk of memory upfront, but then providing functionality that can render chunks in the same space inaccessible, is a robust approach.

> Certainly an attribute could be made to work to control the code emitted. (It would be nice to avoid it if we can, though, and that comes back to my point about exploring the implementation space after pinning down in some detail the use cases and usage patterns.)
I'm currently working on gathering usage patterns, and I agree that that would influence the implementation space the most.
Adding my take on this problem space after getting to chat with @lars-t-hansen a bit:
From my understanding of the shape of the necessary clang/llvm extensions that would allow C/C++/Rust to operate on non-default memories, I can only imagine it working on C/C++/Rust code that was carefully (re)written to use the new extensions -- I'm not aware of any automatic techniques for porting large swaths of code that isn't just the shared-nothing approach of the component model (where you copy at boundaries between separate modules which each individually use distinct single-memories; to wit, wasm-link polyfills module-linking+interface-types using multi-memory in exactly this manner). Thus, I think there's still a certain burden of proof to show that there is real demand for additional multi-memory-based features.
Independently, I think we can make great progress in the short-term improving the functionality of default linear memories. In particular, I can see each of the following 3 features allowing optimized implementations on most browsers today with a graceful fallback path when no OS/MMU support is available:
- `memory.discard`, as already discussed above. Graceful fallback to `memset(0)`.
- A new optional immediate on `memtype` declaring a trap-on-access low region of memory (either fixed to 1 wasm page or configurable), enabling reliable trap-on-null. In the absence of MMU support, an engine can implement this by simply performing an unsigned subtraction of the static size of the inaccessible region (such that wraparound causes the subsequent bounds check to fail). The corresponding JS API `memory.buffer` `ArrayBuffer` can be specified to alias only the accessible region (which does mean all pointer indices into it need to be offset... but I think that's probably the right tradeoff).
- A new set of primitives to enable Copy-On-Write mapping of immutable byte-buffers (such as `File`, `Blob` and `ImageBitmap` on the Web platform). As a sketch: there could be a new `bufferref` reference type (passed in from the host), along with `buffer.map` and `buffer.unmap` operations. Semantically, `buffer.map` copies a subrange of a `bufferref` into linear memory at a given offset, returning a handle to a new "mapping" of type `mappingref`, and `buffer.unmap` takes a `mappingref` and zeroes out the previously-mapped region. The point is that `buffer.map` can be implemented via `mmap(MAP_FIXED|MAP_PRIVATE)` and `buffer.unmap` via `madvise(MADV_DONTNEED)`. The immutability is critical for ensuring copy semantics since `mmap` is lazy. (Windows-knowing folks may worry about the absence of a `MAP_FIXED` equivalent in `VirtualAlloc` and the consequent race condition if `buffer.map` performs `VirtualFree` followed by `MapViewOfFile` and another thread in the same process `VirtualAlloc`s into the hole -- I bugged our Chakra colleagues about this relentlessly back in the day until they got the kernel team to add the `PLACEHOLDER` flags to `VirtualAlloc2` (available in Windows 10).)

Lastly, outside of Core WebAssembly, but for completeness: to minimize copies of non-immutable-`Blob`-like things, I think we should extend `ReadableStreamBYOBReader` to additionally accept `[Shared] Uint8Array`s that are not detached, but, rather, racily written into from host threads (as previously proposed). This would allow streams (produced by `WebTransport`, `WebCodec`, `WebRTC`, ...) to quite efficiently emplace data into wasm linear memory. In theory, with this design, the one necessary copy from kernel space into user space can be used to write directly into linear memory. (Note that, anticipating this specialized use case of shared memory, while `postMessage(SharedArrayBuffer)` is gated by COOP/COEP, `new WebAssembly.Memory({shared:true})` is not, and thus this extension could be used unconditionally on the Web platform.) In a browser-independent setting, the Interface Types `stream` type constructor we're iterating on should allow analogous optimizations, and bind to WHATWG streams in the JS API in terms of `ReadableStreamBYOBReader`.

Together, I think these 4 features would address a decent amount of the use cases for `mmap`/`mprotect` without incurring the portability/safety challenges of the fully-general versions of these features in default linear memory or the adoption challenges with multi-memory.
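For reference, a hedged sketch of the Windows 10+ placeholder dance that the parenthetical above alludes to (simplified, minimal error handling; requires linking against `mincore.lib`/`onecore.lib`): reserving a placeholder first and then replacing it atomically with the file view closes the window in which another thread could allocate into a transient hole.

```c
#include <windows.h>
#include <memoryapi.h> /* VirtualAlloc2 / MapViewOfFile3, Windows 10 1803+ */

/* Map 'size' bytes of 'section' at a freshly reserved address, race-free. */
void *map_fixed_view(HANDLE section, SIZE_T size) {
    /* 1. Reserve a placeholder: address space is claimed but unusable. */
    void *placeholder = VirtualAlloc2(GetCurrentProcess(), NULL, size,
                                      MEM_RESERVE | MEM_RESERVE_PLACEHOLDER,
                                      PAGE_NOACCESS, NULL, 0);
    if (!placeholder)
        return NULL;

    /* 2. Atomically replace the placeholder with the file view; there is
          no moment at which another thread could VirtualAlloc into a hole. */
    void *view = MapViewOfFile3(section, GetCurrentProcess(), placeholder,
                                0, size, MEM_REPLACE_PLACEHOLDER,
                                PAGE_READWRITE, NULL, 0);
    if (!view)
        VirtualFree(placeholder, 0, MEM_RELEASE);
    return view;
}
```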
> Why not just map/unmap to the single linear memory, or memory(0)? ... At minimum, I expect that more memory accesses would need to be bounds checked, and write protections would also add extra overhead.

Why would there need to be any additional bounds checking? If a mapped region is overlaid on the linear memory, the wasm code could just use regular memory ops with the standard linear bounds checks.

Regarding the overhead, access protections would be handled by the VMM hardware. Given that the process is almost certainly already going to be operating through VMM translations, there should be little to no performance impact.
Thank you for creating this proposal. This was a major problem in the MVP.
Thanks @lukewagner for sketching this out, this is helpful.
> Adding my take on this problem space after getting to chat with @lars-t-hansen a bit:
>
> From my understanding of the shape of the necessary clang/llvm extensions that would allow C/C++/Rust to operate on non-default memories, I can only imagine it working on C/C++/Rust code that was carefully (re)written to use the new extensions -- I'm not aware of any automatic techniques for porting large swaths of code that isn't just the shared-nothing approach of the component model (where you copy at boundaries between separate modules which each individually use distinct single-memories; to wit, wasm-link polyfills module-linking+interface-types using multi-memory in exactly this manner). Thus, I think there's still a certain burden of proof to show that there is real demand for additional multi-memory-based features.
>
> Independently, I think we can make great progress in the short-term improving the functionality of default linear memories. In particular, I can see each of the following 3 features allowing optimized implementations on most browsers today with a graceful fallback path when no OS/MMU support is available:
> - `memory.discard`, as already discussed above. Graceful fallback to `memset(0)`.
> - A new optional immediate on `memtype` declaring a trap-on-access low region of memory (either fixed to 1 wasm page or configurable), enabling reliable trap-on-null. In the absence of MMU support, an engine can implement this by simply performing an unsigned subtraction of the static size of the inaccessible region (such that wraparound causes the subsequent bounds check to fail). The corresponding JS API `memory.buffer` `ArrayBuffer` can be specified to alias only the accessible region (which does mean all pointer indices into it need to be offset... but I think that's probably the right tradeoff).
> - A new set of primitives to enable Copy-On-Write mapping of immutable byte-buffers (such as `File`, `Blob` and `ImageBitmap` on the Web platform). As a sketch: there could be a new `bufferref` reference type (passed in from the host), along with `buffer.map` and `buffer.unmap` operations. Semantically, `buffer.map` copies a subrange of a `bufferref` into linear memory at a given offset, returning a handle to a new "mapping" of type `mappingref`, and `buffer.unmap` takes a `mappingref` and zeroes out the previously-mapped region. The point is that `buffer.map` can be implemented via `mmap(MAP_FIXED|MAP_PRIVATE)` and `buffer.unmap` via `madvise(MADV_DONTNEED)`. The immutability is critical for ensuring copy semantics since `mmap` is lazy. (Windows-knowing folks may worry about the absence of a `MAP_FIXED` equivalent in `VirtualAlloc` and the consequent race condition if `buffer.map` performs `VirtualFree` followed by `MapViewOfFile` and another thread in the same process `VirtualAlloc`s into the hole -- I bugged our Chakra colleagues about this relentlessly back in the day until they got the kernel team to add the `PLACEHOLDER` flags to `VirtualAlloc2` (available in Windows 10).)
Could you elaborate on how multiple mappings would work? I'm also thinking about what would happen after unmapping one external buffer when a different buffer now needs to be mapped in. One of the concerns I had was that, depending on the sizes of the buffers that we need, if unmapping makes regions of the existing memory inaccessible, then subsequent `buffer.unmap` operations leave larger and larger chunks of the linear memory inaccessible. Or does this approach sidestep that problem by using `madvise(MADV_DONTNEED)`, because the memory is not then inaccessible by default?
> Lastly, outside of Core WebAssembly, but for completeness: to minimize copies of non-immutable-`Blob`-like things, I think we should extend `ReadableStreamBYOBReader` to additionally accept `[Shared] Uint8Array`s that are not detached, but, rather, racily written into from host threads (as previously proposed). This would allow streams (produced by `WebTransport`, `WebCodec`, `WebRTC`, ...) to quite efficiently emplace data into wasm linear memory. In theory, with this design, the one necessary copy from kernel space into user space can be used to write directly into linear memory. (Note that, anticipating this specialized use case of shared memory, while `postMessage(SharedArrayBuffer)` is gated by COOP/COEP, `new WebAssembly.Memory({shared:true})` is not, and thus this extension could be used unconditionally on the Web platform.) In a browser-independent setting, the Interface Types `stream` type constructor we're iterating on should allow analogous optimizations, and bind to WHATWG streams in the JS API in terms of `ReadableStreamBYOBReader`.
More of an update here: my original concern with this was that not all of the use cases that this proposal is intending to target use streams. I'm currently still working on the subset of workloads that this proposal should handle well. That is still WIP, and I will report back here when I have more to share.
> Why would there need to be any additional bounds checking? If a mapped region is overlaid on the linear memory, the wasm code could just use regular memory ops with the standard linear bounds checks.

@mykmartin - Several Wasm engines have optimization strategies for getting rid of the standard linear bounds checks; using guard pages, for example, removes the need for the linear bounds checks under the assumption that the memory is owned by Wasm.
> Several Wasm engines have optimization strategies for getting rid of the standard linear bounds checks; using guard pages, for example, removes the need for the linear bounds checks under the assumption that the memory is owned by Wasm.
Ok, but how does a given region of the linear buffer being mapped onto affect that? From the wasm code's point of view, it's still just a regular lookup in the standard address space.
> > Several Wasm engines have optimization strategies for getting rid of the standard linear bounds checks; using guard pages, for example, removes the need for the linear bounds checks under the assumption that the memory is owned by Wasm.
>
> Ok, but how does a given region of the linear buffer being mapped onto affect that? From the wasm code's point of view, it's still just a regular lookup in the standard address space.
Sorry, I'm not sure how I missed this last question. To me this is different in a couple of different ways:
@dtig Awesome to hear about your stream WIP and I'm interested to hear more.
> Could you elaborate on how multiple mappings would work? I'm also thinking about what would happen after unmapping one external buffer when a different buffer now needs to be mapped in. One of the concerns I had was that, depending on the sizes of the buffers that we need, if unmapping makes regions of the existing memory inaccessible, then subsequent buffer.unmap operations leave larger and larger chunks of the linear memory inaccessible. Or does this approach sidestep that problem by using madvise(MADV_DONTNEED) because the memory is not then inaccessible by default?

Yup! You're correct in your final sentence: since ultimately `buffer.map` and `buffer.unmap` have copy semantics, they always leave all linear memory accessible and, after a `buffer.unmap`, zeroed. The real goal of the `madvise(DONTNEED)`, though, is to efficiently remove any dependency from the virtual memory on the previously-mapped file descriptor so it can be released or mutated.
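A minimal POSIX sketch of this engine-internal strategy, under the assumptions stated above (`buffer_map`/`buffer_unmap` are illustrative names; `dst` and `len` are assumed page-aligned; the unmap re-overlays anonymous memory so the pages really do read back as zeroes rather than refaulting from the file):

```c
#include <stdint.h>
#include <sys/mman.h>

/* buffer.map: CoW-copy 'len' bytes of fd@off into linear memory at 'dst'.
   MAP_PRIVATE keeps wasm-side writes invisible to the file; MAP_FIXED
   overlays the existing linear-memory pages in place. */
int buffer_map(uint8_t *linear_base, uint64_t dst, int fd, off_t off,
               size_t len) {
    void *p = mmap(linear_base + dst, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED, fd, off);
    return p == MAP_FAILED ? -1 : 0;
}

/* buffer.unmap: drop the dependency on fd and leave zeroed, accessible
   pages behind. Re-overlay anonymous memory, then let madvise release
   the physical pages lazily. */
int buffer_unmap(uint8_t *linear_base, uint64_t dst, size_t len) {
    void *p = mmap(linear_base + dst, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    return madvise(p, len, MADV_DONTNEED);
}
```

`MAP_PRIVATE` is what gives the copy semantics: the kernel only materializes private copies of pages that are actually written, and the original buffer is never mutated.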
Has anyone considered marking a region of memory as volatile, so statically compiled WebAssembly modules could implement memory-mapped I/O for device drivers?
> Ok, but how does a given region of the linear buffer being mapped onto affect that? From the wasm code's point of view, it's still just a regular lookup in the standard address space.

Sure, from the WASM code it's just a memory access. But that's not where the bounds check will be: the WASM implementation now needs to (in the worst case) check each memory access to see in which mapping it happens, create the correct pointer offset from the WASM memory offset, and handle the case when an unaligned memory access straddles a boundary.

Also, from the implementation side, very few memory-mapping APIs (I'm thinking of OpenGL's glMapBuffer and Vulkan's vkMapMemory) let the user code (read: the wasm implementation) pick where the mapping happens. This means that when a map is requested by WASM code, the implementation cannot simply tell the OS kernel to map it into the memory of the wasm module, because the API doesn't allow it.

Moreover, those mapping boundaries are dynamic, so you cannot inspect the module on load and find all the boundaries to create a perfect hash.

All this culminates in a pretty significant pessimization of the hottest part of a WASM implementation: the memory access.
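To make that cost concrete, here is a hypothetical sketch of the worst case (all names invented for illustration): every memory access first has to find the mapping containing the wasm address, e.g. via a binary search over a runtime-mutable table, before it can be rebased to a host pointer.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t wasm_start, wasm_end; /* [start, end) in wasm address space */
    uint8_t *host_base;            /* host address corresponding to wasm_start */
} Mapping;

extern Mapping mappings[];  /* sorted by wasm_start; mutated at runtime */
extern size_t num_mappings;
extern void trap(void);

/* Worst-case translation performed on every load/store: binary search,
   bounds check, rebase. (An unaligned access straddling two mappings
   would need even more work than this.) */
uint8_t *translate(uint32_t addr) {
    size_t lo = 0, hi = num_mappings;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < mappings[mid].wasm_start)
            hi = mid;
        else if (addr >= mappings[mid].wasm_end)
            lo = mid + 1;
        else
            return mappings[mid].host_base + (addr - mappings[mid].wasm_start);
    }
    trap(); /* no mapping covers this address */
    return NULL;
}
```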
A drive-by comment: `memory.discard` and a declarative read-only low region of memory make a lot of sense to me.
However when it comes to getting C/C++ programs to emit reads and writes from non-default memory, this is going to be as invasive both to programs and to the toolchain as natively accessing packed GC arrays. So perhaps we should focus effort there. GC will arrive soonish (right? lies we tell ourselves, anyway) and so maybe this feature isn't needed, as such.
It sure would be nice to solve this use case without exposing the `mmap` capability to users.
> However when it comes to getting C/C++ programs to emit reads and writes from non-default memory, this is going to be as invasive both to programs and to the toolchain as natively accessing packed GC arrays. So perhaps we should focus effort there. GC will arrive soonish (right? lies we tell ourselves, anyway) and so maybe this feature isn't needed, as such.
Do we have a separate thread somewhere about accessing GC packed arrays from C++? I think it has potential, though feasibility of the toolchain change is probably the main question.
@penzn Not sure if there is a thread, and though it's important for tying together all parts of the system, it's probably out of scope for wasm standardization. Anyway I just wrote up some thoughts here, for my understanding of the current state of things: https://wingolog.org/archives/2022/08/23/accessing-webassembly-reference-typed-arrays-from-c
- `mmap`, etc. should not have an impact on code that does not use such features (e.g. already existing wasm code).
- Code would use high addresses like `0xff123456` to refer to the special "virtual address area" when using the usual WASM load/store instructions.
- The virtual address area (VMA) would start at `2**32 - vma_size`. If no virtual address area is requested, the wasm compiler can completely optimise out any VMA checks, as is the case with 'legacy' code. Runtimes might want to limit the maximum size that can be reserved as VMA (e.g. MSB must always be set, etc.).
- `WebAssembly.instantiate` would return a VMA-handle with methods, in addition to the well-known `Memory` handle/object.
- Future operations could `mmap` into the existing wasm infrastructure. Ideally those future operations would use the virtual memory mapping facilities of the operating system / hardware (which might limit the possible start/end addresses of such regions to page-aligned addresses...). The user is basically free in how and where they place regions inside the VMA and what they should contain.
- For example, the user might map a `Blob` or (Shared-?)`ArrayBuffer` (or an aligned subarray of it?) into the region from address `0xfffe000` to `0xfffefff` inside the VMA. The user could later choose to remap the region to show 0x1000 bytes starting from the 0x1000-th byte of the Blob.

Some third-party applications already use negative addresses to flag memory areas as needing reversed byte order, such as on big-endian processors. If there are two different uses of negative addresses, they will conflict. Also, the WebAssembly standard is officially little-endian, so it is unlikely that endian swapping will get official support any other way. This is according to the w2c2 documentation and, I think, the wasm2native compiler (the third-party big-endian supporters).
> Some third-party applications already use negative addresses to flag memory areas as needing reversed byte order, such as on big-endian processors. If there are two different uses of negative addresses, they will conflict. Also, the WebAssembly standard is officially little-endian, so it is unlikely that endian swapping will get official support any other way. This is according to the w2c2 documentation and, I think, the wasm2native compiler (the third-party big-endian supporters).
Their reversed memory addressing (`mem[size - ptr]` instead of `mem[ptr]`) in the wasm compiler does not affect what shall happen when `ptr < 0`, resp. `ptr > 0xff123456` (e.g.), is accessed from a wasm instruction (load/store).
Oh ok. Thanks for clarifying.
Bump. This would be a massive leap in the ability to port existing libraries and applications to wasm, as well as generally increase memory efficiency for large wasm programs that make good use of the heap.
Trying to follow the discussions it seems the focus is on option 1 and a static number of memories. Why is it bad to allow dynamic creation/deletion of linear memories?
With the multi-memory proposal having the option to create and delete memories via instructions at runtime could in my opinion solve many problems mentioned in https://github.com/WebAssembly/design/issues/1397
`memory.create memargs` ... creates a new linear memory with the given memargs and returns the index to address this new memory. Traps/errors if it is not possible to create this memory (kind of like `memory.grow`).

`memory.delete memidx` ... "deletes" the memory addressed with the given index. Fails if index=0, to prevent deleting the default memory. No-op if the index does not point to a created memory. Embeddings may choose to keep the allocated memory for later `memory.create` calls.
`memory.mem_copy idx1 address1 len1 idx2 address2 len2` ... copies memory from one linear memory to another. Fails if either index is not valid or the addresses+lengths are not in memory range. Not happy about the instruction name, but I could not come up with a better one for now.
`memory.map_extern externref memargs` ... maps an extern memory using the given `memargs`, returning the memidx to access the memory. This could be used, for example, to exchange/share large blobs like media files between host and wasm. However, I am not sure how to ensure that the `externref` actually points to a `WasmMemory` with the given `memargs`.
As mentioned in issue #1397, applications often allocate a bunch of memory that is intended to be deleted again after a short period. If memories could be created and deleted during runtime, this would prevent fragmentation of other longer-lived memories.
This would be especially helpful for shared memories, because these must specify a maximum at creation. Knowing a maximum upfront is really difficult for most applications, but because of shared access, I get why a maximum must be set.
Side note on read-only memory:
With multiple-memories, one of the static memories could be marked as read-only.
This memory only takes values from the data segment, but does not allow `store` instructions.
I am not sure how useful it is to allow the creation of read-only memories, because this memory won't grow anyway.
Also changing between read-only and writeable at runtime seems unnecessary, because this must be restricted by the language compiling to wasm. Otherwise, one could always change the mode as needed, making it an inconvenience, but not a security feature.
Because memories have a size of at least one page, it is inefficient to create one memory per dynamic object, e.g. per vector, which results in manual memory management per linear memory. One would probably need some fixed address in a default linear memory that points to the tree of free memory blocks in each dynamically created linear memory. The good thing here is that the tree itself may be located in a dynamically created linear memory, because it grew over time, exceeding the maximum of the default linear memory.
`create` could always bump the index by 1, and because memidx is i32, this should even be sufficient for applications not running for multiple years. Alternatively, `create` returns the index of the last deleted memory, or increases the index by 1 if no memory was deleted yet.
There are definitely problems that explain why it was decided against dynamic memory creation and deletion. Happy to hear your feedback.
I think @lukewagner is on to something there, but I feel like it has never been fully articulated in this conversation.
What if an instance's linear memory being `(memory 0)` is a mistake that forces the entire spec down a garden path towards load/store-addressable additional memories?
There already is an existing solution for how to integrate multiple memories with different read/write capabilities in a flexible manner: `mmap`.

`mmap`ing even allows for the construction of things like virtual ring-buffers, where writers can write past the end of a doubly-`mmap`ed file to simplify the wrapping logic, and is therefore a strict superset of what the current multi-memory proposal is capable of.
Such a solution might look like:

- A single anonymous linear memory that is not accessible from outside the WASM instance.
- Current programming language compatible load/store instructions that only operate over the anonymous linear memory.
- A memory index similar to what's in the multi-memory proposal, with different read/write capabilities and ArrayBuffer/Blob/whatever sources.
- EXPLICIT mmap operations that map page ranges from the memories in the memory index onto page ranges in the anonymous linear memory.
This would give us the best of both worlds. Decoupling of multiple memories on both the host and the wasm side (the wasm instance can not only ignore additional memories offered by the host, but also has a lot of control of when and where things get moved around, e.g. when one of the memories changes in size and the wasm instance decides to either ignore that scenario, potentially re-map other memories, move other allocations around, or potentially even create non-contiguous mappings)
This would also align well with existing `mmap` semantics, which would help WASI match existing applications' requirements, with the memory index essentially being file descriptors of host-provided `mmap`able files.
Semantically, the `mmap` would just be a copy of the memory source region into the memory target region, with an `unmap` equivalent to zeroing the range, as proposed by @lukewagner.
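As an aside, a sketch of the virtual ring-buffer trick mentioned above, assuming Linux and `memfd_create` (error handling mostly omitted; `size` must be a multiple of the page size): the same physical pages are mapped twice, back to back, so a write running past the end of the first view lands at the start of the buffer.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a ring buffer of 'size' bytes whose contents appear twice,
   contiguously: base[i] and base[i + size] alias the same byte, so
   wrap-around reads/writes need no special-case logic. */
uint8_t *ring_create(size_t size) {
    int fd = memfd_create("ring", 0);
    if (fd < 0 || ftruncate(fd, size) < 0)
        return NULL;
    /* Reserve 2*size of address space, then overlay the file twice. */
    uint8_t *base = mmap(NULL, 2 * size, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    mmap(base, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    mmap(base + size, size, PROT_READ | PROT_WRITE,
         MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);
    return base;
}
```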
> I think @lukewagner is on to something there, but I feel like it has never been fully articulated in this conversation.
>
> What if an instance's linear memory being `(memory 0)` is a mistake that forces the entire spec down a garden path towards load/store-addressable additional memories?
>
> There already is an existing solution for how to integrate multiple memories with different read/write capabilities in a flexible manner: `mmap`.
>
> `mmap`ing even allows for the construction of things like virtual ring-buffers, where writers can write past the end of a doubly-`mmap`ed file to simplify the wrapping logic, and is therefore a strict superset of what the current multi-memory proposal is capable of.
>
> Such a solution might look like:
>
> - A single anonymous linear memory that is not accessible from outside the WASM instance.
> - Current programming language compatible load/store instructions that only operate over the anonymous linear memory.
> - A memory index similar to what's in the multi-memory proposal, with different read/write capabilities and ArrayBuffer/Blob/whatever sources.
> - EXPLICIT mmap operations that map page ranges from the memories in the memory index onto page ranges in the anonymous linear memory.
>
> This would give us the best of both worlds. Decoupling of multiple memories on both the host and the wasm side (the wasm instance can not only ignore additional memories offered by the host, but also has a lot of control of when and where things get moved around, e.g. when one of the memories changes in size and the wasm instance decides to either ignore that scenario, potentially re-map other memories, move other allocations around, or potentially even create non-contiguous mappings)
>
> This would also align well with existing `mmap` semantics, which would help WASI match existing applications' requirements, with the memory index essentially being file descriptors of host-provided `mmap`able files.
>
> Semantically, the `mmap` would just be a copy of the memory source region into the memory target region, with an `unmap` equivalent to zeroing the range, as proposed by @lukewagner.
I counter this proposal; wasm does not need fine-grained permission control. Think about it:
I think an mmap-like function would be great for wasm but only in the sense of making memory non-linear. The ability to mmap files is severely limited by the address space (32 bits)
> - Nonreadable memory makes no practical sense
I think it does, but only in the sense of making a hole in the address space that holds no data and traps when read or written, e.g. to catch null pointer dereferences in C-style languages.
> I think an mmap-like function would be great for wasm but only in the sense of making memory non-linear. The ability to mmap files is severely limited by the address space (32 bits)
memory64 to the rescue!
> wasm does not need fine-grained permission control

My proposal is not about naively stuffing native `mmap` into WASM, but about aligning the multiple-memory proposal (which I think is a good feature and a potential solution for many real-world issues) with the reality of our programming languages not being able to deal with a completely alien memory model where individual load and store instructions have fine-grained memory contexts.
In a sense I am arguing against a very fine grained permission model too.
I think that read-only memory is important for security, in addition to consistency, when working with `mmap`ed file IO, where you want to get read-only access to a zero-copy blob from wasm, or you want to be able to write to a network buffer from wasm which you then mark read-only/unmap to pass ownership of the memory to the host/network stack.
But I think it is even more important to decouple the semantics of multiple memories from the semantics of individual load/store instructions, and a mechanism that allows us to do so, and that has been tried and tested, is `mmap` (in this context I don't mean the specific implementation, but the concept of using the MMU to map certain memory/file/buffer ranges onto other virtual memory ranges).
The host context provides buffers (memories); the wasm context is given explicit control over if and how to `mmap` those buffers into its linear memory.
It would give existing languages explicit control over how they want to deal with multiple memories (including the option to ignore it, with the host potentially performing a single mapping of `(memory 0)` onto the anonymous linear memory to recover the current semantics), and would enable any language that has the ability to do `mmap`ed IO to immediately start using the multi-memory feature.
Just a clarifying remark: multiple memories always existed in Wasm, since version 1.0: by linking two modules together that both define their own memory you always had multiple unrelated address spaces.
The only limitation that is finally lifted by the multi-memory extension is the (weird) restriction that a single module was not able to speak about multiple memories. That caused various problems, for example, the inability to statically link (merge) arbitrary modules, or the inability to efficiently move data between such memories. There are other use cases for multiple memories, too, that don't require exposing them to a source language, for example, instrumentation or poly-filling other features.
FWIW, I'm in favor of adding some degree of support for read-only memory and no-access memory, if only to allow us to more-simply claim that wasm is an entirely more secure way to run code.
We already get a huge mitigative security boost from our protected stack and CFI, but the fact that `*(void*)NULL` doesn't trap and that we can't prevent mutation of `.rodata` unnecessarily hurts the simplicity of our claim and forces more nuanced arguments of pros-vs-cons (or "fine-grained sandboxing is what you actually want"). So I'm definitely in favor of closing these gaps; the only problem is that, especially now that wasm is showing up in all sorts of diverse contexts (in production, in high volume), we can't always assume we have an MMU or that we're given access to it. MPUs (Memory Protection Units) are, it sounds like, becoming a reasonable assumption to make even in embedded hardware, but they only support coarse-grained protection (a small constant number of regions with different protection).
As mentioned above and brainstormed further by @eqrion more recently, if we had a very coarse-grained protection model (M no-access pages starting at `0`, followed by N read-only pages, followed by all read-write pages up to `memory.length`, with M and N declared in the `memtype`), we could implement the semantics with an MPU or without any hardware assistance at low overhead. So I like that idea.
The only mitigative use case I'm aware of that this doesn't solve is linear-memory-stack guard pages, which would seem to still need fine-grained protection. But maybe stack-canaries implemented for wasm by LLVM are enough?
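A sketch of the no-MMU fallback for that coarse-grained model, as I read the "unsigned subtraction" idea from earlier in the thread (all constants and names are illustrative): subtracting the protected-region size before the single bounds comparison makes low addresses wrap around and fail the check, so loads get reliable trap-on-null and stores additionally exclude the read-only region.

```c
#include <stdint.h>

extern uint8_t *mem_base;    /* host pointer to start of accessible region */
extern uint32_t mem_length;  /* memory length in bytes (wasm address space) */
extern void trap(void);

#define NO_ACCESS 65536u     /* M no-access pages (here: 1 wasm page) */
#define READ_ONLY 65536u     /* N read-only pages after the hole */

/* Loads: one subtract + one compare. addr < NO_ACCESS wraps to a huge
   value and fails the bounds check, giving trap-on-null for free. */
uint8_t load8(uint32_t addr) {
    uint32_t eff = addr - NO_ACCESS;        /* wraps if addr < NO_ACCESS */
    if (eff >= mem_length - NO_ACCESS)
        trap();
    return mem_base[eff];
}

/* Stores: same trick, but the subtracted size also covers the read-only
   region, so writes below NO_ACCESS + READ_ONLY wrap and trap too. */
void store8(uint32_t addr, uint8_t v) {
    uint32_t eff = addr - (NO_ACCESS + READ_ONLY);
    if (eff >= mem_length - (NO_ACCESS + READ_ONLY))
        trap();
    mem_base[eff + READ_ONLY] = v;          /* rebase past the r/o region */
}
```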
Adding a note here that the work on this proposal has now moved to the memory control proposal repository, which reflects the current work. Feedback/issues on the proposal repository are appreciated so we can discuss them in more detail. Looking at the proposal repository, you may notice that there are several possible directions, though given how diverse the ecosystem is and the current restrictions of production VMs, we don't yet have consensus on exactly how we'll be tackling this.
> As mentioned above and brainstormed further by @eqrion more recently, if we had a very coarse-grained protection model (M no-access pages starting at 0 followed by N read-only pages followed by all read-write pages up to memory.length, with M and N declared in the memtype), we could implement the semantics with an MPU or without any hardware assistance at low overhead. So I like that idea.
I assume this is the sketch in static-protection.md? I like this idea too, my concern is that if this was to be fully static it would be hard for runtimes to motivate a fundamental memory layout change without some runtime control of the read-only section.
Thanks for the links to the more recent proposal repository; it looks like there are already some similar ideas articulated there!
Regarding the static protection proposal for MPUs, it feels like feature creep for WebAssembly to also try to become the universal IR for embedded systems. Introducing the difficulties of embedded programming to the web ecosystem seems like an unnecessarily masochistic restriction, when even smaller cores are slowly moving in the direction of having virtual memory.
Giving up fine-grained r/w control and memory-mapping APIs in return, which have real security, reliability, and performance applications, seems like a bad deal for everyone except embedded developers. And I say that with a somewhat large disdain for the complexity of MMUs and the memory stack of modern OSes.
It is OK to have different technologies that each solve their own use case well, and if Wasm wants to make a dent in the native application space it needs equal or better capabilities and guarantees; the lowest common denominator with embedded hardware won't fit that bill. Embedded folks will also probably be happier if they get their own specific thing/spec and don't have to foot the bill for high-level features like GC.
Edit: Embedded systems could also simply fail and abort when they get a memory mapping request that's not compliant with their MPU layout:
```js
const noaccess = new WebAssembly.Memory({ initial: 1, mode: "n" });
const readonly = new WebAssembly.Memory({ initial: 10, mode: "r" });
const readwrite = new WebAssembly.Memory({ initial: 100, mode: "rw" });
...
js: { nomem: noaccess, rmem: readonly, mem: readwrite }
```

```wat
(module
  (mmap (import "js" "nomem") 0 1)
  (mmap (import "js" "rmem") 1 10)
  (mmap (import "js" "mem") 11 100)
  (mmap (import "js" "rmem") 111 5) ;; this would panic on embedded
  ...
```
@somethingelseentirely
Giving up fine-grained r/w control and memory-mapping APIs in return, which have real security, reliability, and performance applications, seems like a bad deal for everyone except embedded developers.
I have some sympathy for this, in that there are a lot of powerful features that can offer real value to applications and are already in wide use. There is a lot more diversity in system APIs and capabilities, comparatively speaking, than in hardware ISAs. Constantly falling short of feature parity with native platforms or APIs limits Wasm's ability to add value to ecosystems, and limiting everything to the least common denominator will eventually let the least capable platform dictate that more capable platforms can't exist. So we'll need to manage ecosystem diversity in some way.
That said though, WebAssembly has threaded this needle before by deftly picking MVP features that capture the main value of a feature without unduly burdening implementations. What @lukewagner mentions, referring to work by @eqrion on a simplified model that provides effectively `PROT_NONE` and `PROT_READ` protections, could capture enough of this feature in a forward-compatible way so that the Wasm security story is at least at (some simplified level of) parity with native.
Very little of this thread has been dedicated to possible `malloc` implementations; after all, I assume part of the motivation behind better memory control is to reduce memory fragmentation. How well would having multiple memories solve this problem compared to, say, an `mmap`-based approach?
The linear memory associated with a WebAssembly instance is a contiguous, byte-addressable range of memory. In the MVP, each module or instance can only have one memory associated with it; this memory, at index zero, is the default memory.
The need for finer-grained control of memory has been in the cards since the early days of WebAssembly, and some functionality is also described in the future features document.
Motivation
Proposed changes
At a high level, this proposal aims to introduce the functionality of the instructions below:
- `memory.map`: Provide the functionality of `mmap(addr, length, PROT_READ|PROT_WRITE, MAP_FIXED, fd)` on POSIX, and `MapViewOfFile` on Windows with access `FILE_MAP_READ`/`FILE_MAP_WRITE`.
- `memory.unmap`: Provide the functionality of POSIX `munmap(addr, length)`, and `UnmapViewOfFile(lpBaseAddress)` on Windows.
- `memory.protect`: Provide the functionality of `mprotect` with `PROT_READ`/`PROT_WRITE` permissions, and `VirtualProtect` on Windows with the memory protection constants `PAGE_READONLY` and `PAGE_READWRITE`.
- `memory.discard`: Provide the functionality of `madvise(MADV_DONTNEED)`, and `VirtualFree(MEM_DECOMMIT); VirtualAlloc(MEM_COMMIT)` on Windows (see the sketch after this list).

Some options for next steps are outlined below; the instruction semantics will depend on the option. The intent is to pick the option that introduces the least overhead for mapping external memory into the Wasm memory space. Both options below assume that additional memories apart from the default memory will be available. The current proposal will only introduce `memory.discard` to work on the default memory; the other three instructions will only operate on memories not at index zero.
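As a side note on `memory.discard` semantics (my own gloss, not proposal text): for anonymous memory, both `madvise(MADV_DONTNEED)` and the Windows decommit/recommit pair make the discarded pages read back as zero. Today that observable behaviour can only be emulated from JS by zero-filling, which keeps the pages committed:

```js
// Hypothetical helper: emulate the observable effect of a discard by zeroing
// the region. A real memory.discard would also release the physical pages.
function emulateDiscard(memory, byteOffset, byteLength) {
  new Uint8Array(memory.buffer, byteOffset, byteLength).fill(0);
}

const mem = new WebAssembly.Memory({ initial: 2 });
emulateDiscard(mem, 65536, 65536); // "discard" the second page
```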
Option 1: Statically declared memories, with bind/unbind APIs (preferred)
- Provide JS bind/unbind APIs that use `memory.map`/`memory.unmap` underneath. (Note: it may be possible for some browser engines to operate on the same backing store without an explicit `map`/`unmap` instruction. If the only use case for these instructions is from JS, it is possible to make these API-only as needed.)
- Extend the `memtype` to store memory protections in addition to limits for size ranges.

Reasons for preferring this approach:
Option 2: First class WebAssembly memories
This is the more elegant approach to dynamically adding memories, but adding support for first-class memories is non-trivial.
- Introduce first-class memory references, e.g. `ref.mem`.
- Extend `memarg` to use memory references.

Other alternatives
Why not just map/unmap to the single linear memory, or memory(0)?
Web API extensions
One way to support WebAssembly owning the memory, while also achieving zero-copy data transfer, is to extend Web APIs to take typed array views as input parameters into which outputs are written. The advantage here is that the set of APIs that need this can be scaled incrementally over time, and it minimizes the changes to the WebAssembly spec.
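For a concrete flavor of the pattern, one shipped API that already works this way is `TextEncoder.encodeInto`, which writes its output directly into a caller-provided view; the memory, offset, and length below are made up for illustration:

```js
// Encode a string straight into a view over Wasm linear memory, no copy-out.
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new Uint8Array(memory.buffer, 1024, 64); // hypothetical region
const { read, written } = new TextEncoder().encodeInto("hello wasm", view);
```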
The disadvantages are that this would require changes to multiple Web APIs across different standards organizations, and it's not clear that the churn would result in a better data transfer story, as some APIs will still need to copy out.
This is summarizing a discussion from the previous issue in which this approach was discussed in more detail.
Using GC Arrays
Though the GC proposal is still in phase 1, it is very probable that ArrayBuffers will be passed back and forth between JS and Wasm. Currently this proposal does not assume functionality that is not already available; when it becomes available, we will evaluate with benchmarks what overhead it introduces. If at that time the mapping functionality is provided by the GC proposal without much overhead, and it makes sense to introduce a dependency on the GC proposal, this proposal will be scoped down to the remaining functionality outlined above.
JS API
Interaction of this proposal with JS is somewhat tricky because
Open questions
Consistent implementation across platforms
The functions provided above only include Windows 8+ details. Chrome still supports Windows 7 for critical security fixes, but only until January 2023, so this proposal will focus only on Windows system calls available on Windows 8+. Any consideration of older Windows users will depend on the usage stats of the interested engines.
How would this work in the tools?
While dynamically adding/removing memories is a key use case, C/C++/Rust programs operate in a single address space, and library code assumes that it has full access to that address space and can access any memory. With multiple memories, we would be introducing separate address spaces, so it's not clear what overhead we would be introducing.
Similarly, read-only memory is not easy to differentiate in the current model when all the data is in a single read-write memory.
How does this work in the presence of multiple threads?
In applications that use multiple threads, what calls are guaranteed to be atomic? On the JS side, what guarantees can we provide for Typed array views?
Feedback requested
All feedback is welcome, but specific feedback that I would find useful for this issue:
Repository link here if filing issues is more convenient.