Closed jfbastien closed 8 years ago
asm.js is passed an array buffer at creation time, and that buffer isn't growable. There was discussion of allowing resizing, but that isn't efficient on all engines, so it isn't supported.
The two engines which implement asm.js support this. And I just added ArrayBuffer.transfer, which Chrome/FF already have, so if Safari joins the mix (@pizlonator) it seems like this should be fine (and it was pretty easy to add, so I don't see much motivation against it).
In any case, I was envisioning that a wasm module requests a CommitHeapSize and a ReserveHeapSize, which would be a contiguous address space (i.e. `VirtualAlloc(null, ReserveHeapSize, MEM_RESERVE); VirtualAlloc(baseAddress, CommitHeapSize, MEM_COMMIT);`). I think being a page-size multiple is a useful feature, and 64k is apparently an option for page size on arm64, so that might be what we want (though it seems a little big to me).
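For readers more used to POSIX than Win32, the reserve/commit idea above can be approximated as follows. This is only a sketch under stated assumptions: `heap_t`, `heap_create` and `heap_commit_more` are hypothetical names, and mapping with `PROT_NONE` and later calling `mprotect` is an analog of `MEM_RESERVE`/`MEM_COMMIT`, not an exact equivalent.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical heap descriptor; field names follow the comment above. */
typedef struct {
    unsigned char *base;  /* start of the reserved range */
    size_t reserved;      /* ReserveHeapSize, in bytes */
    size_t committed;     /* CommitHeapSize, in bytes */
} heap_t;

/* Reserve `reserve` bytes of contiguous address space and make the first
   `commit` bytes usable. Both sizes are assumed to be page-size multiples.
   Returns 0 on success, -1 on failure. */
int heap_create(heap_t *h, size_t reserve, size_t commit)
{
    if (commit > reserve)
        return -1;
    /* PROT_NONE: address space only, any access faults. */
    void *p = mmap(NULL, reserve, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    if (commit > 0 && mprotect(p, commit, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, reserve);
        return -1;
    }
    h->base = p;
    h->reserved = reserve;
    h->committed = commit;
    return 0;
}

/* Grow the usable part of the heap by `delta` bytes without moving it. */
int heap_commit_more(heap_t *h, size_t delta)
{
    if (delta > h->reserved - h->committed)
        return -1;
    if (delta > 0 &&
        mprotect(h->base + h->committed, delta, PROT_READ | PROT_WRITE) != 0)
        return -1;
    h->committed += delta;
    return 0;
}
```

Because the whole range is reserved up front, growing the heap never changes `base`, which is the property the contiguous-address-space request relies on.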
What's currently in the design [1] is that a heap is private state of a module, created when the module is first loaded. Not explicitly documented yet is that the "memory initialization section" [2] would be the place to declare the initial size (and initial state, analogous to what native binaries do with `.data`, `.bss`, etc.).
The two problems I see with `mmap` (as opposed to `sbrk`) are that (1) it would be a major source of non-determinism if we allow `mmap` to choose the base address (and if not, what's the point), and (2) I don't see an efficient impl that allows a browser to address the contiguous-region-fragmentation problem: if you put wasm memory in two separate regions, you still have one index that needs to somehow disallow access to the intervening region, and `VirtualProtect`/`mprotect`ing on every call/return doesn't seem viable.
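To make problem (2) a bit more concrete, here is a tiny sketch of the check a single index would need in each case. The function names and the index-space layout (a low region, a hole, a high region) are illustrative assumptions, not anything from the design docs.

```c
#include <stdint.h>

/* One contiguous region [0, size): a single compare guards an access. */
int index_ok_contiguous(uint32_t index, uint32_t size)
{
    return index < size;
}

/* Two regions [0, lo_end) and [hi_start, hi_end) behind one index space:
   the hole [lo_end, hi_start) has to be rejected explicitly. */
int index_ok_split(uint32_t index, uint32_t lo_end,
                   uint32_t hi_start, uint32_t hi_end)
{
    return index < lo_end || (index >= hi_start && index < hi_end);
}
```

The split case needs extra compares on every access unless the hole is protected by the OS, which is where the page-protection cost mentioned above comes in.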
I don't really understand the proposal to let the module allocate its own memory since it would mean starting execution w/o a heap (so no `load`/`store`) which seems like a weird special mode that we'd have to implement. I also don't see the increase in expressiveness compared to just declaring an initial heap size and allowing `sbrk` at runtime.
It looks like the current docs mention `sbrk` as being present in the MVP. Should we define more precisely how that would work? I assume the idea is a posix-style "get a delta, return the previous absolute"? I can open a PR if that sounds ok.
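The "get a delta, return the previous absolute" behaviour could look roughly like this. It is a sketch only: `wasm_sbrk`, `HEAP_MAX`, and the use of `UINT32_MAX` as the failure value (mirroring POSIX's `(void *)-1`) are assumptions for illustration, and the break is tracked as an offset into the heap rather than a real address.

```c
#include <stdint.h>

#define HEAP_MAX 65536u  /* assumed upper bound for the sketch */

static uint32_t heap_break = 0;  /* current break, as a heap offset */

/* Moves the break by `delta` bytes and returns the previous break,
   or UINT32_MAX if the new break would fall outside [0, HEAP_MAX]. */
uint32_t wasm_sbrk(int32_t delta)
{
    int64_t next = (int64_t)heap_break + delta;
    if (next < 0 || next > (int64_t)HEAP_MAX)
        return UINT32_MAX;
    uint32_t prev = heap_break;
    heap_break = (uint32_t)next;
    return prev;
}
```

As with POSIX `sbrk`, calling it with a delta of 0 simply reports the current break.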
Also not clear to me in the docs is where `sbrk` would come from. Would it be imported from somewhere, or would it be an opcode?
Yes, I think it would be good to clarify after reaching agreement here.
I'd lean towards a limited developer-side `mmap`-ish API that toolchains happen to implement by default (and where implementation may round allocations up).
Fixed size at load time seems limiting, and so does `MEM_RESERVE`.
Developers don't choose where the heap is, they just choose its size (since wasm implies a hidden base).
What functionality are you proposing it would have that is more powerful than `sbrk`? (I'm curious where you fall in between `sbrk` and full POSIX `mmap`.)
For anything non-trivial, perhaps it makes sense to stick to `sbrk` for the MVP, and leave more sophisticated things for the future?
I'm not sure I'd initially give it that many more capabilities:

- `addr` should be `NULL`, though that's a bit moot since we're not necessarily exposing the actual virtual address.
- `length` could be restricted (min/max values) and have to be page-sized (see `sysconf` below).
- `prot` other than `PROT_READ | PROT_WRITE` can be rejected for now, although `PROT_NONE` makes sense too (see below).
- `flags` should probably be `MAP_ANONYMOUS`. I'd like to allow specifying `MAP_NORESERVE` or `MAP_POPULATE`, since they're useful in different applications. I'm not sure we should accept anything else.
- `fd` should be `-1`.
- `offset` should be `0`.

`munmap` should probably not be available for MVP, and added later.

We should probably offer `sysconf(_SC_PAGE_SIZE)` or something analogous.
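The restrictions listed above can be sketched as a thin validating wrapper around the host `mmap`. `restricted_mmap` is a hypothetical name, the sketch is Linux-specific (`MAP_NORESERVE` and `MAP_POPULATE` are not in POSIX), and adding `MAP_PRIVATE` internally is an assumption made here because Linux requires one of `MAP_PRIVATE` or `MAP_SHARED`.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Accepts only the argument combinations from the list above; anything
   else fails with EINVAL before reaching the host mmap. */
void *restricted_mmap(void *addr, size_t length, int prot, int flags,
                      int fd, off_t offset)
{
    const int allowed = MAP_ANONYMOUS | MAP_NORESERVE | MAP_POPULATE;
    long page = sysconf(_SC_PAGE_SIZE);

    int ok = addr == NULL
          && page > 0 && length > 0 && length % (size_t)page == 0
          && (prot == (PROT_READ | PROT_WRITE) || prot == PROT_NONE)
          && (flags & MAP_ANONYMOUS) != 0
          && (flags & ~allowed) == 0
          && fd == -1
          && offset == 0;
    if (!ok) {
        errno = EINVAL;
        return MAP_FAILED;
    }
    /* MAP_PRIVATE is added here as an assumption of the sketch. */
    return mmap(NULL, length, prot, flags | MAP_PRIVATE, -1, 0);
}
```

A call such as `restricted_mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_ANONYMOUS, -1, 0)` goes through, while a non-page-sized length, a file descriptor, or a flag such as `MAP_FIXED` is rejected.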
It would however be nice if the low-address page could be mapped as `PROT_NONE` from user-side code, so there's no magical "you can't address these low bits" implied in the format. That would mean that `addr` can be passed in, but has to be the previous `mmap` location plus its `length`.
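A rough model of that rule, working on offsets into linear memory rather than host addresses, might look like the following. Everything here is an assumption for illustration: the names `linear_mmap` and `can_access`, the 64k `PAGE`, the `MAX_PAGES` cap, and the choice that the first mapping starts at offset 0.

```c
#include <stdint.h>

#define PAGE      65536u      /* assumed page size for the sketch */
#define MAX_PAGES 16u         /* assumed cap on linear memory */
#define MAP_FAIL  UINT32_MAX

enum prot { P_NONE = 0, P_RW = 1 };

static uint32_t mapped_end = 0;        /* end of the most recent mapping */
static enum prot page_prot[MAX_PAGES]; /* protection of each mapped page */

/* Only accepts addr == previous mapping's location plus its length,
   so mappings always extend linear memory contiguously. */
uint32_t linear_mmap(uint32_t addr, uint32_t length, enum prot prot)
{
    if (addr != mapped_end)
        return MAP_FAIL;
    if (length == 0 || length % PAGE != 0)
        return MAP_FAIL;
    if (length / PAGE > MAX_PAGES - mapped_end / PAGE)
        return MAP_FAIL;
    for (uint32_t i = 0; i < length / PAGE; i++)
        page_prot[addr / PAGE + i] = prot;
    mapped_end = addr + length;
    return addr;
}

/* An access is allowed only inside a mapped read/write page. */
int can_access(uint32_t addr)
{
    return addr < mapped_end && page_prot[addr / PAGE] == P_RW;
}
```

With this model, user code could map the low page itself with `linear_mmap(0, PAGE, P_NONE)` and then place the usable heap right after it with `linear_mmap(PAGE, 4 * PAGE, P_RW)`.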
That allows us to:

- [...] `sbrk`.

As far as I can tell, the additional power you are proposing over `sbrk` is the allocation types (`MAP_NORESERVE`, etc.)?
I didn't follow the part about removing magic. Are you saying you see a problem with a wasm application accessing `HEAP[0]` and you want that to be avoidable by wasm content itself?
As soon as we start sharing heaps between modules or doing similar things, we'll want the ability to use `mmap` to specify how the address spaces interact.
Creating deadzones with `PROT_NONE` is a pretty compelling feature for debugging and robustness as well. In the long run people may want copy-on-write or read-only pages, so having mmap exposed at some level early is valuable.
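As an illustration of the deadzone idea on a native POSIX system, the sketch below places one `PROT_NONE` page directly after a usable range, so an overrun faults instead of silently corrupting memory. `alloc_with_deadzone` and `write_faults` are hypothetical helpers; the second one probes a write in a forked child so the fault can be observed without signal handlers.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps `pages` read/write pages followed by one PROT_NONE dead zone.
   Returns the start of the usable range, or NULL on failure. */
void *alloc_with_deadzone(size_t pages)
{
    long page = sysconf(_SC_PAGE_SIZE);
    if (page <= 0 || pages == 0)
        return NULL;
    size_t usable = pages * (size_t)page;
    char *p = mmap(NULL, usable + (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (mprotect(p + usable, (size_t)page, PROT_NONE) != 0) {
        munmap(p, usable + (size_t)page);
        return NULL;
    }
    return p;
}

/* Returns 1 if writing one byte at `addr` kills a child process with a
   signal, 0 if the write succeeds, -1 if the probe itself failed. */
int write_faults(void *addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *(volatile char *)addr = 1;
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

Writes inside the usable range succeed, while the first byte past it lands in the dead zone and terminates the probing child.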
@kripken: "magic" refers to the `PROT_NONE` section (currently ill-defined), as well as how the heap gets mapped / what its alignment and size are, ...
That seems to add magic - I still don't follow what you mean by "remove some magic" earlier? What is the current magic that this removes?
The current suggestion automagically sets some `PROT_NONE` memory area, without specifying what size it is or how that happens. `mmap` allows developer code to `PROT_NONE` anything they want, no magic.
I see, thanks, that's the part at paragraph 3 here, I now see.
Do we agree that `sbrk` is enough for the MVP? Or are you also proposing that these features (that I agree we want eventually) are urgent? They don't seem polyfillable to JS so I was assuming they were non-MVP.
I think `sbrk` might be enough but I kind of fundamentally object to it as our baseline memory model. I think `mmap` is the right foundation for address/memory management. Worse is better, though, so maybe `sbrk` is the right kind of worse :-)
Ok, it looks like most of us agree that mmap is the better option, so perhaps there just isn't a reason to do `sbrk` as a short-term thing. This suggests that we

- [...] `sbrk` from the MVP; wasm only supports a fixed memory size in the first early iteration.
- [...] `mmap` to PostMVP, with similar features as @jfbastien suggested.
- [...] `mmap` capabilities (of files) as well as `madvise`.

How does that sound?
@kripken sgtm :-)
`mmap` can be made restrictive enough to be the same as `sbrk`, while removing some magic and making things simpler once we do expand `mmap` further.
Opened #285.
Sorry to be late to the party.
On Jun 24, 2015, at 4:28 PM, Michael Holman notifications@github.com wrote:
> asm.js is passed an array buffer at creation time, and that buffer isn't growable. There was discussion of allowing resizing, but that isn't efficient on all engines, so it isn't supported.
>
> The two engines which implement asm.js support this. And I just added ArrayBuffer.transfer, which Chrome/FF already have http://kangax.github.io/compat-table/es7/#ArrayBuffer.transfer, so if Safari joins the mix (@pizlonator https://github.com/pizlonator) it seems like this should be fine (and it was pretty easy to add, so I don't see much motivation against it).
I object to WebKit doing this. We don’t currently have a plan to “support” asm.js in the sense of recognizing “use asm”, and there is no good way to reconcile our plain-JS singleton-based constant inference and reassigning the HEAP variables.
All the talk of memory reminds me that we haven't discussed how WebAssembly modules get their memory.
Current implementations:
WebAssembly, at least initially, shares its virtual memory space with other parts of the browser, which means that over-allocation will lead to fragmentation and potentially virtual memory exhaustion. This is a problem e.g. on 32-bit Windows XP systems, which are still a pretty big use case.
Allocating physical memory lazily also means that an application can fault at runtime for any read/write that touches memory that was never used and now needs to be allocated. I think it's a desirable feature, but without signal handling it's kind of hard to handle!
Also, what kind of alignment and power-of-two size guarantees do we make, if any?
I think it would be great to support `mmap`, and on_start just allocate the heap with some restricted flags. We can decide to restrict what can be done initially (don't be lazy, allocate all physical memory, don't allow reallocation), and loosen these restrictions later. This is similar to passing in memory from the embedder, but can be made more powerful later while still being polyfillable (the polyfill can behave the same as asm.js does).