Open aardappel opened 4 years ago
The maximum field of a linear memory is also related. The idea is that it can be impractical for VMs to move linear memory within the host address space to service a memory.grow if there are other threads running and using the memory, so the maximum field is meant to allow implementations to reserve the maximum address space up front, and then allocate actual memory within that address space as needed.
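A rough sketch of what "reserve the maximum up front, then allocate within it" could look like on Linux. The LinearMemory type and its member names are hypothetical, not taken from any engine; the sketch assumes growth happens in multiples of the 64 KiB wasm page size, and uses mmap with PROT_NONE as the reservation and mprotect as the "allocate" step.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration type, not part of any wasm engine.
struct LinearMemory {
    std::uint8_t* base = nullptr;
    std::size_t committed = 0;  // bytes currently accessible
    std::size_t maximum = 0;    // bytes of address space reserved

    // Reserve `max_bytes` of address space with no access rights.
    bool reserve(std::size_t max_bytes) {
        assert(base == nullptr);
        void* p = mmap(nullptr, max_bytes, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) return false;
        base = static_cast<std::uint8_t*>(p);
        maximum = max_bytes;
        return true;
    }

    // Make `delta` more bytes accessible in place; the base address never moves.
    // Assumes `delta` is a multiple of 64 KiB. Fails once the maximum is reached.
    bool grow(std::size_t delta) {
        if (delta > maximum - committed) return false;
        if (mprotect(base + committed, delta, PROT_READ | PROT_WRITE) != 0) return false;
        committed += delta;
        return true;
    }

    void release() {
        if (base != nullptr) munmap(base, maximum);
        base = nullptr;
        committed = maximum = 0;
    }
};
```

Because grow only changes protections inside the existing mapping, pointers held by other threads stay valid across it.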
Another thing to consider is that, even with the memory64 proposal as-is, when you memory.grow, until the newly allocated wasm memory is touched by wasm load/store instructions, the OS is usually not assigning any physical pages. The only thing eagerly allocated in the kernel is some sort of descriptor that claims the virtual address range. Thus, you roughly get the performance characteristics you're asking for out-of-the-box.
@lukewagner that really depends on how it is implemented, and on the OS.
For example, on Windows, your lowest level allocator would typically acquire memory from the OS with VirtualAlloc and MEM_COMMIT set (since without that flag, you don't get the "allocate physical memory on access" behavior, but a page fault). Committing, however, means you are allocating address space that is capped by the total amount of physical memory + page file size, shared by all processes, so this is very limited. This does not allow "address space reservation" the same way as it can be done on Linux, unless explicit guard pages are set (which assume linearly increasing access).
Now that I write this, I am actually doubting the functionality I am suggesting is even possible on Windows, unless we specify that memory.reserve memory must be accessed linearly to commit, which is probably too "hacky" a feature to be part of Wasm. So unless anyone has any bright ideas, we can close this.
Though I guess making it part of the spec somehow that "new memory made available thru memory.grow can reasonably be expected to not use physical pages until accessed" would leave slightly more room for people to decide to ask for the maximum memory they may need with no performance consequences.
I am guessing the maximum @sunfishcode refers to would typically be implemented by VMs with MEM_RESERVE on Windows (address space allocated, but page faults if accessed) and MAP_NORESERVE on Linux (doesn't page fault, but no page file backing, so may SIGSEGV on access if that runs out).
Updated the original post, realizing there's a 3rd reason for this being efficient: being able to omit checks on pointer bumps.
And I think it's useful beyond programming language runtimes, in e.g. databases.
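To make the "omit checks on pointer bumps" point concrete, here is a minimal sketch of a bump allocator sitting on a large reservation. It assumes Linux, where an anonymous mmap with MAP_NORESERVE only gets physical pages for the parts that are touched. The BumpArena type is hypothetical, and the sketch does not detect running past the reservation; a real design would need a guard region or similar.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical arena type for illustration; not taken from the thread.
struct BumpArena {
    std::uint8_t* base = nullptr;
    std::uint8_t* top = nullptr;

    // Map `bytes` of address space; pages are only backed once they are touched.
    bool init(std::size_t bytes) {
        assert(base == nullptr);
        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) return false;
        base = top = static_cast<std::uint8_t*>(p);
        return true;
    }

    // Bump with no capacity check and no realloc: earlier pointers stay valid.
    // Running past the reservation is NOT detected in this sketch.
    void* alloc(std::size_t n) {
        void* p = top;
        top += (n + 15) & ~static_cast<std::size_t>(15);  // keep 16-byte alignment
        return p;
    }
};
```

The allocation fast path is a single add, which is only reasonable because the reservation is sized for the worst case up front.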
@aardappel Ah hah, good point; I forgot that Windows didn't have overcommit. In that case, then to achieve your bullet one option (which seems like the most attractive option to me), a wasm impl either needs an OS with overcommit or an OS that provides the ability to:
- reserve address space without committing it (VirtualAlloc(MEM_RESERVE))
- catch faults on access to reserved pages (AddVectoredExceptionHandler)
- commit the touched pages on demand (VirtualAlloc(MEM_COMMIT))
One thing I haven't seen discussed in general is whether we assume the memory64 feature will be available on all hosts (even 32-bit ones, where memory64 would be fairly slow due to it being forced to use emulated i64s everywhere), or whether it's a permanently-optional feature. If it's permanently optional, then it might not be bad for memory64 to have the above OS requirement, since not all hosts need to support memory64 anyway.
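The reserve / catch-fault / commit sequence can be sketched with POSIX calls as a stand-in for the Windows ones. This is an analogue under assumptions, not the Windows code path: mmap with PROT_NONE plays the role of MEM_RESERVE, a SIGSEGV handler plays the role of the vectored exception handler, and mprotect plays the role of MEM_COMMIT. It assumes Linux, a single reserved region, and that calling mprotect from a signal handler is acceptable (it works in practice on Linux but is not on the POSIX async-signal-safe list). All names below are hypothetical.

```cpp
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical globals describing the one reserved region (illustration only).
static std::uint8_t* g_base = nullptr;
static std::size_t g_size = 0;
static std::size_t g_page = 0;
static volatile sig_atomic_t g_commits = 0;

// Step 2: catch the fault. Step 3: commit the touched page, then return,
// which makes the CPU retry the faulting instruction.
static void on_fault(int, siginfo_t* info, void*) {
    auto addr = reinterpret_cast<std::uintptr_t>(info->si_addr);
    auto lo = reinterpret_cast<std::uintptr_t>(g_base);
    if (g_base == nullptr || addr < lo || addr >= lo + g_size) {
        // Not our region: restore the default action so the retry crashes normally.
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    std::uintptr_t page_start = addr & ~(static_cast<std::uintptr_t>(g_page) - 1);
    if (mprotect(reinterpret_cast<void*>(page_start), g_page,
                 PROT_READ | PROT_WRITE) != 0) {
        _exit(1);
    }
    g_commits = g_commits + 1;
}

// Step 1: reserve address space with no access rights and install the handler.
bool reserve_lazy(std::size_t bytes) {
    assert(g_base == nullptr);  // this sketch tracks a single region
    g_page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* p = mmap(nullptr, bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return false;
    g_base = static_cast<std::uint8_t*>(p);
    g_size = bytes;
    struct sigaction sa {};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

After reserve_lazy succeeds, plain loads and stores anywhere in the region work in any order; the first touch of each page pays for one fault and one mprotect.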
@lukewagner yup, that's pretty much the only way it would work on Windows.
I was actually curious how it would work on Windows and couldn't find a clear example online, so I created a prototype that would work thru a page fault handler myself: https://github.com/aardappel/stackalloc/blob/master/stackalloc.cpp And yes, it works rather well. It was able to reserve and access multiple 64GB memory blocks at no apparent cost.
Actually, the difference is that this code uses PAGE_GUARD (which requires commit) to catch new pages being accessed, and you are suggesting directly catching access to reserved-only pages. I can try and see if that works also.
As for whether 32-bit hardware platforms should support memory64:
size_t and friends), and 32-bit support may "atrophy"?
Ok, verified that committing without guard pages also works on Windows, which allows random access to the reserved address space.
So yes, this would allow an engine to use this for memory.grow, and allow that to grow to sizes significantly bigger than physical / page file memory at no cost. Upon exception, you can choose how much memory to commit beyond the page that was actually touched, so the cost of this incremental committing can be kept arbitrarily low too.
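As a small illustration of choosing how much to commit per exception, the helper below rounds a faulting offset down to a chunk boundary and commits a whole chunk, clamped to the end of the reservation. The function and type names are hypothetical, and the chunk size is assumed to be a power of two that is a multiple of the OS page size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical result type: which byte range of the reservation to commit.
struct CommitRange {
    std::size_t offset;
    std::size_t length;
};

// Given a fault at `fault_offset` inside a reservation of `reserved` bytes,
// return the chunk-aligned range to commit. A larger `chunk` means fewer
// exceptions at the price of committing memory that may never be touched.
CommitRange commit_range(std::size_t fault_offset, std::size_t reserved,
                         std::size_t chunk) {
    assert(fault_offset < reserved);
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);  // power of two
    std::size_t start = fault_offset & ~(chunk - 1);   // round down to chunk
    std::size_t length = std::min(chunk, reserved - start);
    return {start, length};
}
```

With a 64 KiB chunk, for example, a sequential scan over the region takes one exception per 64 KiB instead of one per page.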
I wonder if the 2nd option (adding explicit memory.reserve) is better here. The problem I can see with the 1st option is that it means that loads/stores may trap, which they can't currently. If we add memory.reserve, then we know that the user has opted in to this behavior.
Just noting that this feature, if there is interest, is intended for post-MVP.
(meta-comment: maybe create a "post-mvp" label, and add it for issues like these?)
With a 64-bit address space, we don't just get to break the 4GB barrier, but potentially new programming techniques become available, in particular techniques that rely on using larger parts of the address space, i.e. the ability to reserve memory that does not cause physical memory to be used unless touched (possible on native platforms with mmap or VirtualAlloc).
This is particularly useful in working with large data sets, being able to reserve large arrays, and being able to work with them and:
realloc/memcpy as it grows.
This can be particularly useful for implementing programming languages that like to manage the implemented language's objects in a single contiguous block, and for whom the above 2 downsides of realloc would be prohibitive. Efficient coroutine implementations may rely on it. I can see it also being useful for databases, larger caches etc.
We already have memory.grow that can extend memory without realloc, but that only allows one such large array, and assumes we have complete control over the allocator and other runtime code that may be using memory, which may be impractical.
Question is, how could we allow for this functionality?
mmap underneath for all Wasm memory, which may be impractical. There is also the question to what extent memory reserved with mmap can typically fail after being reserved, which to my knowledge it won't if not called with MAP_NORESERVE, but I'm no expert.
memory.reserve, which would guarantee address space above the limit to which memory.grow has been called. The limit established by memory.grow would all be committed memory, and above that, uncommitted until touched. This would also be helpful in the case that reserving can fail and/or not all implementations want to support it (if memory.reserve fails, the program can fall back on a different memory strategy).
mmap-like instruction. That is likely impractical since we may not want to have to deal with "holes" in our memories.
I realize browsers may be hesitant to hand out larger parts of address space, but I feel this feature could be important in the future, or for non-browser environments.
@binji @jakobkummerow @sunfishcode