Support for reserving address space?

aardappel commented 4 years ago

With a 64-bit address space, we don't just get to break the 4GB barrier, but potentially new programming techniques become available, in particular techniques that rely on using larger parts of the address space, i.e. the ability to reserve memory that does not cause physical memory to be used unless touched (possible on native platforms with mmap or VirtualAlloc).

This is particularly useful when working with large data sets: you can reserve large arrays up front and then work with them while:

- avoiding the cost of realloc / memcpy as they grow;
- keeping internal pointers that are not invalidated by those reallocs;
- usually omitting bounds checks at each pointer bump / push_back.
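
As a rough illustration of the three points above on a native host (not Wasm), here is a minimal C sketch. The `reserved_array` type and its functions are hypothetical names invented for this example, and it assumes a 64-bit Linux-like system where `mmap` with `MAP_NORESERVE` hands out address space without backing it until pages are touched.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical growable array living in one reserved region. */
typedef struct {
    uint64_t *data;  /* base address; never changes after init */
    size_t    len;   /* elements pushed so far */
    size_t    cap;   /* elements the reservation can hold */
} reserved_array;

/* Reserve address space for `cap` elements. Returns 0 on success, -1 on failure. */
static int reserved_array_init(reserved_array *a, size_t cap) {
    void *p = mmap(NULL, cap * sizeof(uint64_t), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return -1;
    a->data = p;
    a->len = 0;
    a->cap = cap;
    return 0;
}

/* Append one value. No realloc and no memcpy: the first write to a fresh
   page is what makes the OS back it. The returned pointer stays valid for
   the lifetime of the array. */
static uint64_t *reserved_array_push(reserved_array *a, uint64_t v) {
    assert(a->len < a->cap);  /* debug-only check; release builds (NDEBUG) skip it */
    uint64_t *slot = &a->data[a->len++];
    *slot = v;
    return slot;
}

/* Give the whole reservation back to the OS. */
static void reserved_array_free(reserved_array *a) {
    munmap(a->data, a->cap * sizeof(uint64_t));
    a->data = NULL;
    a->len = 0;
    a->cap = 0;
}
```

Dropping the capacity check in release builds is only reasonable when the reservation is far larger than anything the program could fill. That is an assumption of this sketch, not a general guarantee.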

This can be particularly useful for implementing programming languages that like to manage the implemented language's objects in a single contiguous block, and for which the 2 realloc downsides above (copying cost and invalidated pointers) would be prohibitive.

Efficient coroutine implementations may rely on it. I can see it also being useful for databases, larger caches, etc.

We already have memory.grow that can extend memory without realloc, but that only allows one such large array, and assumes we have complete control over the allocator and other runtime code that may be using memory, which may be impractical.

Question is, how could we allow for this functionality?

1. Radical but simple: we could enable this by default, by simply specifying that untouched pages of Wasm memory do not cause physical memory pressure. This would require all implementations to use mmap underneath for all Wasm memory, which may be impractical. There is also the question of to what extent memory reserved with mmap can fail after being reserved, which to my knowledge it won't if not called with MAP_NORESERVE, but I'm no expert.
2. Add an explicit new instruction, like memory.reserve, which would guarantee address space above the limit established by memory.grow. Everything below that limit would be committed memory, and everything above it would be uncommitted until touched. This would also be helpful in the case that reserving can fail and/or not all implementations want to support it (if memory.reserve fails, the program can fall back on a different memory strategy).
3. Add an actual mmap-like instruction. That is likely impractical, since we may not want to have to deal with "holes" in our memories.

I realize browsers may be hesitant to hand out larger parts of address space, but I feel this feature could be important in the future, or for non-browser environments.

@binji @jakobkummerow @sunfishcode

sunfishcode commented 4 years ago

The maximum field of a linear memory is also related. The idea is that it can be impractical for VMs to move linear memory within the host address space to service a memory.grow if there are other threads running and using the memory, so the maximum field is meant to allow implementations to reserve the maximum address space up front, and then allocate actual memory within that address space as needed.
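
To make the "reserve the maximum up front, then allocate within it" idea concrete, here is a small C sketch for a POSIX host. It is not taken from any particular engine: `linear_memory`, `lm_create` and `lm_grow` are hypothetical names, and using `mprotect` to make pages accessible is just one plausible way to do the "allocate as needed" step. The 64 KiB constant is the Wasm page size.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define WASM_PAGE 65536u  /* Wasm page size: 64 KiB */

/* Hypothetical engine-side view of one linear memory. */
typedef struct {
    uint8_t *base;       /* fixed for the lifetime of the memory */
    uint64_t pages;      /* pages currently accessible to the guest */
    uint64_t max_pages;  /* pages of address space reserved up front */
} linear_memory;

/* Reserve address space for `max_pages`, with nothing accessible yet. */
static int lm_create(linear_memory *m, uint64_t max_pages) {
    void *p = mmap(NULL, max_pages * WASM_PAGE, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return -1;
    m->base = p;
    m->pages = 0;
    m->max_pages = max_pages;
    return 0;
}

/* Modeled on memory.grow: returns the previous size in pages, or -1 on
   failure. The base address never moves, so other threads holding
   pointers into the memory are unaffected. */
static int64_t lm_grow(linear_memory *m, uint64_t delta) {
    assert(m->pages <= m->max_pages);
    if (delta > m->max_pages - m->pages) return -1;
    if (delta != 0 &&
        mprotect(m->base + m->pages * WASM_PAGE, delta * WASM_PAGE,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    int64_t old = (int64_t)m->pages;
    m->pages += delta;
    return old;
}
```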

lukewagner commented 4 years ago

Another thing to consider is that, even with the memory64 proposal as-is, when you memory.grow, until the newly allocated wasm memory is touched by wasm load/store instructions, the OS is usually not assigning any physical pages. The only thing eagerly allocated in the kernel is some sort of descriptor that claims the virtual address range. Thus, you roughly get the performance characteristics you're asking for out-of-the-box.

aardappel commented 4 years ago

@lukewagner that really depends on how it is implemented, and on the OS.

For example, on Windows, your lowest level allocator would typically acquire memory from the OS with VirtualAlloc and MEM_COMMIT set (since without that flag, you don't get the "allocate physical memory on access" behavior, but a page fault). Committing, however, means you are allocating address space that is capped by the total amount of physical memory + page file size, shared by all processes, so this is very limited. This does not allow "address space reservation" the same way as it can be done on Linux, unless explicit guard pages are set (which assume linearly increasing access).

Now that I write this, I am actually doubting the functionality I am suggesting is even possible on Windows.. unless we specify that memory.reserve memory must be accessed linearly to commit, which is probably too "hacky" a feature to be part of Wasm. So unless anyone has any bright ideas, we can close this.

Though I guess making it part of the spec somehow that "new memory made available thru memory.grow can reasonably be expected to not use physical pages until accessed" would leave slightly more room for people to decide to ask for the maximum memory they may need with no performance consequences.

I am guessing the maximum @sunfishcode refers to would typically be implemented by VMs with MEM_RESERVE on Windows (address space allocated, but page faults if accessed) and MAP_NORESERVE on Linux (doesn't page fault, but no page file backing, so may SIGSEGV on access if that runs out).

aardappel commented 4 years ago

Updated the original post, realizing there's a 3rd reason for this being efficient: being able to omit checks on pointer bumps.

And I think it's useful beyond programming language runtimes, in e.g. databases.

lukewagner commented 4 years ago

@aardappel Ah hah, good point; I forgot that Windows didn't have overcommit. In that case, to achieve your first option (which seems like the most attractive option to me), a wasm impl either needs an OS with overcommit or an OS that provides the ability to:

- allocate pure address space (VirtualAlloc(MEM_RESERVE))
- handle page faults (AddVectoredExceptionHandler)
- convert previously-reserved address space to committed from the exception handler (VirtualAlloc(MEM_COMMIT))
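
The same three steps can be sketched on a POSIX host, which is easier to try out than the Windows calls. This is only a loose analogy, with assumptions stated up front: `mmap(PROT_NONE)` stands in for `VirtualAlloc(MEM_RESERVE)`, `sigaction` for `AddVectoredExceptionHandler`, and `mprotect` for `VirtualAlloc(MEM_COMMIT)`. All names (`g_base`, `on_fault`, `reserve_with_fault_commit`) are hypothetical, and calling `mprotect` from a signal handler is not guaranteed by POSIX even though it commonly works on Linux.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical globals describing one reserved range. */
static uint8_t *g_base;                  /* start of the reserved range */
static size_t   g_size;                  /* its size in bytes */
static size_t   g_page;                  /* host page size */
static volatile sig_atomic_t g_commits;  /* pages made usable so far */

/* Steps 2 and 3: on a fault inside the range, make the touched page usable. */
static void on_fault(int sig, siginfo_t *info, void *ctx) {
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)g_base;
    if (addr < base || addr - base >= g_size) {
        signal(sig, SIG_DFL);  /* not ours: the retried access gets the default action */
        return;
    }
    uintptr_t page = addr & ~(uintptr_t)(g_page - 1);
    if (mprotect((void *)page, g_page, PROT_READ | PROT_WRITE) != 0)
        abort();
    g_commits++;
    /* Returning re-executes the faulting instruction, which now succeeds. */
}

/* Step 1: reserve address space only, then install the fault handler. */
static int reserve_with_fault_commit(size_t size) {
    g_page = (size_t)sysconf(_SC_PAGESIZE);
    assert(g_page != 0 && (g_page & (g_page - 1)) == 0);  /* masking needs a power of two */
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return -1;
    g_base = p;
    g_size = size;

    struct sigaction sa = {0};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Linux reports these faults as SIGSEGV; some other systems use SIGBUS. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0) return -1;
    return 0;
}
```

After `reserve_with_fault_commit(size)` succeeds, any byte in `g_base[0 .. size)` can be read or written in any order, and only the touched pages become usable.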

One thing I haven't seen discussed in general is whether we assume the memory64 feature will be available on all hosts (even 32-bit ones, where memory64 would be fairly slow due to it being forced to use emulated i64s everywhere), or whether it's a permanently-optional feature. If it's permanently optional, then it might not be bad for memory64 to have the above OS requirement, since not all hosts need to support memory64 anyway.

aardappel commented 4 years ago

@lukewagner yup, that's pretty much the only way it would work on Windows.

I was actually curious how it would work on Windows and couldn't find a clear example online, so I created a prototype that works thru a page fault handler myself: https://github.com/aardappel/stackalloc/blob/master/stackalloc.cpp And yes, it works rather well.. it was able to reserve and access multiple 64GB memory blocks at no apparent cost.

The actual difference is that this code uses PAGE_GUARD (which requires commit) to catch new pages being accessed, and you are suggesting directly catching access to reserved-only pages. I can try and see if that works also.

As for whether 32-bit hardware platforms should support memory64:

- Such platforms are increasingly going to be IoT/embedded only, so they are specialized enough to demand that binaries be compiled for wasm32, and may benefit complexity-wise from not supporting it.
- Any code reserving very large blocks of address space is going to fail on 32-bit hardware anyway, even if it attempts to support memory64.
- At the same time, as pretty much every piece of hardware people build for is 64-bit, in say 5 years from now people may start to build for memory64 by default, simply because it simplifies their life to only have to think about one size of data type (size_t and friends), and 32-bit support may "atrophy"?

aardappel commented 4 years ago

Ok, verified that committing without guard pages also works on Windows, which allows random access to the reserved address space.

So yes, this would allow an engine to use this for memory.grow, and allow that to grow to sizes significantly bigger than physical / page file memory at no cost. Upon exception, you can choose how much memory to commit beyond the page that was actually touched, so the cost of this incremental committing can be kept arbitrarily low too.
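
Choosing how much to commit around the touched page is mostly address arithmetic. A minimal sketch, assuming the commit chunk is a power of two and a multiple of the host page size; `commit_span` and `span_for_fault` are hypothetical names, and offsets are relative to the start of the reserved range:

```c
#include <assert.h>
#include <stddef.h>

/* Byte range to commit, relative to the start of the reservation. */
typedef struct {
    size_t start;
    size_t len;
} commit_span;

/* Given the offset of the faulting access, return the aligned chunk that
   contains it, clipped so it never extends past the reservation. A larger
   `chunk` means fewer faults at the price of committing more memory early. */
static commit_span span_for_fault(size_t fault_off, size_t chunk, size_t reserved) {
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);  /* power of two */
    assert(fault_off < reserved);
    commit_span s;
    s.start = fault_off & ~(chunk - 1);
    s.len = (reserved - s.start < chunk) ? reserved - s.start : chunk;
    return s;
}
```

With a 1 MiB chunk, for example, a fault at offset 5000 would commit the first 1 MiB of the range in one go, so the next 255 4 KiB pages are touched without further faults.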

binji commented 4 years ago

I wonder if the 2nd option (adding explicit memory.reserve) is better here. The problem I can see with the 1st option is that it means that loads/stores may trap, which they currently can't. If we add memory.reserve, then we know that the user has opted in to this behavior.

aardappel commented 3 years ago

Just noting that this feature, if there is interest, is intended for post-MVP.

binji commented 3 years ago

> Just noting that this feature, if there is interest, is intended for post-MVP.

(meta-comment: maybe create a "post-mvp" label, and add it for issues like these?)

WebAssembly / memory64

Support for reserving address space? #4