WebAssembly / memory-control

A proposal to introduce finer grained control of WebAssembly memory.

What should happen on memory.unmap? Should a subsequent memory access trap? #11

Open dtig opened 1 month ago

dtig commented 1 month ago

We've previously considered zero-filling on discard and on unmap, but there are several reasons this doesn't seem like the right option.

  1. Memory accesses to discarded pages will recommit the memory, which defeats the reason discard was originally proposed: improving memory utilization.
  2. On Windows, there's no good way to zero-fill atomically.
  3. For unmap, what is more useful from an application's point of view? Trapping on inaccessible memory would work, and is easier for implementations.

We previously decided against trapping in order to avoid traps in the middle of memory. What are the implications of allowing them? I'm opening this issue as a catch-all for discussing trapping vs. zero-filling.
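To make the two options concrete, here is a minimal toy model in Python. It is only a sketch: `ToyMemory`, `Trap`, `discard`, and `unmap` are hypothetical names for illustration, not the proposal's API, and real engines would rely on OS page protection rather than a dictionary. It shows that under zero-fill semantics a load from a discarded page succeeds and recommits the page, while under trapping semantics the same load fails.

```python
PAGE = 65536  # WebAssembly page size in bytes


class Trap(Exception):
    """Stands in for a WebAssembly trap."""


class ToyMemory:
    """Hypothetical model tracking mapped pages and committed backing store."""

    def __init__(self, pages):
        self.mapped = set(range(pages))
        self.committed = {}  # page index -> bytearray, allocated on first touch

    def _page(self, addr):
        index = addr // PAGE
        if index not in self.mapped:
            raise Trap(f"access to unmapped page {index}")
        if index not in self.committed:
            # Touching a mapped page commits it, zero-filled.
            self.committed[index] = bytearray(PAGE)
        return self.committed[index]

    def load8(self, addr):
        return self._page(addr)[addr % PAGE]

    def store8(self, addr, value):
        self._page(addr)[addr % PAGE] = value

    def discard(self, index):
        # Zero-fill option: drop the backing store, keep the page accessible.
        self.committed.pop(index, None)

    def unmap(self, index):
        # Trapping option: drop the backing store and make the page inaccessible.
        self.committed.pop(index, None)
        self.mapped.discard(index)


def demo():
    mem = ToyMemory(pages=2)
    mem.store8(10, 0xAB)
    mem.discard(0)
    committed_after_discard = len(mem.committed)
    value = mem.load8(10)  # reads zero, but commits page 0 again
    committed_after_load = len(mem.committed)
    mem.unmap(0)
    try:
        mem.load8(10)
        trapped = False
    except Trap:
        trapped = True
    return committed_after_discard, value, committed_after_load, trapped
```

In this model a stray read after `discard` silently brings the page back, which is the memory-utilization concern in point 1; after `unmap` the same read is reported instead.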

dschuff commented 1 month ago

One use case for unmapping is to catch bugs in programs (e.g. null-pointer dereferences). For that use case, zero-filling isn't useful. Trapping might be OK, since it's fine for the access to be fatal and not catchable inside wasm (modulo implementing a debugger or virtualization inside of wasm).

Another use case is implementing null-pointer exceptions in Java and .NET: rather than instrumenting memory references, a common implementation strategy is to unmap the zero page, then catch the OS signal from the memory reference and resume from it. Just having memory.unmap might not be enough to implement this today; we could imagine allowing memory references to throw, but resumption would still be a problem. Maybe once we get stack switching we could combine unmapping with some sort of resumable continuation or signal to be handled on a separate stack. I wouldn't want to preclude that here, but a trap now could probably be generalized later.
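As a rough illustration of that strategy, the Python sketch below models a runtime that leaves page zero unmapped and turns the resulting fault into a language-level exception. Everything here is hypothetical (`Trap`, `NullPointerError`, `load8`, `read_field`), and it assumes the fault can be caught at all, which, as noted above, is not possible inside wasm today; it also does not model resumption.

```python
PAGE = 65536  # WebAssembly page size in bytes


class Trap(Exception):
    """Stands in for the fault raised on an access to unmapped memory."""

    def __init__(self, addr):
        super().__init__(f"access to unmapped address {addr:#x}")
        self.addr = addr


class NullPointerError(Exception):
    """Stands in for the guest language's null-pointer exception."""


def load8(memory, unmapped_pages, addr):
    # No per-access null check: the unmapped page does the checking.
    if addr // PAGE in unmapped_pages:
        raise Trap(addr)
    return memory[addr]


def read_field(memory, unmapped_pages, obj_ptr, offset):
    """Read one byte of an object field, mapping zero-page faults to an NPE."""
    try:
        return load8(memory, unmapped_pages, obj_ptr + offset)
    except Trap as trap:
        if trap.addr < PAGE:
            # Fault landed in the zero page: treat it as a null dereference.
            raise NullPointerError("null dereference") from trap
        raise  # any other fault stays a trap
```

The scheme only works when field offsets are smaller than the unmapped region, so that `null + offset` still lands in page zero.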

dschuff commented 1 month ago

But to attempt to actually shed some light on the question, don't we already trap in the middle of memory? We have the combination of memory references that can cross page boundaries, bulk memory instructions, and atomics already, and those can all already race with memory growth and trap and leave various types of observable effects behind. So I think the only new thing is that there could now be a race between one of these and an unmapping (since we don't currently have memory shrinking). The memory model in the threads proposal (https://webassembly.github.io/threads/core/exec/relaxed.html) discusses tearing and bounds checks and growth and how there could be nondeterministic values, but I'm not sure if shrinking memory would be symmetric to growth under that model.

bvisness commented 1 month ago

I'm not personally clear on how much trapping we allow during races on shared memories, but it would certainly help to clarify a lot of design issues. Our prototype of memory.discard was very straightforward on all operating systems, except for shared memory on Windows, where in order to avoid traps we came up with a silly VirtualUnlock hack. If we are in fact OK with trapping during races, we could revert to the simple MEM_DECOMMIT / MEM_COMMIT strategy and have much better performance and a simpler implementation.

As for the original question, I am strongly in favor of trapping in mappable memories. I think trapping is generally more desirable for applications (crashing on bad access instead of continuing), and certainly far easier to implement. (I imagine that guaranteeing zeroes would require us to fill pages with zeroes on demand in a signal handler, since we cannot just commit an entire large memory without hitting commit limits on Windows. Not only would this be a pain, I imagine it would be rather slow.)

I also like the possibilities @dschuff mentioned as far as null pointer exceptions and such. I think that's a good aspirational goal for memory control features in general, and I don't see any way to make them happen if we go with zeroes.

bvisness commented 1 month ago

As for memory.discard - while it would be a useful add-on for "traditional" linear memory, I'm not sure it has much place in mappable memories. You could generally achieve the same effect as memory.discard with an unmap / map, and the latter would make it much clearer what happens if you "discard" unmapped memory, "discard" memory that was mapped from a file descriptor, etc.

I would generally suggest excluding memory.discard from the mappable-memories part of the proposal, and keeping it as a utility for non-mappable memories if at all.

dtig commented 1 month ago

> But to attempt to actually shed some light on the question, don't we already trap in the middle of memory? We have the combination of memory references that can cross page boundaries, bulk memory instructions, and atomics already, and those can all already race with memory growth and trap and leave various types of observable effects behind. So I think the only new thing is that there could now be a race between one of these and an unmapping (since we don't currently have memory shrinking). The memory model in the threads proposal (https://webassembly.github.io/threads/core/exec/relaxed.html) discusses tearing and bounds checks and growth and how there could be nondeterministic values, but I'm not sure if shrinking memory would be symmetric to growth under that model.

Phrasing the question slightly differently: should inaccessible memory be treated differently? For bulk memory instructions and for atomics, the traps come from a size mismatch (the memory operation runs past the end of the target memory) or, in the case of atomic operations, from violating natural alignment. But in both cases, all of linear memory is still contiguously accessible to any valid memory access. In the case of a data race with memory.unmap, we render parts of the memory inaccessible. So it departs somewhat from the text of the memory model in the threads proposal: by trapping on previously accessible memory, we land in the fully undefined bucket rather than the defined-but-nondeterministic bucket, or at least that's how I'm parsing it.
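One way to picture the difference is to compare the access check in each world. The Python sketch below is a logical model only, with hypothetical helper names (`in_bounds`, `accessible`); engines often get this behavior from guard pages and hardware protection rather than explicit checks. With contiguous memory, a single comparison against the memory size decides validity. Once pages can be unmapped, an in-bounds access can still fail, including one that straddles a mapped page and an unmapped one.

```python
PAGE = 65536  # WebAssembly page size in bytes


def in_bounds(addr, size, memory_bytes):
    """Today's rule: valid if the whole access fits inside the memory."""
    return addr >= 0 and size > 0 and addr + size <= memory_bytes


def accessible(addr, size, memory_bytes, unmapped_pages):
    """With unmap: additionally, every page the access touches must be mapped."""
    if not in_bounds(addr, size, memory_bytes):
        return False
    first_page = addr // PAGE
    last_page = (addr + size - 1) // PAGE
    return all(page not in unmapped_pages
               for page in range(first_page, last_page + 1))
```

For example, with three pages of memory and page 1 unmapped, a 4-byte access at `PAGE - 2` passes `in_bounds` but fails `accessible`, because its last two bytes fall in the unmapped page.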