bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.25k stars 1.29k forks source link

Is the mmap memory base always guaranteed to have the same address all the time? #9334

Open stevefan1999-personal opened 1 week ago

stevefan1999-personal commented 1 week ago

I wonder during memory grow reallocation case, would wasmtime::Memory::data_ptr() be invalidated? Right now I'm trying to do something like this:

    #[qjs(get, enumerable)]
    pub fn buffer<'js>(&self, ctx: Ctx<'js>) -> Result<ArrayBuffer<'js>> {
        // let _data = self.inner.data_mut(self.store.lock().unwrap().as_context_mut());
        // Err(ctx.throw("TODO".into_js(&ctx)?))

        let val = unsafe {
            let val = qjs::JS_NewArrayBuffer(
                ctx.as_raw().as_mut(),
                data.as_mut_ptr(),
                data.len().try_into().unwrap(), 
                None, // No need for deallocation
                ptr::null_mut(), // No opaque
                1, // True = Shared
            );
            if qjs::JS_VALUE_GET_NORM_TAG(val) == qjs::JS_TAG_EXCEPTION {
                return Err(rquickjs::Error::Exception);
            }

            Value::from_js_value(ctx.clone(), val)
        };

        let buffer = ArrayBuffer::from_object(Object::from_value(val)?).unwrap();

        Ok(buffer)
    }

But if the user called grow(delta) subsequently, would that invalidate the data.as_mut_ptr() address and render the whole ArrayBuffer invalid?

stevefan1999-personal commented 1 week ago

I'm afraid it is a yes.

Grows this WebAssembly memory by delta pages.

This will attempt to add delta more pages of memory on to the end of this Memory instance. If successful this may relocate the memory and cause Memory::data_ptr to return a new value. Additionally any unsafely constructed slices into this memory may no longer be valid.

On success returns the number of pages this memory previously had before the growth succeeded.

Note that, by default, a WebAssembly memory’s page size is 64KiB (aka 65536 or 216). The custom-page-sizes proposal allows Wasm memories to opt into a page size of one byte (and this may be further relaxed to any power of two in a future extension).

Thus maybe I have to set it as a memory view in the ArrayBuffer property of rquickjs to make it read-only first: ArrayBuffers, while this can still cause access violation if the buffer is detached. I'm not sure what is the best way to implement this interface though. I guess this would be platform specific?

alexcrichton commented 1 week ago

Yes memory can move under certain configurations of Wasmtime or for certain input modules. It looks like you're hooking up Wasmtime as the wasm implementation for a JS engine (quickjs I think?), and if that's the case it's likely going to be relatively tricky to do so. You're already afield of Wasmtime's/Rust's safety guarantees by allowing access of underlying memory outside of the lock of a Store<T>, so you'll have to be careful to juggle that to make sure an ArrayBuffer never aliases an operation on a memory, for example.

I might recommend implementing your own MemoryCreator where the LinearMemory trait can provide you the hook you need to know when growth happens so you know when to invalidate the ArrayBuffer.

stevefan1999-personal commented 1 week ago

Oh yes, I see at least in Windows and *nix, for data_ptr to change is only possible when the requested grow is beyond the existing preallocated max (if supplied), which means it apparently has a bigger size than the VirtualAlloc in the initial memory creation, for example, because it uses VirtualAlloc so if the grow size is within range, we can just use VirtualProtect and add RWX bit back on it to get a congruent memory.

cfallin commented 1 week ago

Regardless of underlying implementation details, what Alex is getting at (and what I would re-emphasize) is the difference between incidental details and API-contractual details. We don't guarantee anything about when the base pointer might change; and we might change those details in the future, if there is a good reason to do so. The only way one should use the public Wasmtime API is by assuming it could change after any memory-grow event (thus any Wasm execution), or as Alex notes, using the other interface we have for this purpose, which lets you control the details of the memory implementation by owning the memory.

Said more succinctly: please don't bake in dependencies on our internal implementation details, because they may change!