WebAssembly / wasm-c-api

Wasm C API prototype
Apache License 2.0

Lifetime of `wasm_memory_data` pointer #105

Open sunfishcode opened 5 years ago

sunfishcode commented 5 years ago

The wasm_memory_data function:

https://github.com/WebAssembly/wasm-c-api/blob/7865f7daf895dd38961cf421d631d7e24c84e8b6/include/wasm.h#L465

returns a pointer to the linear memory. However, if a memory.grow or wasm_memory_grow occurs while this pointer remains live, it could cause this pointer to dangle.
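To make the hazard concrete, here is a minimal sketch that does not use the real engine. `toy_memory_t`, `toy_memory_data`, and `toy_memory_grow` are hypothetical stand-ins for `wasm_memory_t`, `wasm_memory_data`, and `wasm_memory_grow`, and the toy grow always relocates the buffer to model the worst case. A real engine may or may not move the memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 65536u /* Wasm page size in bytes */

/* Hypothetical stand-in for wasm_memory_t; not the real engine type. */
typedef struct {
  uint8_t *base;
  size_t pages;
} toy_memory_t;

static bool toy_memory_init(toy_memory_t *m, size_t pages) {
  m->base = calloc(pages, TOY_PAGE_SIZE);
  m->pages = pages;
  return m->base != NULL;
}

/* Plays the role of wasm_memory_data. */
static uint8_t *toy_memory_data(toy_memory_t *m) { return m->base; }

/* Plays the role of wasm_memory_grow. Assumption: always relocates. */
static bool toy_memory_grow(toy_memory_t *m, size_t delta) {
  size_t new_pages = m->pages + delta;
  uint8_t *fresh = calloc(new_pages, TOY_PAGE_SIZE);
  if (fresh == NULL) return false;
  memcpy(fresh, m->base, m->pages * TOY_PAGE_SIZE);
  free(m->base); /* every pointer obtained earlier now dangles */
  m->base = fresh;
  m->pages = new_pages;
  return true;
}

static void toy_memory_free(toy_memory_t *m) {
  free(m->base);
  m->base = NULL;
  m->pages = 0;
}

/* Writes a byte, grows, then reads the byte back through a fresh pointer. */
static int demo_grow_invalidates_pointer(void) {
  toy_memory_t mem;
  if (!toy_memory_init(&mem, 1)) return -1;

  const size_t offset = 16; /* keep the offset, not the pointer */
  toy_memory_data(&mem)[offset] = 42;
  uintptr_t old_addr = (uintptr_t)toy_memory_data(&mem);

  if (!toy_memory_grow(&mem, 1)) {
    toy_memory_free(&mem);
    return -1;
  }

  /* Re-fetch the base after the grow; the old address must not be used. */
  uint8_t *data = toy_memory_data(&mem);
  assert((uintptr_t)data != old_addr);
  int value = data[offset];
  toy_memory_free(&mem);
  return value;
}
```

The point of the sketch is that the only thing carried across the grow is the offset. The base pointer is fetched again afterwards.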

Would it be reasonable to add functions such as wasm_memory_load and wasm_memory_store to the API? There would be some parallels to wasm_table_get and wasm_table_set.
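As a rough illustration of what such accessors might look like, here is a sketch over a toy memory. The signatures are an assumption made for this example; nothing like them exists in `wasm.h`, and `toy_memory_t` is a hypothetical stand-in for `wasm_memory_t`. Each call bounds-checks and copies, so no pointer into the linear memory escapes to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for wasm_memory_t. */
typedef struct {
  uint8_t *base;
  size_t size; /* current size in bytes */
} toy_memory_t;

/* True if [offset, offset + len) lies inside the memory; avoids overflow. */
static bool toy_in_bounds(const toy_memory_t *m, size_t offset, size_t len) {
  return offset <= m->size && len <= m->size - offset;
}

/* Assumed shape for a wasm_memory_load analogue: copy out, or fail. */
static bool toy_memory_load(const toy_memory_t *m, size_t offset,
                            void *dst, size_t len) {
  if (!toy_in_bounds(m, offset, len)) return false;
  memcpy(dst, m->base + offset, len);
  return true;
}

/* Assumed shape for a wasm_memory_store analogue: copy in, or fail. */
static bool toy_memory_store(toy_memory_t *m, size_t offset,
                             const void *src, size_t len) {
  if (!toy_in_bounds(m, offset, len)) return false;
  memcpy(m->base + offset, src, len);
  return true;
}

/* Returns 1 if a round trip works and out-of-bounds accesses are rejected. */
static int demo_load_store(void) {
  toy_memory_t mem = { calloc(1, 64), 64 };
  if (mem.base == NULL) return -1;

  uint32_t in = 0xC0FFEEu, out = 0;
  bool ok = toy_memory_store(&mem, 8, &in, sizeof in)
         && toy_memory_load(&mem, 8, &out, sizeof out)
         && !toy_memory_store(&mem, 62, &in, sizeof in) /* straddles the end */
         && !toy_memory_load(&mem, 65, &out, 0);        /* offset past the end */

  free(mem.base);
  return (ok && out == in) ? 1 : 0;
}
```

Because the accessors resolve the base address on every call, a grow between two calls cannot leave the caller holding a stale pointer. The cost is one copy per access.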

Separately, there's the question of whether you'd further want to remove wasm_memory_data altogether. On one hand, you can't eliminate all possibility of undefined behavior when using the C API, but on the other, this is a non-obvious corner case which is likely to be rare in practice, but not impossible.

rossberg commented 5 years ago

Yes, this API is somewhat dangerous. The pointer is not valid when reused across arbitrary Wasm execution (or API calls to grow) -- if there was documentation, it should say so in big letters. :) Currently the API does not support shared memories, so it should always be safe to acquire the pointer and then use/dereference it directly. For shared memories, this could indeed create a race with a concurrent grow. I'm inclined to say that it is the client's responsibility to synchronise such grows properly, as they'll probably have to synchronise all accesses to shared memory anyway.

These are all the same problems and considerations as for the JS API, except that in JS an unexpected grow will simply neuter the associated typed array, so that unaware accesses raise an exception. As usual, C's reaction to that is a bit more fatal, but the problem seems exactly the same.
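A C host could approximate the neutering behaviour by convention, though C cannot enforce it on raw pointers. The sketch below is only an illustration: `toy_memory_t`, `toy_view_t`, and the generation counter are hypothetical and not part of the wasm-c-api. A view records the generation it was acquired under and yields NULL once a grow has bumped it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical memory object with a counter bumped on every grow. */
typedef struct {
  uint8_t *base;
  size_t size;
  unsigned generation;
} toy_memory_t;

/* Hypothetical view: remembers the generation it was acquired under. */
typedef struct {
  const toy_memory_t *mem;
  unsigned generation;
} toy_view_t;

static toy_view_t toy_view_acquire(const toy_memory_t *m) {
  toy_view_t v = { m, m->generation };
  return v;
}

/* Returns the base pointer, or NULL if the memory has grown since acquire. */
static uint8_t *toy_view_data(const toy_view_t *v) {
  return v->generation == v->mem->generation ? v->mem->base : NULL;
}

/* Grows by 'extra' bytes, zero-fills the new tail, invalidates old views. */
static bool toy_memory_grow(toy_memory_t *m, size_t extra) {
  uint8_t *fresh = realloc(m->base, m->size + extra);
  if (fresh == NULL) return false;
  memset(fresh + m->size, 0, extra);
  m->base = fresh;
  m->size += extra;
  m->generation++;
  return true;
}

/* Returns 1 if the view is live, then stale after grow, then live again. */
static int demo_stale_view(void) {
  toy_memory_t mem = { calloc(1, 64), 64, 0 };
  if (mem.base == NULL) return -1;

  toy_view_t view = toy_view_acquire(&mem);
  bool live_before = toy_view_data(&view) != NULL;

  if (!toy_memory_grow(&mem, 64)) {
    free(mem.base);
    return -1;
  }
  bool stale_after = toy_view_data(&view) == NULL;

  view = toy_view_acquire(&mem); /* re-acquire after the grow */
  bool live_again = toy_view_data(&view) != NULL;

  free(mem.base);
  return live_before && stale_after && live_again;
}
```

This only helps callers who go through the view each time. A pointer copied out of `toy_view_data` and kept across a grow dangles exactly as before.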

There are a lot of third-party APIs that a host might want to make available to Wasm that expect an (in or out) pointer to some array. Without being able to hand out pointers (in)to the Wasm memory, that would always require copying. For the Web we decided that's prohibitive.
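The trade-off can be sketched as two variants of the same host function. `third_party_fill` is a hypothetical library routine that writes into a caller-supplied buffer, and `data`/`size` are assumed to be what `wasm_memory_data` and `wasm_memory_data_size` returned immediately before the call, with no Wasm execution or grow in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical third-party routine that expects an out pointer. */
static void third_party_fill(uint8_t *out, size_t len) {
  for (size_t i = 0; i < len; i++) out[i] = (uint8_t)(i + 1);
}

/* Zero-copy: the library writes straight into the Wasm memory. */
static bool host_fill_in_place(uint8_t *data, size_t size,
                               size_t offset, size_t len) {
  if (offset > size || len > size - offset) return false;
  third_party_fill(data + offset, len);
  return true;
}

/* Copying alternative: the library never sees a pointer into Wasm memory. */
static bool host_fill_via_copy(uint8_t *data, size_t size,
                               size_t offset, size_t len) {
  uint8_t scratch[256]; /* arbitrary cap chosen for this sketch */
  if (len > sizeof scratch) return false;
  if (offset > size || len > size - offset) return false;
  third_party_fill(scratch, len);
  memcpy(data + offset, scratch, len);
  return true;
}

/* Returns 1 if both variants produce the same bytes and bounds are enforced. */
static int demo_zero_copy(void) {
  static uint8_t a[64], b[64]; /* stand-ins for two linear memories */
  bool ok = host_fill_in_place(a, sizeof a, 10, 4)
         && host_fill_via_copy(b, sizeof b, 10, 4)
         && !host_fill_in_place(a, sizeof a, 62, 4); /* straddles the end */
  return ok && memcmp(a, b, sizeof a) == 0 && a[10] == 1 && a[13] == 4;
}
```

Both variants give the same result. The in-place one avoids the extra buffer and copy, at the price of letting foreign code hold a raw pointer into the memory for the duration of the call.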

Yet we could still add various API for accessing the memory object more safely. The same discussion has occurred for the JS API (adding accessors to WA.Memory), I recall, but hasn't led to any concrete proposal. The problem is that it would probably have to be a lot of API surface -- analogues to all the memory, bulk memory, atomic instructions? And it doesn't solve the problem of proper synchronisation per se. Would it buy enough to be worth its weight?