bytecodealliance / lucet

Lucet, the Sandboxing WebAssembly Compiler.
Apache License 2.0
4.07k stars 164 forks source link

lucet-runtime: relax TLS checks when making Instance from Vmctx #654

Closed pchickey closed 3 years ago

pchickey commented 3 years ago

In the implementation of Vmctx, the function instance_from_vmctx is used frequently to access the &mut Instance corresponding to a Vmctx. Prior to this patch, that function would always check that the *const Instance pointer calculated relative to *const lucet_vmctx matches with the instance pointer stored in the thread-local CURRENT_INSTANCE refcell.

The purpose of this check is to provide some assurance that a *const lucet_vmctx passed in from WebAssembly has not been corrupted. It alao helps make sure that the stack-swapping machinery is behaving correctly. We do not believe it is necessary to perform this safety check at every use of Vmctx. Additionally, doing so makes it impossible to use Vmctx methods in future passed to Vmctx::block_on - more on this below.

Repeating this check in nearly every method call to Vmctx &self is belt-and-braces: in safe Rust there should be no way to invalidate the property of Vmctx that was already established during construction in from_raw.

CURRENT_INSTANCE is set by Instance::with_current_instance: the instance pointer is written to the refcell before swapping into the guest stack in Instance::swap_and_return, and cleared after the guest stack swaps back to the host stack.

The two places it seems most critical to perform this safety check are: 1. when constructing a Vmctx::from_raw - taking the possibly-corrupt *const lucet_vmctx passed in from WebAssembly and turning it into a Vmctx, and 2. upon returning to the guest stack from a yield, ensuring the Vmctx on the guest stack still matches the instance used by swap_and_return, even if the thread context has changed (e.g. in an async runtime).

This patch factors the thread-local storage check into instance_ensure_tls and performs the check in only those two cases.

Also, note that I removed a redundant inst.valid_magic() assertion in Vmctx::from_raw - that check is performed in instance_from_vmctx.

Testing

The most natural place where a Vmctx is valid, but CURRENT_INSTANCE is not, is during execution of a future given to Vmctx::block_on.

Therefore, I added a test to the async_hostcall suite where the future given to block_on makes a call to Vmctx::heap_mut, which is implemented in terms of instance_from_vmctx. Before this change, this test would fail by panicking in the tls check. With this change, the test passes and asynchronous code is able to call arbitrary Vmctx methods, e.g. to access embedding ctx or to modify the heap.

pchickey commented 3 years ago

I did this work on a branch where I had marked the block_on futures as non-Send, which is unsound. Dead end, trying again with #655