bytecodealliance / lucet

Lucet, the Sandboxing WebAssembly Compiler.
Apache License 2.0
4.06k stars 164 forks source link

Add a "bounded runtime" mode to guest execution. #612

Closed cfallin closed 3 years ago

cfallin commented 3 years ago

This modifies the instruction-counting mechanism to track an instruction-count bound as well; and when a bound is set, the Wasm guest will make a hostcall to yield back to the host context.

This is all placed under the run_async() function, so bounded-execution yields manifest as futures that are immediately ready to continue execution. This should give an executor main loop a chance to do other work at regular intervals.

cfallin commented 3 years ago

Just made a few tweaks to (hopefully) get a green CI again; recent changes in wasmtime mean we'll need to fix a few things in Lucet before merging this PR.

  1. The GuestMemory API changed; @pchickey I think the impl of this trait in Lucet needs to be updated as well?
  2. There were some build errors related to hostcall types; @fst-crenshaw this is I think related to your work in bytecodealliance/wasmtime#2419 -- will there be a Lucet update as part of that work too?

I'm happy to do either or both of these if needed, just let me know and I'll dig in further to understand the changes better :-) Thanks!

iximeow commented 3 years ago

@cfallin i've cooked up a patch for point 1, and am figuring out point 2 at the moment - i expect we can piece those changes out with a wasmtime update independent of this PR landing, so we don't have to worry about tying up too much in one PR

cfallin commented 3 years ago

Rebased on top of #614; this should be ready for review now (thanks @iximeow for the quick work on that PR!).

acfoltzer commented 3 years ago

Ah, it occurred to me that we also need a version of Instance::run_start() that has the same bounded time properties. @cfallin would you be able to add this before we merge?

cfallin commented 3 years ago

Updated, and added the async Instance::run_async_start() as well.

benaubin commented 3 years ago

Curious - would it be possible to expose these APIs publicly?

cfallin commented 3 years ago

@benaubin run_async() is a public method on InstanceHandle already -- or did you mean some other part of the API?

benaubin commented 3 years ago

@cfallin I'd love to have access to the InternalRunResult without using run_async in order to set an approximate CPU usage limit (a limit higher than reasonably expected, mostly to stop a while true {}). We plan to bill on CPU time, and can't do so with the run_async API.

cfallin commented 3 years ago

@benaubin I see -- the InternalRunResult is an implementation detail (it's how the yield loop in run_async() is made aware of time-bound yields), so I don't think it makes sense to expose that type as-is.

If you are not also using the value-yield functionality in hostcalls, you can implement your use-case with run_async() as-is, I think: manually inspect the returned future and consider the result to be a timeout if it is not ready after the first invocation.

Otherwise, the right thing to do would be to have a separate run_with_timeout() API, I think: this is semantically different than the existing run methods, all of which run to completion (possibly with yields) and so its type needs to be different.

If the latter is what you need, I'm happy to review a PR! Unfortunately I don't have time to work on this myself at the moment though.

benaubin commented 3 years ago

Happy to do a PR. Renaming InternalRunResult to BoundedRunResultand exposing it would provide a perfect run_with_timeout API as-is.

I'd be happy to build a timeout around CPU time as well, but it would require adding calls to libc for measuring the thread's cpu time - bloat that might not be applicable to all users.

benaubin commented 3 years ago

Oh, I see what you mean! There would need to be a run method that returns BoundedRunResult. I'll add a run_bounded method.

cfallin commented 3 years ago

@benaubin One other thing that I should probably note also, is that longer-term, the plan is to merge Lucet's functionality into Wasmtime (see blog post from Bytecode Alliance and another blog post on the merging efforts). There are active efforts to make this happen, e.g. async support in bytecodealliance/rfcs#2 and fast instance allocation in bytecodealliance/rfcs#5. In the meantime Lucet is absolutely supported, and as mentioned I'm happy to review a PR; but this may be relevant for your planning :-)

benaubin commented 3 years ago

Thanks for the heads up! We saw that, but want lucet's Send instances. Our host call surface is purposefully minimal (instances mostly communicate with an external API server) - which should make switching less painful (and sandboxing slightly easier).

benaubin commented 3 years ago

@cfallin I've put an initial draft PR together (#625). Adds a few _bounded methods, and replaces InternalRunResult with an additional Error variant.

benaubin commented 3 years ago

@cfallin On second thought, you were probably right that the InternalRunResult should remain a private implementation detail. I refactored the run_async_impl into a custom Future implementation, which gives the same benefit as exposing the bounded runtime apis, without complicating the API surface. Could you review PR #626?