bytecodealliance / lucet

Lucet, the Sandboxing WebAssembly Compiler.
Apache License 2.0
4.07k stars 164 forks source link

Move AsyncContext to field on Instance rather than field on State #644

Open benaubin opened 3 years ago

benaubin commented 3 years ago

Addresses #643. @pchickey

Here's the code diff versus main, prior to the revert of #626: https://github.com/bytecodealliance/lucet/compare/2a026708087ead515b63546c614cebe5e5ec0ff2...benaubin:custom-run-async-future

benaubin commented 3 years ago

Side note: it's kind of ironic that the Arc, which I originally introduced to remove unsafe/potentially unsound code, ended up causing an unsoundness issue.

benaubin commented 3 years ago

Rebased onto main. Squashed all the changes together to make it easier to cherry-pick onto #643.

kaisa001 commented 3 years ago

收到,谢谢! ----- 原始邮件 ----- 发件人:Ben Aubin notifications@github.com 收件人:bytecodealliance/lucet lucet@noreply.github.com 抄送人:Subscribed subscribed@noreply.github.com 主题:Re: [bytecodealliance/lucet] Move AsyncContext to field on Instance rather than field on State (#644) 日期:2021年02月27日 10点55分

Rebased onto main, squashed the changes together, so I could cherry-pick onto #643.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

benaubin commented 3 years ago

Translation: image

pchickey commented 3 years ago

As a quick update - the Lucet folks at Fastly have discussed this and we are keen on removing the asynchronous (signal-based) timeouts feature in favor of synchronous fuel-based timeouts. Our first reason is that making that change will make this code sound: all rust code that is safe ought to be sound, and the way in which this was unsound was surprising to us, the "experts", so how could anyone be confident? The second reason to get rid of asynchronous timeouts is that Wasmtime doesn't have them, and we don't think we want to add them there. Fastly and other Lucet users will need a migration path to Wasmtime and, if that feature can't ever migrate, we should get rid of it.

benaubin commented 3 years ago

@pchickey I agree with that rationale. Should there be a new issue to discuss replacing async signals? This PR sidesteps the signal safety issue entirely by moving the Arc to the instance, rather than State.

As a stopgap, forcing State to derive Copy will prevent any field from implementing Drop, preventing unsound behavior. Of course, that would require adjusting the Yielded state to remove Box

benaubin commented 3 years ago

On removing signals: I don't love gas/instruction counting as the sole criteria for stopping instances, because malicious guests can consume a much larger (orders of magnitude more) amount of CPU time (such as by forcing a large number of cache misses), versus benevolent guests.

Would it be sound to call a user-defined function for on_bound_expired? Then, it would be possible to check metrics like CPU time without having to pay a context switch each time.

Note: no matter what, signals will need to be handled by the instance. Especially SIGFPE/SIGSEGV, which could maybe be caused by a faulty guest, and should (probably) be handled with guest termination.

benaubin commented 3 years ago

Also, in case it helps review, here's the code diff versus the commit on main immediately prior to the revert of #626: https://github.com/bytecodealliance/lucet/compare/2a026708087ead515b63546c614cebe5e5ec0ff2...benaubin:custom-run-async-future