WebAssembly / wabt

The WebAssembly Binary Toolkit
Apache License 2.0
6.9k stars 702 forks source link

wasm2c: Use wrappers for function references #2465

Closed SoniEx2 closed 2 months ago

SoniEx2 commented 2 months ago

Clang 17(?) tightened UBSAN checks, so that you now get this:

- test/wasm2c/spec/call_indirect.txt
  expected error code 0, got 1.
  STDERR MISMATCH:
  --- expected
  +++ actual
  @@ -0,0 +1,3 @@
  +out/test/wasm2c/spec/call_indirect/call_indirect.0.c:2144:12: runtime error: call to function w2c_call__indirect__0__wasm_f0 through pointer to incorrect function type 'unsigned int (*)(void *)'
  +/home/runner/work/wabt/wabt/out/test/wasm2c/spec/call_indirect/call_indirect.0.c:1925: note: w2c_call__indirect__0__wasm_f0 defined here
  +SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior out/test/wasm2c/spec/call_indirect/call_indirect.0.c:2144:12 
  STDOUT MISMATCH:
  --- expected
  +++ actual
  @@ -1 +0,0 @@
  -134/134 tests passed.

This happens because emitted functions use a typed module instance, while function references use a void* instance. It is UB in C to call the former with the latter, so clang is correct here.

We had to pick one of two ways to fix this: either emit void* wrapper functions that do the appropriate downcasting for any module functions that go into a table (potentially including imported functions), or the approach that takes significantly less effort of changing everything to void* and downcasting internally. We obviously chose the latter. We eventually started emitting wrapper functions.

SoniEx2 commented 2 months ago

cc @shravanrn (can you take a look if this doesn't break anything with the segue stuff?) @keithw (can you take a look like, in general? tell us if this seems like the wrong approach...) @sbc100

sbc100 commented 2 months ago

Perhaps we can use some kind of annotation to tell UBSAN to ignore these call_indirect mismatches?

SoniEx2 commented 2 months ago

This appears to be relevant reading material: https://github.com/python/cpython/issues/111178

If we are understanding this right, this kind of thing is important for "control-flow integrity", a hardware feature(?) to prevent exploits. It would seem ill-advised to ignore it in wasm2c.

keithw commented 2 months ago

Is it feasible to instead change the CALL_INDIRECT macro to generate the correct type signature (instead of making everything void*)?

SoniEx2 commented 2 months ago

No, tables can change at runtime, even with dynamic linking (think dlopen()ing wasm2c'd modules). We don't know the type of the module instance, and we can't know all of the possible module instance types.

It's either this, or wrapper functions.

The issue isn't the argument types - CALL_INDIRECT already casts those correctly. It's the module instance.

shravanrn commented 2 months ago

cc @shravanrn (can you take a look if this doesn't break anything with the segue stuff?)

This should have no interaction with segue. Also the CI already includes a segue test, so it should (hopefully) be enough if we pass CI

If we are understanding this right, this kind of thing is important for "control-flow integrity", a hardware feature(?) to prevent exploits. It would seem ill-advised to ignore it in wasm2c.

So I assume you are referring to clang's software-enforced (and in the future possibly partially-hardware-backed) control-flow integrity (CFI). It's actually entirely fine to ignore clang CFI in wasm2c, as Wasm in effect has its own CFI built-in (the table's function type checks in Wasm are basically the same as Clang's CFI). Arguably, clang's CFI is slightly more fine-grained, but imho not sufficiently beneficial to make a significant change over.

To be clear this is different from whether we should worry about UB or not. In general, we should really try to avoid UB for reasons that have nothing to do with CFI support.

On the current PR : As a general rule, I am not a huge fan of regressing any strong typing guarantees we have in the code base. This is one of the few things preventing us from accidentally adding bugs as we add features. If anything, I think we should move to more strong types and less voids (or general types) where possible.

Given all this, I think the right approach for this PR is probably one of the other options suggested earlier

SoniEx2 commented 2 months ago

Either modify CALL_INDIRECT to call a wrapper function which does add the cast

this is proving to be more difficult than expected, and we didn't expect it to be easy...

and we noticed tail calls don't suffer from this issue. do they use wrapper funcs? no, they just use void*! (we're not touching that. don't fix what isn't broken and all that.)

SoniEx2 commented 2 months ago

rebased and rewritten

SoniEx2 commented 2 months ago

@sbc100 ping?