WebAssembly / wasi-libc

WASI libc implementation for WebAssembly
https://wasi.dev
Other
836 stars 199 forks source link

global dlmalloc structures clobbered without warning by large stack variable #233

Open aidanhs opened 3 years ago

aidanhs commented 3 years ago

(this probably doesn't belong here, but I couldn't decide which repo to raise it on)

Steps to reproduce:

  1. clone https://github.com/aidanhs/wasm-stack-clobber
  2. enter the repo directory
  3. download and untar wasi-sdk-12.0
  4. clone commit f2e779e5f1ba4a539937cedeeaa762c1e0c162df of wasi-libc and build it using that sdk
  5. download and untar wasmtime-v0.22.1-x86_64-linux
  6. run ./script.sh

You will see output like:

$ ./script.sh 
[...compilation...]
+ ./wasmtime-v0.22.1-x86_64-linux/wasmtime py.wasm
calc 3 0x150b0
zip_path 0x4030-0x8034 /x.zip 6
calc 7 0x15090 to 0x34490 32000
we have an overlap!

This is identifying that there is an overlap the allocations of _pythonpath (calc 3 line) and buf (calc 7 line) in https://github.com/aidanhs/wasm-stack-clobber/blob/master/interp-py.c#L79-L123. Both of these are mallocs.

For more information, now do cd wasi-libc && git apply ../wasi-libc.patch and rebuild wasi-libc. Additionally, edit script.sh to uncomment -DCUSTOM_LIBC. Now run ./script.sh again:

[...]
+ ./wasmtime-v0.22.1-x86_64-linux/wasmtime py.wasm
libc malloc 0x15120 8
libc calloc 0x15130 8
libc malloc 0x15140 48
calc 3 0x15140
inspecting (dlmalloc is at 0x4ee4):
    start: 0x15120, end: 0x15128, used: 12, arg: 0
    start: 0x15130, end: 0x15138, used: 12, arg: 0
    start: 0x15140, end: 0x15178, used: 60, arg: 0
    start: 0x15198, end: 0x1ffc8, used: 0, arg: 0
inspecting (dlmalloc is at 0x4ee4):
    start: 0x15120, end: 0x15128, used: 12, arg: 0
    start: 0x15130, end: 0x15138, used: 12, arg: 0
    start: 0x15140, end: 0x15178, used: 60, arg: 0
    start: 0x15198, end: 0x1ffc8, used: 0, arg: 0
zip_path 0x40c0-0x80c4 /x.zip 6
inspecting (dlmalloc is at 0x4ee4):
libc malloc 0x15120 32000
calc 7 0x15120 to 0x34520 32000
we have an overlap!
inspecting (dlmalloc is at 0x4ee4):
    start: 0x15120, end: 0x1ce28, used: 32012, arg: 0
    start: 0x1ce48, end: 0x1ffc8, used: 0, arg: 0

This printout is executing dlmalloc_inspect_all at key points of execution - you can see that dlmalloc reports all the allocations vanish after wcsncpy is executed. You can see in the printout that the dlmalloc global structure _gm_ is located within the bounds of the stack-allocated zip_path, so ends up being clobbered as part of wcsncpy.

Though I struggled to find documentation on how stack/heap/data are co-located in a memory space, I found a diagram on https://surma.dev/things/c-to-webassembly/ "The layout that wasm-ld uses is the following:" which seems to indicate the stack can easily overflow into the data section. This feels like it could be the cause of my issue (assuming that there isn't some undefined behaviour in my testcase I haven't spotted that explains it all - entirely possible!).

I can fix this by adding -Wl,-z,stack-size=$((8*1024*1024)) -Wl,--initial-memory=$((32*1024*1024)) to my final link, but I'm a bit frustrated that I couldn't find any way to figure out what was going on short of ripping apart the Python runtime to try and get a reduced test case. Two questions:

  1. is this a bug? i.e. should I have expected there to be something that aborts on stack overflow?
  2. if this is not a bug, is there an opportunity to change the layout so data is located in the 'middle' of the memory with stack and heap growing outwards? This means heap+stack growth will trap on overflow
sbc100 commented 3 years ago

You are correct about the default memory layout. That was chosen deliberately in order to give static address very small absolute values in order to minimize code size (the i32.const <CONST> instruction takes fewer bytes when const is small).

You can change the default by passing --stack-first to the linker (or -Wl,--stack-first to the compiler).

sunfishcode commented 3 years ago

Ideally, clang should also have an option to emit code which checks for overflow when incrementing the stack. I started some work in this direction in https://reviews.llvm.org/D70515, however I've not had time to pursue it further.

Paril commented 3 years ago

I've been banging my head against the wall with a random crash after random amounts of time for the past few days. Turns out my issue is here!

I'm surprised that it's possible to compile code that ends up having overlapping stack variables. Is there nothing that can be done to automatically detect this? It seems more like a Clang/LLVM issue more than anything, if it should know at compile time that stack will overflow into data section.

sunfishcode commented 3 years ago

As @sbc100 mentioned, you can pass -Wl,--stack-first to clang.

Beyond that, what's needed is stack overflow checking in LLVM's wasm backend. https://reviews.llvm.org/D70515 is a possible starting point, if anyone wants to work on this. In general it's not possible to detect stack overflow at compile time, because stack allocation happens at runtime, but it is possible to have it check for overflow when adjusting the stack pointer.

Paril commented 3 years ago

@sunfishcode Moving the stack to the front would just cause the crash to happen at the beginning of data rather than the end, wouldn't it?

sunfishcode commented 3 years ago

It means that overflowing the stack isn't likely to clobber global variables without trapping, because most likely it'll wrap around to 4GiB which most often won't be allocated, so accessing the memory there will likely trap.

kripken commented 3 years ago

Another option here is to use Binaryen's StackCheck pass which Emscripten's stack checks are based on. It instruments all assignments to the stack pointer to check for an overflow. Integration with wasi-libc should be simple, basically the pass adds a function __set_stack_limits which the runtime needs to call with two parameters, the top and bottom of the stack.

sunfishcode commented 3 years ago

We don't need new ad-hoc agreements between runtimes and applications; we just need to improve our tools.

Paril commented 3 years ago

@sunfishcode Oh I see what you mean, gotcha. I'm surprised that stack-first isn't the default, to be honest.

aidanhs commented 3 years ago

A followup question (or perhaps feedback): how could I have discovered the stack-first, stack-size and initial-memory flags outside of searching issues? https://webassembly.org/ has a number of links to reference-level stuff, as well as some basic tutorials (though they use emscripten). https://wasi.dev/ (and https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-documents.md) also have a number of links to reference level documentation. But I'm not sure where my entry point for 'in the trenches' documentation in the WASM world is when I'm trying to build stuff (particularly with wasi-sdk).

As a contrast, coming from emscripten, I can go to https://emscripten.org/ and know that the vast majority of things I'm interested in are either covered there, or linked directly from there - it's very oriented around "things I might want to do" and I've personally found it excellent for that (particular nod at https://emscripten.org/docs/getting_started/FAQ.html and https://emscripten.org/docs/porting/index.html).

sbc100 commented 3 years ago

A followup question (or perhaps feedback): how could I have discovered the stack-first, stack-size and initial-memory flags outside of searching issues? https://webassembly.org/ has a number of links to reference-level stuff, as well as some basic tutorials (though they use emscripten). https://wasi.dev/ (and https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-documents.md) also have a number of links to reference level documentation. But I'm not sure where my entry point for 'in the trenches' documentation in the WASM world is when I'm trying to build stuff (particularly with wasi-sdk).

If you google "wasm linker flags" it should take you directly to the wasm-ld linker docs: https://lld.llvm.org/WebAssembly.html Perhaps wask-sdk could link to this page (along with the clang flags) somewhere, but we certainly wouldn't want to duplicate it.

Bear in mind that this is toolchain specific stuff. More specifically its clang/llvm specific toolchain stuff. As such, its not related to core WASI or core WebAssembly.

As a contrast, coming from emscripten, I can go to https://emscripten.org/ and know that the vast majority of things I'm interested in are either covered there, or linked directly from there - it's very oriented around "things I might want to do" and I've personally found it excellent for that (particular nod at https://emscripten.org/docs/getting_started/FAQ.html and https://emscripten.org/docs/porting/index.html).

sunfishcode commented 3 years ago

I agree that it would be good to document more things. I've now submitted https://github.com/bytecodealliance/wasmtime/pull/2703 to document -Wl,--stack-first.