wasmerio / wasmer

🚀 The leading Wasm Runtime supporting WASIX and WASI
https://wasmer.io
MIT License

Different wasm execution result when a StackOverflow error occurs (same platform) #4181

Open grishasobol opened 1 year ago

grishasobol commented 1 year ago

Description

Hi! I have run into a problem using wasmer for our blockchain platform. We must guarantee strictly identical execution of wasm programs across different platforms, and we use wasmer for that. Unfortunately, the behaviour of these programs differs even on a single platform when a StackOverflow error occurs.

I understand that it may be impossible (or carry too much performance overhead) for wasmer to guarantee the same behaviour on different platforms when the stack overflows. In this issue, however, the difference appears between two binaries running on the same platform that were compiled with different linker flags.

Note: I just want to be sure that this is expected behaviour.

Steps to reproduce on macOS and Ubuntu

macOS environment:

Darwin Kernel Version 22.6.0 RELEASE_ARM64_T6000 arm64 M1

Ubuntu environment:

Linux #26~22.04.1-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux

  1. See my repository https://github.com/grishasobol/wasmer-fail.
  2. Clone the repository:
    git clone https://github.com/grishasobol/wasmer-fail.git
    cd wasmer-fail
  3. Run the tests with the release profile (I'm using the default linker):
    cargo test -v -p wasmer-fail --profile release -- check_wasm_execution_stack2 --nocapture
  4. Run the tests with the production profile (no code changes, same default linker, but with LTO):
    cargo test -v -p wasmer-fail --profile production -- check_wasm_execution_stack2 --nocapture
  5. The release and production profiles produce different output:

For macOS

Profile release: I64(13100)
Profile production: I64(13101)

For Ubuntu

Profile release: I64(16377)
Profile production: I64(16378)

Conclusion

As you can see, the execution result differs between platforms. More importantly, the result also differs on the same platform between binaries compiled with different optimisation flags.

ptitSeb commented 1 year ago

There is a structural difference between macOS on ARM64 and Linux/x86_64: the page size. My understanding is that Linux x86_64 uses 4K pages, while macOS/ARM64 uses 16K pages. The guard page of the stack will therefore have a different size, and a difference in behaviour can hardly be avoided here.

As for the profile issue, I don't understand why that would change the outcome of the wat execution. This needs some analysis.

grishasobol commented 1 year ago

A different page size cannot cause this problem, because it only changes the guard page size, not the stack size. I think the difference between ARM and x86 is caused by different stack usage in the generated machine code.
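
To make the stack-usage argument concrete, here is a minimal sketch of the arithmetic, not a measurement of wasmer. The helper `max_depth`, the 1 MiB stack size, and the 64-byte and 80-byte frame sizes are all hypothetical values chosen for illustration. It only shows that, for a fixed stack size, a larger per-call frame yields a smaller recursion depth before overflow.

```rust
/// Hypothetical helper: number of nested calls that fit in a stack of
/// `stack_bytes` when each call consumes `frame_bytes` of native stack.
fn max_depth(stack_bytes: u64, frame_bytes: u64) -> u64 {
    stack_bytes / frame_bytes
}

fn main() {
    // Assumed stack size (1 MiB); not taken from wasmer's configuration.
    let stack_bytes: u64 = 1024 * 1024;

    // Assumed frame sizes for two code generators; purely illustrative.
    for frame_bytes in [64u64, 80] {
        println!(
            "frame = {} bytes -> max depth = {}",
            frame_bytes,
            max_depth(stack_bytes, frame_bytes)
        );
    }
}
```

Under these assumptions the same stack holds 16384 calls with 64-byte frames but only 13107 calls with 80-byte frames, so a difference in generated code alone is enough to change the depth at which the overflow occurs.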

Michael-F-Bryan commented 1 year ago

Currently, our VM and compilers treat handling of stack overflows as implementation defined and we don't make any guarantees that the same program running on different systems will overflow their stack in the same way.

We currently use the same stack for both WebAssembly and runtime code, so enforcing consistent stack overflows across all systems would require a lot of refactoring of the VM and all compiler backends. This isn't impossible, but it's more work than we have bandwidth for at the moment.
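
One possible reading of why a shared stack makes the result sensitive to the build profile can be sketched as follows. This is a simplified model under stated assumptions, not wasmer's implementation: the function `wasm_depth`, the 1 MiB stack, the 64-byte Wasm frame, and the host overhead values are all hypothetical. The idea is that host (runtime and test harness) frames occupy part of the stack before Wasm code starts, so a small change in their size shifts the reachable Wasm call depth.

```rust
/// Hypothetical model: Wasm call depth reachable when host frames already
/// occupy `host_bytes` of a native stack shared with Wasm code.
fn wasm_depth(stack_bytes: u64, host_bytes: u64, frame_bytes: u64) -> u64 {
    // Whatever the host has not used is left for Wasm frames.
    stack_bytes.saturating_sub(host_bytes) / frame_bytes
}

fn main() {
    let stack_bytes: u64 = 1024 * 1024; // assumed stack size
    let frame_bytes: u64 = 64; // assumed size of one Wasm call frame

    // Two builds of the same host program; the overheads are made-up values
    // standing in for differences introduced by optimisation or LTO.
    for (build, host_bytes) in [("build A", 1000u64), ("build B", 1040)] {
        println!(
            "{}: host uses {} bytes -> wasm depth = {}",
            build,
            host_bytes,
            wasm_depth(stack_bytes, host_bytes, frame_bytes)
        );
    }
}
```

In this model a 40-byte change in host stack usage moves the depth from 16368 to 16367, which is the same kind of off-by-one shift reported between the two profiles. Whether that is the actual cause here would still need the analysis mentioned earlier in the thread.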

stale[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.