bytecodealliance / wizer

The WebAssembly Pre-Initializer
Apache License 2.0

wizer.initialize can't read file even if --allow-wasi --dir . specified? #24

Open dkegel-fastly opened 3 years ago

dkegel-fastly commented 3 years ago

Given the program

package main

import "os"

var theText string

//go:export wizer.initialize
func WizerInitialize() {
        buf, err := os.ReadFile("test.txt")
        if err != nil {
                println("error: Cannot read test.txt", err.Error())
        }
        theText = string(buf)
}

func main() {
        if theText == "" {
                WizerInitialize()
        }
        println("theText", theText)
}

This succeeds, and outputs foobar as expected:

tinygo build -target=wasi -scheduler=none -wasm-abi=generic -o main.wasm main.go
echo "foobar" > test.txt
wasmtime --dir . main.wasm

but this

wizer --allow-wasi --dir . -o out.wasm < main.wasm

fails with

error: Cannot read test.txt open test.txt: errno 76

Setting RUST_LOG=debug shows that it is at least preopening the directory:

[2021-07-20T16:12:44Z DEBUG wizer] Validating input Wasm
[2021-07-20T16:12:44Z DEBUG wizer] Preparing input Wasm
[2021-07-20T16:12:44Z DEBUG wasmtime_cache::worker] Cache worker thread started.
[2021-07-20T16:12:44Z DEBUG wasmtime_cache::worker] New nice value of worker thread: 3
[2021-07-20T16:12:44Z DEBUG wizer] Validating the exported initialization function
[2021-07-20T16:12:44Z DEBUG wizer] Calling the initialization function
[2021-07-20T16:12:44Z DEBUG wizer] Preopening directory: .
[2021-07-20T16:12:44Z DEBUG wizer] Creating dummy imports
error: Cannot read test.txt open test.txt: errno 76

Passing a bad directory to --dir does abort, so it is at least opening the directory.

This is on mac, so I have not gone through the agony of using dtruss to see if it is actually trying to open a file.

dkegel-fastly commented 3 years ago

Here's main.wasm zipped: main.wasm.zip

fitzgen commented 3 years ago

I'm not seeing anything particularly enlightening under strace. Because of that, my current hypothesis is that something is happening in user space that doesn't involve syscalls, and that the tinygo wasi implementation (correctly? incorrectly? not sure) doesn't recognize "test.txt" as being within the preopened directory ".", so it fails because it doesn't know how to find that file.

Can you try modifying the program to read `"./test.txt"` and see whether that works or not?

dkegel-fastly commented 3 years ago

already tried that, no joy.

Note that tinygo wasi does recognize test.txt when run without wizer (as demonstrated).

fitzgen commented 3 years ago

Updating wasmtime from 0.27 to 0.28 did not fix this issue, fwiw.

fitzgen commented 3 years ago

If I tell Wizer that the initialization function is the _start function via --init-func _start, then the wasm module runs just fine (and then Wizer snapshots the state after its execution finishes). Of course, we don't want to run the whole program during wizening, but this suggests that the issue is not the WASI context, as I originally suspected, and is instead something to do with the wasm itself.

I suspect that tinygo is producing wasm modules whose exported functions assume that global constructors have run and that the tinygo runtime is already initialized. We hit this with C++, where we have to make sure that we call global constructors before continuing with application-level initialization (Rust does not have global constructors, and therefore avoids this issue). Instead, I think tinygo is assuming that _start is the very first thing that is ever called, and does its runtime/global ctor initialization there. That assumption is violated by Wizer's execution model.

I wonder if this program (modulo any simple errors I may have written, as a non-gopher) works with Wizer?

package main

var theText string

//go:export wizer.initialize
func WizerInitialize() {
        theText = "initialized!"
}

func main() {
        println("theText: ", theText)
}

Even better would be a wizer.initialize that initializes a global using something that definitely relies on the runtime/global ctors having already run, but I'm not familiar enough with go/tinygo to know what that something might be.
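One possible shape for such a probe, as a rough sketch rather than a confirmed reproduction: the global below is only populated by an `init()` function, so if package initialization has not happened before `wizer.initialize` is called, the lookup should see a nil map. The names `registry` and `lookup` are made up for illustration. It is also an assumption that tinygo defers this `init()` to startup; if the compiler evaluates it at build time, the probe would not tell the two cases apart.

```go
package main

// registry is only populated by init(), which Go runs during package
// initialization, before main. (Hypothetical name, for illustration.)
var registry map[string]string

func init() {
	registry = make(map[string]string)
	registry["greeting"] = "hello"
}

var theText string

// lookup reports what the initialization function observes: a marker
// string if init() has not run yet, otherwise the stored value.
func lookup() string {
	if registry == nil {
		return "runtime not initialized"
	}
	return registry["greeting"]
}

//go:export wizer.initialize
func WizerInitialize() {
	theText = lookup()
}

func main() {
	if theText == "" {
		WizerInitialize()
	}
	println("theText:", theText)
}
```

If the wizened module reports "runtime not initialized", that would be consistent with the hypothesis that package initialization is tied to _start. Seeing "hello" would not rule it out, for the build-time evaluation reason mentioned above.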


So if my new hypothesis is correct, there are two ways to fix this:

Both of these are tinygo toolchain issues, but once we figure it out, we should document it and have examples in here.

dkegel-fastly commented 3 years ago

That little test program does print out theText: initialized!

FWIW reading from stdin is working out as a workaround.