tetratelabs / wazero

wazero: the zero dependency WebAssembly runtime for Go developers
https://wazero.io
Apache License 2.0

Document best practices around invoking a wasi module multiple times #985

Open pims opened 1 year ago

pims commented 1 year ago

Is your feature request related to a problem? Please describe.

Not a problem per se, just questions around writing production-grade code with Wazero.

Context: I have a simple Rust program that looks like this:

use prql_compiler::compile;
use std::io::{self, Read};

fn main() {
    // example input is "from employees | select [name,age]  ";

    let mut prql = String::new();
    let stdin = io::stdin();
    let mut handle = stdin.lock();
    handle.read_to_string(&mut prql).expect("failed to read input"); // read stdin input
    let sql = compile(&prql).expect("failed to compile"); // transform the input into something else
    println!("{}", sql);
}

It's been compiled with cargo wasi build --release and produced a binary. Validating it works as expected with wasmtime:

cat input.prql | wasmtime run target/wasm32-wasi/release/prql-wasi.wasm
SELECT
  name,
  age
FROM
  employees

I would like to provide a Go library that embeds this wasm binary. Here's what it currently looks like, with non-critical sections elided:

//go:embed prql-wasi.wasm
var prqlWasm []byte

// Engine holds a reference to the wazero runtime as well as the pre-compiled wasm module
type Engine struct {
    code wazero.CompiledModule
    r    wazero.Runtime
}

func (e *Engine) Compile(ctx context.Context, inBuf io.Reader, name string) (string, error) {
    outBuf := new(bytes.Buffer)
    errBuf := new(bytes.Buffer)
    config := wazero.NewModuleConfig().
        WithStdout(outBuf).WithStderr(errBuf).WithStdin(inBuf)

    // Instantiating a WASI module runs its _start function (main) to completion.
    mod, err := e.r.InstantiateModule(ctx, e.code, config.WithName(name))
    if mod != nil {
        // Close on every path; mod can be nil when instantiation fails.
        defer mod.Close(ctx)
    }
    if err != nil {
        var exitErr *sys.ExitError
        if !errors.As(err, &exitErr) {
            // Not a guest exit: instantiation itself failed.
            return "", &ParseError{Input: "who knows", Err: err}
        }
        if exitErr.ExitCode() != 0 {
            return "", err
        }
    }
    // Anything the guest wrote to stderr is treated as a failure.
    if errStream := errBuf.String(); errStream != "" {
        return "", errors.New(errStream)
    }
    return outBuf.String(), nil
}

func (e *Engine) Close(ctx context.Context) {
    e.r.Close(ctx)
}

func New(ctx context.Context) *Engine {

    r := wazero.NewRuntimeWithConfig(ctx, wazero.NewRuntimeConfig())
    wasi_snapshot_preview1.MustInstantiate(ctx, r)
    code, err := r.CompileModule(ctx, prqlWasm)
    if err != nil {
        log.Panicln(err)
    }

    return &Engine{
        r:    r,
        code: code,
    }
}

which would be used like this:

func main() {
    ctx := context.Background()
    p := prql.New(ctx)
    defer p.Close(ctx)

    content, _ := io.ReadAll(os.Stdin)
    s := string(content)
    inBuf := strings.NewReader(s)
    // simulate executing a transformation multiple times
    for i := 0; i < 500; i++ {
        name := fmt.Sprintf("run-%d", i)
        _, err := p.Compile(ctx, inBuf, name)
        inBuf.Seek(0, 0) // rewind so the next run reads the same input
        // handle err…
    }
}

Describe the solution you'd like

It would be greatly beneficial to have documented best practices around how to optimize running a wasi module multiple times.

Questions that could benefit from being documented

What are the best practices around invoking a wasi module multiple times? What are the performance implications of using wasi vs a standard wasm module?

Is it preferable to close the module each time:

mod, _ := e.r.InstantiateModule(ctx, e.code, config)
mod.Close(ctx)

or invoking it with a different name without closing it:

_, err := e.r.InstantiateModule(ctx, e.code, config.WithName(name))
// is mod.Close() necessary if we use a different name for each invocation?

Additional context

Wazero has been extremely fun to build prototypes with, and I'm curious about how to start working on production-grade code. A canonical production ready example would be a great starting point for many new projects.

codefromthecrypt commented 1 year ago

I'll give a quick answer as we've had some chats about this on #wazero-dev gophers slack.

The TL;DR is that main is not something to re-execute. In other words, it is not safe to re-execute a wasi guest's "_start", which runs on instantiate. The same applies to GOOS=js and its main. Re-instantiation is the way out, and you can control the name, or use a namespace, to avoid conflicts when running in parallel.

In any case, always close modules when done with them, unless you are only initing one per runtime and the runtime closes it. So, when any module is no longer needed, either due to its work being complete via init, or you no longer need its exports, close it. This advice applies to all guests, not just WASI.

If you didn't ask about wasi, and were asking about things that could be re-used (exported functions), I would suggest pooling as is done here. https://github.com/http-wasm/http-wasm-host-go/blob/main/handler/middleware.go#L151

-- more notes on wasi

We've tried it and the compilers don't expect that to run again. Many times, they will do things like "unreachable", and the behaviors of trying to execute "_start" again aren't something we can affect (as it is in how wasi-libc is used by compilers often enough).

help wanted to raise a PR in a place that makes sense weaving in this advice.

pims commented 1 year ago

Thanks for the quick response @codefromthecrypt

If you didn't ask about wasi, and were asking about things that could be re-used (exported functions), I would suggest pooling as is done here.

My question was mostly about wasi, so thanks for the clarifications.

help wanted to raise a PR in a place that makes sense weaving in this advice.

I'll give this a try once I figure out where it belongs.

shynome commented 5 months ago

I have created a golang wasm cgi server based on wazero, but performance is poor: only 137 qps.

How can I improve performance when reusing a wasi module?

Thank you for any reply!

goos: linux
goarch: amd64
pkg: github.com/shynome/go-wagi/z
cpu: AMD Ryzen 7 5700G with Radeon Graphics         
BenchmarkWASI-16             165           7272175 ns/op
PASS
ok      github.com/shynome/go-wagi/z    3.007s

z.go

package main

import "log"

func init() {
    log.Println("init")
}

func main() {
    log.Println("main")
}

z_test.go

package main

import (
    "context"
    _ "embed"
    "os"
    "os/exec"
    "sync"
    "testing"

    "github.com/shynome/err0/try"
    "github.com/tetratelabs/wazero"
    "github.com/tetratelabs/wazero/imports/wasi_snapshot_preview1"
)

func TestMain(m *testing.M) {
    cmd := exec.Command("go", "build", "-o", "z.wasm", ".")
    env := os.Environ()
    env = append(env, "GOOS=wasip1", "GOARCH=wasm")
    cmd.Env = append(cmd.Env, env...)
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    try.To(cmd.Run())

    wasm := try.To1(os.ReadFile("z.wasm"))

    ctx := context.Background()
    rtc := wazero.NewRuntimeConfigCompiler()
    rt = wazero.NewRuntimeWithConfig(ctx, rtc)
    wasi_snapshot_preview1.MustInstantiate(ctx, rt)

    cm = try.To1(rt.CompileModule(ctx, wasm))

    m.Run()
}

var (
    rt wazero.Runtime
    cm wazero.CompiledModule
)

func BenchmarkWASI(b *testing.B) {
    ctx := context.Background()
    mc := wazero.NewModuleConfig().
        WithSysNanotime().
        // WithStderr(os.Stderr).
        // WithStdin(os.Stdin).
        // WithStdout(os.Stdout).
        WithName("")
    var wg sync.WaitGroup
    wg.Add(b.N)
    for range b.N {
        go func() {
            defer wg.Done()
            mc := mc.WithName("")
            mod := try.To1(rt.InstantiateModule(ctx, cm, mc))
            mod.Close(ctx)
        }()
    }
    wg.Wait()
}

I fixed the z_test.go benchmark; the previous version had some problems.

ncruces commented 5 months ago

In your benchmark, you're not calling b.ResetTimer() after CompileModule, which means the 1001927653 ns (1 second) include compiling the module.

Back to your server, if you don't compile the module on your hot path (and you probably don't, if you get 100 qps), there's not much more to do.

shynome commented 5 months ago

Thanks for your tips. I have updated the benchmark; the new result is 137 qps, which is too slow.

How can I improve performance when reusing a wasi module?


there's not much more to do.

Sorry, I overlooked this. I will try to improve performance by changing the env variables and calling _start again.

Thanks for your reply again!


golang wasip1 gets 300 qps with wasmer wcgi, so maybe the problem is that the golang wasm binary is too big to execute quickly.

ncruces commented 5 months ago

Yes, and I don't think Go likes to have _start called again.

You should (maybe) consider TinyGo for the guest, and export functions, so you only instantiate once, and then call functions from an already instantiated module.