bytecodealliance / componentize-py

Apache License 2.0

Question: Current types limitation #47

Closed miguelaeh closed 10 months ago

miguelaeh commented 10 months ago

Hello guys,

First, this is great work, congrats!

I have been experimenting: I created a simple custom .wit file, implemented a function defined in it in Python, built the .wasm module, and then tried to call the function from a Rust application.

I was able to pass a custom record type as a parameter of the function, and the component build succeeded (that is, the .wasm file was generated correctly). However, with either the wasmer or wasmtime crates, the set of types you can pass to the call function as Value/Val is very small. Ref: https://docs.rs/wasmtime/latest/wasmtime/enum.Val.html

So my question is: am I missing something, or is it actually not possible right now to use a custom record as a parameter or return type of a function?

I know this is probably a general question rather than one specific to componentize-py, but I am not sure where the best place to ask would be.

dicej commented 10 months ago

Hi, @miguelaeh. Good question. The Val type you linked to is part of Wasmtime's API for working with core Wasm modules, whereas componentize-py and WIT use the Component Model API: https://docs.rs/wasmtime/latest/wasmtime/component/index.html.

The equivalent call function and Val type in that API are https://docs.rs/wasmtime/latest/wasmtime/component/struct.Func.html#method.call and https://docs.rs/wasmtime/latest/wasmtime/component/enum.Val.html, respectively. However, you probably don't want to use either of those. Instead, you'll want to use https://docs.rs/wasmtime/latest/wasmtime/component/macro.bindgen.html, which will generate host bindings for your .wit file, and those bindings will generally be easier to use and more efficient than using the dynamic call/Val API. See e.g. https://github.com/bytecodealliance/componentize-py/blob/main/src/test/tests.rs for an example of using that macro. We should probably add a simple example to the examples directory, too, since the tests are pretty complicated.
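
To illustrate the difference in ergonomics, here is a minimal, std-only sketch. `DynVal` and `Context` are hypothetical stand-ins written for this example; they are not Wasmtime's real `Val` type or actual `bindgen!` output. They only mimic the general shape of a dynamic value API versus a generated struct, using a WIT record such as `record context { to-change: string }` (posted further down in this thread).

```rust
// Hypothetical, simplified stand-in for a dynamically typed value.
// It is NOT wasmtime's real `Val`; it only mimics the general idea.
#[derive(Debug, Clone, PartialEq)]
enum DynVal {
    String(String),
    Record(Vec<(String, DynVal)>),
}

// Hypothetical stand-in for what generated bindings could provide for
// `record context { to-change: string }`: a plain Rust struct.
#[derive(Debug, Clone, PartialEq)]
struct Context {
    to_change: String,
}

// Dynamic style: field names are strings, so a typo in "to-change"
// would only be noticed at run time.
fn context_dynamic(to_change: &str) -> DynVal {
    DynVal::Record(vec![(
        "to-change".to_string(),
        DynVal::String(to_change.to_string()),
    )])
}

// Reading the field back needs a lookup by name plus a variant check.
fn read_to_change(val: &DynVal) -> Option<&str> {
    match val {
        DynVal::Record(fields) => fields.iter().find_map(|(name, v)| {
            match (name.as_str(), v) {
                ("to-change", DynVal::String(s)) => Some(s.as_str()),
                _ => None,
            }
        }),
        _ => None,
    }
}
```

With the typed struct, the same data is just `Context { to_change: "x".to_string() }`, and a misspelled field name fails at compile time instead of at run time.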

Hope that helps. I'm happy to answer any follow-up questions here, and https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime is also a great place to ask questions.

miguelaeh commented 10 months ago

Thank you very much @dicej for the clarifications! I will try the approach you mentioned and let you know.

Feel free to close this issue, since it is not actually an issue.

miguelaeh commented 10 months ago

Hi @dicej ,

Following the docs you shared, it was really simple to create the Rust program, and it recognizes the types generated by bindgen. However, it hangs while creating the component from the .wasm file generated by componentize-py, until the OS eventually kills the process.

use wasmtime::component::Component;
use wasmtime::{Config, Engine};

fn main() {
    let mut config = Config::new();
    config.wasm_component_model(true);
    config.debug_info(true);
    let engine = Engine::new(&config).unwrap();
    println!("Creating component");
    // This call never returns (see below).
    let component = Component::from_file(&engine, "./process.wasm").unwrap();
    println!("Component created");
    ...

After a long time, it gets killed by the OS:

➜  pipeless-wit git:(master) ✗ cargo run
   Compiling pipeless-wit v0.1.0 (/home/miguelaeh/projects/wasm-components-experiments/pipeless-wit)
    Finished dev [unoptimized + debuginfo] target(s) in 8.27s
     Running `target/debug/pipeless-wit`
Creating component
[1]    428659 killed     cargo run

For some context, this is my Python implementation:

import hook
from hook.imports.types import Frame, Context
class Hook(hook.Hook):
    def hook(self, frame: Frame, ctx: Context) -> str:
        return "Hello, World!"

and the WIT:

package pipeless:hooks;

interface types {
  record frame {
    uuid: string,
    original: list<list<list<u8>>>,
    modified: list<list<list<u8>>>,
    width: u32,
    height: u32,
    pts: u64,
    dts: u64,
    duration: u64,
    fps: u8,
    input-timestamp: float64,
    inference-input: list<list<list<u8>>>,
    inference-output: list<list<list<u8>>>,
    pipeline-id: string,
  }

  record context {
    to-change: string,
  }
}

world hook {
    use types.{frame, context};
    export hook: func(f: frame, c: context) -> string;
}

This is my first time using the component model, so I may be missing something trivial here.

dicej commented 10 months ago

Hi @miguelaeh,

I just pushed a complete test case which works for me here: https://github.com/dicej/pipeless.

Note that it is important to build the host with the --release flag, since the default debug build is much slower. Cranelift (which is the compiler Wasmtime uses to compile Wasm to native code) has a lot of work to do with components generated by componentize-py since they include the entire CPython interpreter and native library code. Even with --release compiling the component takes over 5 seconds on my Mac M2 Pro, and the debug build takes a lot longer.
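
Since compilation can legitimately take several seconds, it may help to time that step rather than guess whether it has hung. The following is a minimal sketch using only the standard library; `timed` is a hypothetical helper name invented for this example, not part of Wasmtime.

```rust
use std::time::{Duration, Instant};

/// Runs `work`, prints how long it took, and returns the result
/// together with the elapsed time.
/// `timed` is a hypothetical helper for this sketch.
fn timed<T>(label: &str, work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = work();
    let elapsed = start.elapsed();
    println!("{} took {:?}", label, elapsed);
    (value, elapsed)
}
```

In the host program from earlier in this thread, the closure would wrap the compile step, for example `timed("compile", || Component::from_file(&engine, "./process.wasm"))`.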

For reference, you can pre-compile a .wasm file to a native .cwasm file using wasmtime compile if performance is an issue. You can also use instantiate_pre instead of instantiate if you find you need to make a lot of instantiations of the same component.
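
One way a host might choose between a precompiled artifact and the original module is sketched below, using only the standard library. `Artifact` and `pick_artifact` are hypothetical names for this example, and the file layout (a .cwasm sitting next to the .wasm) is an assumption. In a real host, the precompiled path would, to my understanding, be loaded with `Component::deserialize_file` (an unsafe function, because the file must be trusted and produced by a compatible Wasmtime) and the source path with `Component::from_file`.

```rust
use std::path::{Path, PathBuf};

/// Which file the host should load. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
enum Artifact {
    /// A precompiled `.cwasm`, e.g. produced by `wasmtime compile`.
    Precompiled(PathBuf),
    /// The original `.wasm`, which must be compiled at load time.
    Source(PathBuf),
}

/// Prefers `<name>.cwasm` next to `<name>.wasm` when that file exists.
/// Note: this does not check whether the `.cwasm` is stale.
fn pick_artifact(wasm: &Path) -> Artifact {
    let cwasm = wasm.with_extension("cwasm");
    if cwasm.is_file() {
        Artifact::Precompiled(cwasm)
    } else {
        Artifact::Source(wasm.to_path_buf())
    }
}
```

Calling `pick_artifact(Path::new("./process.wasm"))` would then tell the host whether it can skip the slow compile step.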

dicej commented 10 months ago

One more thing: I noticed you mentioned that the OS is killing your process before it finishes compiling the component. That might mean it's running out of memory, which --release might help with. If that doesn't help, you might need to use wasmtime compile to generate a .cwasm on a machine with more memory.

miguelaeh commented 10 months ago

Thank you very much @dicej! I really appreciate the detailed explanation and you taking the time to create the complete test case.

Now it is working. In the end, it was not an issue with the code but with my system. For some reason I had a large amount of leaked memory, which caused the system to run out of memory while compiling the component. After rebooting, it works. I will also take a look at the wasmtime compile command and instantiate_pre, since they look like something I will need for what I am doing.

Again, thank you very much, and keep up the good work! I am really excited about using componentize-py and the component model to run Pipeless hooks. For some context, we currently use PyO3 there to run Python hooks from Rust every time a new frame arrives.