WebAssembly / wasi-nn

Neural Network proposal for WASI

Question: Compatibility with XGBoost or scikit-learn models #58

Open chongshenng opened 11 months ago

chongshenng commented 11 months ago

Hello! Firstly, thanks for the amazing work on enabling WebAssembly for ML 🤗!

I'm fairly new to adopting WebAssembly for machine learning, but I'm particularly interested in compiling non-neural-network models to WebAssembly for inference. Can I clarify whether the backend support for wasi-nn:

TensorFlow, ONNX, OpenVINO, etc.

and specifically the ONNX backend, implies that wasi-nn should also work with XGBoost and scikit-learn models/pipelines, for example those described in this documentation? The idea is that if I have an existing XGBoost model I'd like to deploy for inference, I would first convert the model to the ONNX format and then write the wasi-nn bindings (AssemblyScript or Rust) to execute it?

Thanks in advance!

abrown commented 11 months ago

@chongshenng, it sounds like there are several options here.

Once you get the above piece figured out ("what engine and backend will run this model?"), you can write a program in some language, compile it to WebAssembly, and execute the model. Here's a good example of that using Wasmtime and OpenVINO: main.rs. Notice that, since we're using Rust and some Rust bindings already exist (the wasi-nn crate), you won't need to create any additional bindings yourself; just use the crate and compile to the wasm32-wasi target. If you use some other language, e.g., C, you would have to create the bindings yourself. Again, this is not too difficult, but you may not be interested in that part.
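Roughly, that example boils down to something like this (a condensed sketch using the wasi-nn crate's classic unsafe API; the file names, tensor shape, and output length are illustrative, not fixed by the spec):

```rust
// Condensed sketch of inference via the wasi-nn crate (OpenVINO backend).
// File names, tensor shape, and output length are illustrative.
use std::fs;

fn main() {
    let xml = fs::read("fixture/model.xml").unwrap(); // OpenVINO IR description
    let weights = fs::read("fixture/model.bin").unwrap(); // OpenVINO IR weights

    // Load the graph into the host-side backend and set up an execution context.
    let graph = unsafe {
        wasi_nn::load(
            &[&xml[..], &weights[..]],
            wasi_nn::GRAPH_ENCODING_OPENVINO,
            wasi_nn::EXECUTION_TARGET_CPU,
        )
        .unwrap()
    };
    let context = unsafe { wasi_nn::init_execution_context(graph).unwrap() };

    // An input tensor in NCHW layout; `input` holds the raw f32 bytes.
    let input = fs::read("fixture/tensor.bgr").unwrap();
    let tensor = wasi_nn::Tensor {
        dimensions: &[1, 3, 224, 224],
        type_: wasi_nn::TENSOR_TYPE_F32,
        data: &input,
    };
    unsafe {
        wasi_nn::set_input(context, 0, tensor).unwrap();
        wasi_nn::compute(context).unwrap();
    }

    // Copy the raw output bytes back out of the backend.
    let mut output = vec![0f32; 1001];
    unsafe {
        wasi_nn::get_output(
            context,
            0,
            output.as_mut_ptr() as *mut u8,
            (output.len() * std::mem::size_of::<f32>()) as u32,
        )
        .unwrap();
    }
    println!("first scores: {:?}", &output[..5]);
}
```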

Some more details on bindings (feel free to ignore if this is too much!): because this wasi-nn specification has switched to using the WIT language, the bindings could be auto-generated for you by wit-bindgen. This allows you to use more languages, but note that not all engines support the component model (i.e., the ABI for WIT) yet, so for those engines this path is not helpful.
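As a rough illustration (the `path` and `world` values here are assumptions for the sake of the example, not something the spec pins down), pulling in auto-generated bindings could look like this:

```rust
// Hypothetical sketch: auto-generating Rust bindings from the wasi-nn WIT
// package with wit-bindgen instead of hand-writing them.
wit_bindgen::generate!({
    path: "wit",  // directory containing the wasi-nn .wit files (assumed layout)
    world: "ml",  // the world to generate bindings for (assumed name)
});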

To sum it up: (1) decide which ML backend to use (ONNX?), (2) decide which engine to use and make sure it supports the backend, and (3) compile your code to WebAssembly and run it in the engine. Hope that helps!

chongshenng commented 11 months ago

@abrown, thank you for the clear explanations! I appreciate the details; they help my understanding as well.

I would very much be interested in learning about and helping with running ONNX models. Let me reach out to @devigned separately to understand what currently exists.

Am I correct in understanding that [tract](https://github.com/sonos/tract) (specifically tract_onnx) is another option for running ONNX models in WebAssembly? How is wasi-nn different from tract_onnx?

abrown commented 11 months ago

I think what tract is trying to do is compile the ML backend itself, i.e., the implementations of all the operators, into a WebAssembly module (that is, if I remember correctly how they use WebAssembly). I found an example where they explain this a bit. The major difference is that wasi-nn delegates to an ML backend outside the WebAssembly sandbox, which can use whatever special hardware features the system has available (threads, wider SIMD, special instructions, etc.). From measurements @mingqiusun and I did some time ago, there is a sizable performance gap between using an ML backend outside the sandbox (wasi-nn) and using one inside it, but the inside approach is more portable.
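For comparison, a minimal tract_onnx sketch along the lines of tract's README (the model path and input shape are illustrative); note that everything here, operators included, compiles into the Wasm module itself, with no host-side ML backend involved:

```rust
// Minimal tract_onnx inference sketch (model path and shape are illustrative).
// All operator implementations run inside the module, unlike wasi-nn.
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        // Declare the input shape so the graph can be optimized.
        .with_input_fact(0, f32::fact([1, 3, 224, 224]).into())?
        .into_optimized()?
        .into_runnable()?;

    // Dummy all-zeros input; real code would fill this with actual data.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();
    let outputs = model.run(tvec!(input.into()))?;
    println!("output shape: {:?}", outputs[0].shape());
    Ok(())
}
```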