bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0

Support additional Execution Providers in ONNX `wasi-nn` backend #8547

Open kaivol opened 7 months ago

kaivol commented 7 months ago

Feature

Currently, the ONNX backend in wasmtime-wasi-nn uses only the default CPU execution provider and ignores the ExecutionTarget requested by the WASM caller: https://github.com/bytecodealliance/wasmtime/blob/24c1388cd74ab321d60af147fc074d12166258fd/crates/wasi-nn/src/backend/onnxruntime.rs#L21-L33

I would like to suggest adding support for additional execution providers (CUDA, TensorRT, ROCm, ...) to wasmtime-wasi-nn.

Benefit

Improved performance for WASM modules using the wasi-nn API.

Implementation

ort already has support for many execution providers, so integrating these into wasmtime-wasi-nn should not be too much work. I would be interested in looking into this; however, I only really have the means to test the DirectML and NVIDIA CUDA / TensorRT EPs.
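
To make the idea concrete, here is a rough, self-contained sketch of how the requested target could be turned into an ordered list of execution providers with a CPU fallback. Everything in it is an assumption for illustration: the `ExecutionTarget` and `Provider` enums, `providers_for`, `select_provider`, and the preference order are hypothetical stand-ins, not the actual wasmtime-wasi-nn or ort types, and the availability check is passed in as a closure rather than calling into ort.

```rust
// Hypothetical sketch: these are NOT the real wasmtime-wasi-nn or ort types.

/// Stand-in for the targets a wasi-nn caller can request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionTarget {
    Cpu,
    Gpu,
    Tpu,
}

/// Stand-in for the execution providers mentioned in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Cpu,
    Cuda,
    TensorRt,
    Rocm,
    DirectMl,
}

/// Returns candidate providers for a target, most preferred first.
/// The ordering is an assumption; CPU is always the last resort.
pub fn providers_for(target: ExecutionTarget) -> Vec<Provider> {
    match target {
        ExecutionTarget::Cpu => vec![Provider::Cpu],
        ExecutionTarget::Gpu => vec![
            Provider::TensorRt,
            Provider::Cuda,
            Provider::Rocm,
            Provider::DirectMl,
            Provider::Cpu,
        ],
        // Assumption: no TPU-capable provider is wired up, so use the CPU.
        ExecutionTarget::Tpu => vec![Provider::Cpu],
    }
}

/// Picks the first candidate that the caller-supplied check reports as
/// available, falling back to the CPU provider if none is.
pub fn select_provider(
    target: ExecutionTarget,
    is_available: impl Fn(Provider) -> bool,
) -> Provider {
    providers_for(target)
        .into_iter()
        .find(|p| is_available(*p))
        .unwrap_or(Provider::Cpu)
}
```

In a real backend, the `is_available` closure would presumably be replaced by whatever availability check and registration mechanism ort exposes for each provider, and a missing accelerator might be better reported as an error than silently downgraded to the CPU.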

Alternatives

Leave it to the users to add support for additional execution providers.

abrown commented 5 months ago

I was looking at old issues and ran across this one (sorry for such a late reply!): I completely agree with this idea. I am tempted to say "go for it!" but maybe there is some coordination needed. E.g., I think @jianjunz has started enabling some DirectML bits in #8756. And @devigned may have some opinions on the best way to do this. But from my perspective, this seems like a worthwhile avenue to pursue.

devigned commented 5 months ago

I think this is a great idea! One interesting part will be testing. We may need to spin up some hardware to make sure the functionality stays evergreen.