**Open** · kaivol opened this issue 7 months ago
### Feature
Currently the ONNX backend in `wasmtime-wasi-nn` only uses the default CPU execution provider and ignores the `ExecutionTarget` requested by the WASM caller:

https://github.com/bytecodealliance/wasmtime/blob/24c1388cd74ab321d60af147fc074d12166258fd/crates/wasi-nn/src/backend/onnxruntime.rs#L21-L33

I would like to suggest adding support for additional execution providers (CUDA, TensorRT, ROCm, ...) to `wasmtime-wasi-nn`.
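To make the suggestion concrete, here is a rough sketch of how the requested target could pick the providers. The `ExecutionTarget` enum below is a stand-in for the Cpu/Gpu/Tpu variants in the wasi-nn wit types, `providers_for` and `build_session` are hypothetical helpers, and the `ort` paths follow the 2.0 release candidates, so exact names may differ between versions:

```rust
// Sketch only: map the wasi-nn execution target onto ort execution
// providers instead of always using the default CPU provider.
use ort::{CUDAExecutionProvider, DirectMLExecutionProvider, ExecutionProviderDispatch};

// Stand-in for the Cpu/Gpu/Tpu target exposed through the wasi-nn wit types.
enum ExecutionTarget {
    Cpu,
    Gpu,
    Tpu,
}

// Hypothetical helper: choose providers in preference order for a target.
fn providers_for(target: &ExecutionTarget) -> Vec<ExecutionProviderDispatch> {
    match target {
        // CPU is ort's built-in default, so nothing extra is registered.
        ExecutionTarget::Cpu => Vec::new(),
        // Try GPU-capable providers; ort falls back to the CPU provider if
        // none of them can actually be registered at runtime.
        ExecutionTarget::Gpu => vec![
            CUDAExecutionProvider::default().build(),
            DirectMLExecutionProvider::default().build(),
        ],
        // No ort execution provider maps cleanly to Tpu today; a real
        // implementation would probably return an "unsupported" error here.
        ExecutionTarget::Tpu => Vec::new(),
    }
}

// Hypothetical load path: the same session setup the backend does today,
// plus the provider list derived from the caller's requested target.
fn build_session(model: &[u8], target: &ExecutionTarget) -> ort::Result<ort::Session> {
    ort::Session::builder()?
        .with_execution_providers(providers_for(target))?
        .commit_from_memory(model)
}
```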
### Benefit
Improved performance for WASM modules using the `wasi-nn` API.

### Implementation
`ort` already has support for many execution providers, so integrating these into `wasmtime-wasi-nn` should not be too much work. I would be interested in looking into this; however, I only really have the means to test the DirectML and NVIDIA CUDA / TensorRT EPs.
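Since a provider that is compiled in may still be unusable on a given host (missing CUDA driver, wrong OS), the backend would probably want to filter on availability and keep CPU as the implicit fallback. A minimal sketch, assuming ort's `ExecutionProvider` trait with `is_available()` returning `Result<bool>` as in the 2.0 release candidates; `available_gpu_providers` is a hypothetical helper:

```rust
use ort::{
    CUDAExecutionProvider, DirectMLExecutionProvider, ExecutionProvider,
    ExecutionProviderDispatch, ROCmExecutionProvider, TensorRTExecutionProvider,
};

// Hypothetical helper: collect the GPU-capable providers that are actually
// usable on this host, in preference order. An empty list means ort will
// just use its default CPU provider.
fn available_gpu_providers() -> Vec<ExecutionProviderDispatch> {
    let mut providers = Vec::new();
    // is_available() reflects whether the provider was compiled in *and*
    // its runtime dependencies (CUDA driver, DirectML.dll, ...) can load.
    if TensorRTExecutionProvider::default().is_available().unwrap_or(false) {
        providers.push(TensorRTExecutionProvider::default().build());
    }
    if CUDAExecutionProvider::default().is_available().unwrap_or(false) {
        providers.push(CUDAExecutionProvider::default().build());
    }
    if ROCmExecutionProvider::default().is_available().unwrap_or(false) {
        providers.push(ROCmExecutionProvider::default().build());
    }
    if DirectMLExecutionProvider::default().is_available().unwrap_or(false) {
        providers.push(DirectMLExecutionProvider::default().build());
    }
    providers
}
```

Each provider also sits behind a cargo feature on `ort` (`cuda`, `tensorrt`, `rocm`, `directml`, ...), so `wasmtime-wasi-nn` would presumably forward matching features and only compile in what the embedder asks for.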
### Alternatives

Leave it to the users to add support for additional execution providers.

---

**Comment:** I was looking at old issues and ran across this one (sorry for such a late reply!): I completely agree with this idea. I am tempted to say "go for it!" but maybe there is some coordination needed. E.g., I think @jianjunz has started enabling some DirectML bits in #8756. And @devigned may have some opinions on the best way to do this. But from my perspective, this seems like a worthwhile avenue to pursue.

**Comment:** I think this is a great idea! One interesting part will be testing. We may need to spin up some hardware to make sure the functionality stays evergreen.