Open abrown opened 1 year ago
Now that we have WIT, we no longer need to allocate an exact-size buffer to contain output tensors.
I suspect that in practice and for initial use cases users of wasi-nn will know all this information already, but this functionality would be great if you need to build an application that surfaces information about a bunch of models in a dynamic model repository or if you are debugging during development (though that should really be handled by a descriptive error type).
As has been discussed elsewhere, it can be quite convenient to be able to describe a graph's input and output tensor dimensions and types. For example, a user of wasi-nn currently has to know the exact specifications of the ML graph being used at compile time. This is inconvenient if one wants to specify the model dynamically. Also, users of wasi-nn in languages like Rust must allocate exact-size buffers to contain the output tensors. These issues can be resolved by allowing users to inspect some details of the graph's inputs and outputs.
I believe wasi-nn could be improved by adding the following WIT:
One additional refactoring this suggests is to use `tensor-description` in the `tensor` record (see here). With these additional functions in the specification, we could properly create high-level bindings as is suggested here: https://github.com/bytecodealliance/wasi-nn/issues/68.
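To make the buffer-sizing motivation concrete, here is a minimal Rust sketch of how a caller might use a tensor description once it is available. Everything in it is an assumption for illustration: `TensorDescription`, `TensorType`, `byte_len`, and `allocate_output` are hypothetical names, not the WIT proposed in this issue and not the current wasi-nn bindings.

```rust
/// Hypothetical element types; the variants and sizes are assumptions
/// for this sketch, not the wasi-nn definition.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum TensorType {
    F16,
    F32,
    U8,
    I32,
}

impl TensorType {
    /// Size of one element in bytes.
    fn size_in_bytes(self) -> usize {
        match self {
            TensorType::F16 => 2,
            TensorType::F32 => 4,
            TensorType::U8 => 1,
            TensorType::I32 => 4,
        }
    }
}

/// Hypothetical description of a graph input or output:
/// its dimensions and its element type.
#[derive(Debug, Clone, PartialEq)]
struct TensorDescription {
    dimensions: Vec<u32>,
    tensor_type: TensorType,
}

impl TensorDescription {
    /// Total size in bytes, or `None` if the product overflows `usize`.
    fn byte_len(&self) -> Option<usize> {
        self.dimensions
            .iter()
            .try_fold(self.tensor_type.size_in_bytes(), |acc, &dim| {
                acc.checked_mul(dim as usize)
            })
    }
}

/// Allocate a zeroed output buffer sized from the description,
/// instead of hard-coding the size at compile time.
fn allocate_output(desc: &TensorDescription) -> Option<Vec<u8>> {
    desc.byte_len().map(|len| vec![0u8; len])
}
```

With a description of dimensions `[1, 1000]` and type `F32`, `allocate_output` would return a 4000-byte buffer; the caller never needs to know the model's output shape ahead of time.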