abrown opened 3 years ago
The initial framework is being set up by https://github.com/bytecodealliance/wasi-nn/pull/48. In the long run, we can create a library file containing the image_to_tensor function, which Rust and AssemblyScript apps that use wasi-nn can then import.
An issue we ran into with Rune is that an image can have multiple layouts, depending on how the image is intended to be used and on data locality.
For example, we've found that TensorFlow models tend to prefer images laid out as [height, width, channels], while PyTorch prefers [channels, height, width]. The former is more convenient/performant when you are working on all pixels at the same time, while the latter keeps all the pixels for the same channel stored sequentially (e.g. because you want to use different parameters when normalizing red pixels as opposed to blue).
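To make the difference concrete, here is a minimal Rust sketch of the re-ordering, assuming an already-decoded 8-bit RGB buffer; the helper name is made up for this example and is not part of any existing crate:

```rust
/// Re-order an interleaved [height, width, channels] (HWC) buffer into a
/// planar [channels, height, width] (CHW) buffer.
/// Assumes hwc.len() == h * w * c.
fn hwc_to_chw(hwc: &[u8], h: usize, w: usize, c: usize) -> Vec<u8> {
    let mut chw = vec![0u8; hwc.len()];
    for y in 0..h {
        for x in 0..w {
            for ch in 0..c {
                // HWC: channels interleaved per pixel.
                // CHW: one contiguous plane per channel.
                chw[ch * h * w + y * w + x] = hwc[(y * w + x) * c + ch];
            }
        }
    }
    chw
}
```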
@Michael-F-Bryan so this means that we may need multiple image-to-tensor implementations, depending on which underlying framework is used?
Yep, so you might have an image_to_tensor_with_channels_width_height() function and an image_to_tensor_with_width_height_channels() function, and so on, preferably with more concise names.
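Purely for illustration, one way a shared Rust library could keep the names concise is a single entry point with a layout parameter rather than one function per layout. Every name below (TensorLayout, image_to_tensor) is hypothetical, and the sketch assumes already-decoded 8-bit RGB input:

```rust
/// Hypothetical layouts a caller could request from the shared library.
pub enum TensorLayout {
    /// [height, width, channels], often preferred by TensorFlow models.
    Hwc,
    /// [channels, height, width], often preferred by PyTorch models.
    Chw,
}

/// Hypothetical single entry point: produce the requested layout from
/// interleaved RGB bytes instead of exposing one image_to_tensor_with_*
/// variant per layout.
pub fn image_to_tensor(rgb: &[u8], height: usize, width: usize, layout: TensorLayout) -> Vec<u8> {
    match layout {
        // Decoded RGB bytes are already interleaved per pixel, i.e. HWC.
        TensorLayout::Hwc => rgb.to_vec(),
        // Re-order into per-channel planes for CHW consumers.
        TensorLayout::Chw => {
            let (h, w, c) = (height, width, 3usize);
            let mut chw = vec![0u8; rgb.len()];
            for y in 0..h {
                for x in 0..w {
                    for ch in 0..c {
                        chw[ch * h * w + y * w + x] = rgb[(y * w + x) * c + ch];
                    }
                }
            }
            chw
        }
    }
}
```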
@geekbeast, commenting here to keep related topics in this issue: @brianjjones is actively working on this issue (#70, #72, #73) and I'm sure would be happy to discuss any parts that aren't done yet. The image2tensor crate is up, but I believe there are some limitations around image precision that might need to be addressed?
@brianjjones, what else do you think is needed before closing this issue?
For ease of use, users will likely want to decode images online (at runtime) rather than offline, as we currently do with the openvino-tensor-converter tool. Here are some possible libraries to investigate: