Michael-F-Bryan opened this issue 2 years ago (status: Open)
@Michael-F-Bryan long-term maintainability will be more problematic though. tract does NOT implement all the operators that tf-lite / ONNX provide. Even its ONNX support is not 100%, and this is a moving target. So whenever a user's model doesn't work, we get the bug reports (and maintenance burden) instead of upstream tensorflow/onnx. tract's statement on TensorFlow 2 support is basically:
> Additionally, the complexity of TensorFlow 2 makes it very unlikely that direct support will ever exist in tract. Many TensorFlow 2 nets can be converted to ONNX and loaded in tract.
So we'd be going backwards on the actual user-facing features we support this way (TF 1.0 only, an incomplete ONNX feature set, etc.).
That being said, tract could make a good starting point for us to try out wasi-nn. Especially if we want to target microcontroller world (librunecoral is a no-go for that). Eventually I'd like even librunecoral to support wasi-nn, but let's see how much time / resources we can allocate for that. We still have to kill the old C++ based RuneVM.
Personally, any framework is fine with me as long as we get zero-copy pipelines and can use appropriate hardware acceleration for the use cases we wish to support (e.g. if we want to use Rune for some kind of video processing we need TPU/GPU acceleration there, but for just text/audio based models we can get away without hardware acceleration).
I think almost everyone on the HOTG team has expressed a desire to support more ML frameworks at some point, in particular ONNX and TensorFlow. However, I was reluctant to use bindings that go through their official C++ implementations after seeing how much trouble we had integrating TensorFlow Lite.
When I was playing around with hotg-ai/wasi-nn-experiment I came across a pure Rust implementation of TensorFlow and ONNX inference called `tract`. This was able to cross-compile to `aarch64-linux-android` and `wasm32-unknown-unknown` without any extra work.

By using `tract` instead of the reference implementations we'll be giving up some performance, reliability, and features (e.g. missing model ops) in exchange for long-term maintainability and reduced build complexity. @f0rodo may want to comment on this trade-off, but from an engineering perspective I think it's worth it.

The things we'll need to support new model types:
- An `args` field on models inside the Runefile (done)
- A `format` argument which is either `"tensorflow-lite"`, `"tensorflow"`, or `"onnx"` to specify what type of model this is (default is `"tensorflow-lite"` if not provided) (example)
- Turning `format` into a `mimetype` that gets embedded in the Rune and passed to the runtime when loading a model (conversion, injecting into the generated Rune)
- `ModelFactory` implementations for handling TensorFlow and ONNX models
- Registering those implementations with `BaseImage::with_defaults()` (maybe hide them behind a feature flag like we did with `"tensorflow-lite"` so users can cut down on dependencies, it's up to you)
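The `format` → `mimetype` step above could be sketched roughly like this. Note this is a hypothetical sketch: the function name and the concrete mimetype strings are assumptions for illustration, not necessarily what the Rune codegen actually emits.

```rust
/// Hypothetical mapping from the Runefile's `format` argument to the
/// mimetype embedded in the Rune and handed to the runtime.
/// The mimetype strings below are placeholders, not confirmed values.
fn mimetype_for_format(format: &str) -> Result<&'static str, String> {
    match format {
        "tensorflow-lite" => Ok("application/tflite-model"),
        "tensorflow" => Ok("application/tensorflow-model"),
        "onnx" => Ok("application/onnx-model"),
        other => Err(format!("unknown model format: {other:?}")),
    }
}

fn main() {
    // The Runefile default when no `format` is provided is "tensorflow-lite".
    let default = mimetype_for_format("tensorflow-lite").unwrap();
    println!("default mimetype: {default}");
}
```

A runtime could then dispatch on the mimetype to pick the matching `ModelFactory`, keeping the Runefile syntax decoupled from whichever inference backend ends up handling the model.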