panickal-xmos closed this 3 months ago
This is a great addition to the examples, and another good approach to try for optimizations, but this is not the scenario I had in mind in #910.
What I meant with splitting the models was actually splitting the model graph into two smaller models, each with its own weights and arenas. Each tile would then run a model, and the output of one would be the input of the other.
So if I had a `model.tflite` with nodes A -> B -> C -> D, we would split it into `model_1.tflite` containing A -> B, and `model_2.tflite` containing C -> D.
We would compile each model separately, generating `model_1.tflite.cpp/.h` from `model_1.tflite` and `model_2.tflite.cpp/.h` from `model_2.tflite`. tile[0] would run model_1 and tile[1] would run model_2. Lastly, we would somehow use a channel to connect B with C, effectively reconstructing the original model but using both tiles.
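The two-tile pipeline above can be simulated in plain Python, with a thread standing in for each tile and a queue standing in for the xcore channel (all names here are illustrative placeholders, not the generated inference API):

```python
import queue
import threading

def run_model_1(out_chan):
    # Placeholder for running model_1 (ops A -> B) on tile[0].
    activation = "B_output"     # illustrative intermediate tensor
    out_chan.put(activation)    # send B's output over the "channel"

def run_model_2(in_chan, results):
    activation = in_chan.get()  # receive B's output on tile[1]
    # Placeholder for running model_2 (ops C -> D) on that input.
    results.append(activation + "->C->D")

chan = queue.Queue(maxsize=1)   # stands in for a channel between tiles
results = []
t0 = threading.Thread(target=run_model_1, args=(chan,))
t1 = threading.Thread(target=run_model_2, args=(chan, results))
t0.start(); t1.start()
t0.join(); t1.join()
# results == ["B_output->C->D"]
```

On hardware the transfer would be a channel transaction of the intermediate activation tensor rather than a Python queue, but the dataflow (model_1's output feeding model_2's input) is the same.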
The models we have right now have more than 1 MB of weights, so it wouldn't be feasible to place the weights in tile[0] for our current models. However, we might explore different architectures that require fewer parameters, which would allow us to try this out, so in the end this example is still very valuable :) Thank you!
Closes https://github.com/xmos/ai_tools/issues/910