Hi all,

I am doing some tests of Barracuda on computer-vision tasks, and I am currently struggling with memory management.
I exported two ONNX models from PyTorch, each with a single convolutional layer (batch size 1 and batch size 20; width 640, height 480, 3 channels).
1) Why am I able to run both models on a generated tensor with any batch size? I expected each model to accept only the batch size it was exported with. I can also run them on images of different sizes (e.g. 1280x960).
2) Running on Android (Samsung S22 Ultra, Exynos): if I run the model multiple times sequentially with batch size 1, I encounter no memory issues, but if I run either model on an input tensor with batch size 20, memory consumption keeps growing until the app crashes (even for the model that was exported for batch size 20). Am I not managing the memory correctly? Attached are the ONNX models (with a .txt appended to the file name) and a screenshot of the memory readings from the Unity inspector.
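Regarding question 1, one way to see why any batch size is accepted is to log the input shapes the imported model actually declares. Below is a minimal sketch (the component and field names are made up for illustration; it assumes the Barracuda Model.inputs API):

```csharp
using Unity.Barracuda;
using UnityEngine;

// Hypothetical helper component: logs the input shapes declared by an
// imported ONNX model. In Barracuda, a dynamic dimension from the ONNX
// export shows up as a negative value in the declared shape, which would
// explain why the model accepts any batch size at runtime.
public class InputShapeLogger : MonoBehaviour
{
    public NNModel modelAsset;

    void Start()
    {
        Model model = ModelLoader.Load(modelAsset);
        foreach (Model.Input input in model.inputs)
        {
            // Prints each declared input name and its raw shape values
            Debug.Log($"{input.name}: ({string.Join(", ", input.shape)})");
        }
    }
}
```

If the batch dimension prints as dynamic, the two exports would effectively be the same network and the batch size would only be fixed at execution time.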
The interesting part of the code is the following:
void Start()
{
    _runtimeModel = ModelLoader.Load(modelAsset);
    // Create a ModelBuilder to modify _runtimeModel
    ModelBuilder modelBuilder = new(_runtimeModel);
    // var workerType = WorkerFactory.Type.CSharpBurst; // CPU
    var workerType = WorkerFactory.Type.ComputePrecompiled; // GPU
    _worker = WorkerFactory.CreateWorker(workerType, modelBuilder.model);
}

public void RunNetwork()
{
    Tensor inputTensor = new(_batchSize, 480, 640, 3);
    Debug.Log($"inputTensor: {inputTensor.shape}");
    Debug.Log("running network");
    _worker.Execute(inputTensor);
    // Tensor outputTensor = _worker.PeekOutput("output");
    Tensor outputTensor = _worker.CopyOutput("output");
    inputTensor.Dispose();
    // inputTensor.FlushCache(false);
    outputTensor.Dispose();
    // outputTensor.FlushCache(false);
}

public void OnDestroy()
{
    _worker?.Dispose();
}

void OnDisable()
{
    _worker?.Dispose();
}
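For reference, here is a sketch of the same RunNetwork body rewritten with C# using declarations, in case an exception path could be skipping the Dispose calls (not a fix I have verified on device, just an alternative disposal pattern):

```csharp
public void RunNetwork()
{
    // 'using' guarantees Dispose is called when the variable goes out of
    // scope, even if Execute or CopyOutput throws (Tensor is IDisposable)
    using Tensor inputTensor = new(_batchSize, 480, 640, 3);
    _worker.Execute(inputTensor);
    // CopyOutput transfers ownership to the caller, so it must be disposed;
    // PeekOutput would instead return a tensor owned by the worker
    using Tensor outputTensor = _worker.CopyOutput("output");
    Debug.Log($"output: {outputTensor.shape}");
}
```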
Barracuda version: 3.0, Unity version: 2022.2.14f1
Attachments: bs20_single_layer_640.onnx.txt, single_layer_cnn_640.onnx.txt