Unity-Technologies / barracuda-release

Out of memory after running the network multiple times, even when disposing all the tensors #323

Open santelelle opened 1 year ago

santelelle commented 1 year ago

Barracuda version: 3.0
Unity version: 2022.2.14f1

Hi all,

I am testing Barracuda on computer vision tasks and I am currently struggling with memory management.

I generated two ONNX models from PyTorch, each with a single convolutional layer (one with batch size 1 and one with batch size 20; width 640, height 480, channels 3).

1) Why am I able to run both models on a generated tensor with any batch size? I would expect each model to run only with the batch size it was exported with. I can also run them on images of different sizes (e.g. 1280x960).

2) Running on Android (Samsung S22 Ultra, Exynos): if I run the model multiple times sequentially with batch size 1, I encounter no memory issues. But if I run either model on an input tensor with batch size 20, memory consumption keeps climbing until the app crashes (even with the model that was designed for batch size 20). Am I not managing the memory correctly?

Attached are the ONNX models (with a .txt suffix appended to the file names) and a screenshot of the memory reported by the Unity inspector.

The interesting part of the code is the following:

void Start() {
    _runtimeModel = ModelLoader.Load(modelAsset);
    // ? Create a model builder to modify the m_RunTimeModel
    ModelBuilder modelBuilder = new(_runtimeModel);
    // var workerType = WorkerFactory.Type.CSharpBurst;  // CPU
    var workerType = WorkerFactory.Type.ComputePrecompiled; // GPU
    _worker = WorkerFactory.CreateWorker(workerType, modelBuilder.model);
}

public void RunNetwork() {
    Tensor inputTensor = new(_batchSize, 480, 640, 3);
    Debug.Log($"inputTensor: {inputTensor.shape}");

    Debug.Log("running network");
    _worker.Execute(inputTensor);
    // Tensor outputTensor = _worker.PeekOutput("output");
    Tensor outputTensor = _worker.CopyOutput("output");

    inputTensor.Dispose();
    // inputTensor.FlushCache(false);
    outputTensor.Dispose();
    // outputTensor.FlushCache(false);
}

public void OnDestroy() {
    _worker?.Dispose();
}

void OnDisable() {
    _worker?.Dispose();
}
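
For a sense of scale, here is a minimal sketch of how large the input tensor allocated in RunNetwork is for each batch size. It assumes dense 32-bit float storage, which is an assumption: the layout the GPU worker actually uses may differ, and the output tensor and any intermediate buffers are not counted. `TensorFootprint` and `Bytes` are hypothetical helper names, not part of Barracuda.

```csharp
using System;

public static class TensorFootprint
{
    // Bytes for a dense NHWC tensor, ASSUMING 4 bytes (float32) per element.
    public static long Bytes(int n, int h, int w, int c)
    {
        return (long)n * h * w * c * sizeof(float);
    }

    public static void Main()
    {
        // Shapes taken from RunNetwork: (batch, 480, 640, 3)
        Console.WriteLine(Bytes(1, 480, 640, 3));   // 3686400  (about 3.5 MiB)
        Console.WriteLine(Bytes(20, 480, 640, 3));  // 73728000 (about 70.3 MiB)
    }
}
```

Under that assumption, each batch-size-20 call allocates roughly 70 MiB for the input alone, so any allocation that is not released between calls adds up about 20 times faster than with batch size 1.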

Attachments: image (memory screenshot), bs20_single_layer_640.onnx.txt, single_layer_cnn_640.onnx.txt