espressif / esp-dl

Espressif deep-learning library for AIoT applications
MIT License

Question about memory usage of models that are built with the ESP-DL Quantization Tool (AIV-686) #158

Open MichelBerg opened 3 months ago

MichelBerg commented 3 months ago

I am currently working on a comparison between TFLM (TensorFlow Lite Micro) and ESP-DL. Besides speed, I want to compare memory consumption. In TFLM we have to create a tensor arena with a specific tensorArenaSize, but in ESP-DL we do not specify anything like that.

My question is how to estimate the memory usage of models built with the ESP-DL Quantization Tool. We define the model layer by layer and use the call()/forward() function. If I understand correctly, the output memory of each previous layer is freed once the next layer has been computed, as shown in this example.

From the example ESP-DL Quantization Tool Tutorial:

class MNIST : public Model<int16_t>
{
    // ellipsis member variables
    // ellipsis constructor function
    // ellipsis build(...)

    void call(Tensor<int16_t> &input)
    {
        this->l1.call(input);
        input.free_element();

        this->l2.call(this->l1.get_output());
        this->l1.get_output().free_element();

        this->l3.call(this->l2.get_output());
        this->l2.get_output().free_element();

        this->l4.call(this->l3.get_output());
        this->l3.get_output().free_element();
    }
};

Is it fair to say that the peak memory usage consists of the model input, the model output, and the layer with the largest number of parameters?
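The allocate-then-free pattern in call() can be sketched as a small simulation. This is only an illustration under stated assumptions: it assumes each layer allocates its output tensor inside call(), that the caller frees the layer's input right afterwards, and that parameters stay resident and are counted separately. The function name peak_activation_bytes and the byte sizes passed to it are hypothetical, not part of ESP-DL.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an ESP-DL API.
// sizes[0] is the model input in bytes; sizes[i] is the output of layer i in bytes.
// Assumption: layer i allocates its output while its input is still alive,
// and the input is released (free_element()) only after the layer has run.
inline std::size_t peak_activation_bytes(const std::vector<std::size_t>& sizes) {
    std::size_t live = sizes.empty() ? 0 : sizes[0];  // input tensor is alive at the start
    std::size_t peak = live;
    for (std::size_t i = 1; i < sizes.size(); ++i) {
        live += sizes[i];             // layer i allocates its output
        peak = std::max(peak, live);  // input and output coexist at this point
        live -= sizes[i - 1];         // caller frees the layer's input
    }
    return peak;
}
```

Under these assumptions the activation peak is the largest sum of one layer's input and output, not just the model input alone; the parameter memory would have to be added on top of it.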

Here is an example that I am thinking of:

Data: RGB image data, 96x96 pixels
Model type: int16 quantized

Implementation of my call() (basically the same as in the example):

    void call(Tensor<int16_t> &input)
    {
        this->l1.call(input);
        input.free_element();
        this->l2.call(this->l1.get_output());
        this->l1.get_output().free_element();
        this->l3.call(this->l2.get_output());
        this->l2.get_output().free_element();
        this->l4.call(this->l3.get_output());
        this->l3.get_output().free_element();
        this->l5.call(this->l4.get_output());
        this->l4.get_output().free_element();
        this->l6.call(this->l5.get_output());
        this->l5.get_output().free_element();
        this->l7.call(this->l6.get_output());
        this->l6.get_output().free_element();
    }

Model summary (I estimate dense_4 to be the largest layer): [image]

Calculation approach for peak memory usage:
Input data: 1x96x96x3 => 27648 values, * 2 bytes (int16 model type) => 55296 bytes
Fully connected layer: 100384 parameters, * 2 bytes (int16 model type) => 200768 bytes
Output data: 1x6 => 6 values (6 classes), * 4 bytes (float output of the softmax layer) => 24 bytes

That makes a total of 256088 bytes (about 256 kB)?

Is this a valid approach?
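For reference, the arithmetic of this estimate can be written down as a compile-time check. This is a minimal sketch that only reproduces the three terms from the post; the helper bytes_for and the constant names are made up for illustration, and it assumes a 4-byte float.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed for element_count values of type T.
template <typename T>
constexpr std::size_t bytes_for(std::size_t element_count) {
    return element_count * sizeof(T);
}

// Figures taken from the post: 1x96x96x3 int16 input, 100384 int16 parameters,
// 1x6 float output.
constexpr std::size_t kInputBytes  = bytes_for<std::int16_t>(1 * 96 * 96 * 3);
constexpr std::size_t kDenseBytes  = bytes_for<std::int16_t>(100384);
constexpr std::size_t kOutputBytes = bytes_for<float>(6);
constexpr std::size_t kTotalBytes  = kInputBytes + kDenseBytes + kOutputBytes;

// The build fails if the numbers do not match the manual calculation.
static_assert(kInputBytes == 55296, "input tensor size mismatch");
static_assert(kDenseBytes == 200768, "dense layer parameter size mismatch");
static_assert(kTotalBytes == 256088, "total estimate mismatch");
```

The sketch counts only the input, the largest layer's parameters, and the output; intermediate activations and the parameters of the other layers are not part of it.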

sun-xiangyu commented 3 months ago

Hi @MichelBerg, you can use the heap_caps_get_free_size function to get the free memory size.

heap_caps_get_free_size(MALLOC_CAP_8BIT): returns the free size of all byte-addressable memory, including PSRAM and internal RAM
heap_caps_get_free_size(MALLOC_CAP_INTERNAL): returns the free size of internal RAM only
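One way to turn these readings into a peak figure is to compare the free size before inference with the heap's low-water mark afterwards. The sketch below is an assumption-laden illustration, not an ESP-DL facility: peak_heap_used is a hypothetical helper that takes the probes as callables so it can be tried off-target. On an ESP32 the probes could wrap heap_caps_get_free_size and heap_caps_get_minimum_free_size from esp_heap_caps.h, as shown in the comment; the model and input objects there are placeholders.

```cpp
#include <cstddef>

// Hypothetical helper: estimates the peak heap used while `work` runs.
//   free_now      - returns the currently free heap in bytes
//   min_free_ever - returns the lowest free heap seen so far (low-water mark)
//   work          - the code to measure, e.g. one inference
// Assumption: the low-water mark is reached inside `work`. If the heap was
// already lower at some earlier point, the result overestimates the peak.
template <typename FreeFn, typename MinFreeFn, typename Work>
std::size_t peak_heap_used(FreeFn free_now, MinFreeFn min_free_ever, Work work) {
    const std::size_t free_before = free_now();
    work();
    const std::size_t low_water = min_free_ever();
    return free_before > low_water ? free_before - low_water : 0;
}

// Possible use on an ESP32 (model and input are placeholders):
//   #include "esp_heap_caps.h"
//   std::size_t peak = peak_heap_used(
//       [] { return heap_caps_get_free_size(MALLOC_CAP_INTERNAL); },
//       [] { return heap_caps_get_minimum_free_size(MALLOC_CAP_INTERNAL); },
//       [&] { model.call(input); });
```

Taking the first reading before the model is constructed would also include the parameter memory in the result; taking it right before call() measures only what inference allocates.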