espressif / esp-tflite-micro

TensorFlow Lite Micro for Espressif Chipsets
Apache License 2.0

esp32S3 tflite-microcontroller doesn't support SHAPE (SHAPE operator not supported) (TFMIC-31) #88

Open Criminal-9527 opened 4 months ago

Criminal-9527 commented 4 months ago
#include "main_functions.h"
#include "app_camera_esp.h"

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_log.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "model.h"

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/semphr.h"

#include <esp_heap_caps.h>
#include <esp_timer.h>
#include <esp_log.h>
#include "esp_lcd_panel_ops.h"
#include "esp_lcd_panel_rgb.h"
#include "esp_painter.h"
#include "bsp/esp-bsp.h"
#include "bsp/display.h"

#ifdef __cplusplus
extern "C" {
#endif

extern esp_painter_handle_t painter;
extern esp_lcd_panel_handle_t lcd_panel;

#ifdef __cplusplus
}
#endif

extern SemaphoreHandle_t sem_usb_done;
extern SemaphoreHandle_t sem_xfer_done;

extern uint8_t *lcd_frame_buf[CONFIG_BSP_LCD_RGB_BUFFER_NUMS];
static uint8_t handle_buf_index = 0; // index of the frame buffer currently being processed

namespace {
    const tflite::Model* model = nullptr;
    tflite::MicroInterpreter* interpreter = nullptr;
    TfLiteTensor* input = nullptr;
    TfLiteTensor* output = nullptr;

    constexpr int kTensorArenaSize = 2 * 1024;
    uint8_t tensor_arena[kTensorArenaSize];
}

void setup() {
    model = tflite::GetModel(g_model);
    if (model->version() != TFLITE_SCHEMA_VERSION) {
        MicroPrintf("Model provided is schema version %d not equal to supported "
                    "version %d.", model->version(), TFLITE_SCHEMA_VERSION);
        return;
    }

    static tflite::MicroMutableOpResolver<1> resolver;
    if (resolver.AddFullyConnected() != kTfLiteOk)
        return;

    static tflite::MicroInterpreter static_interpreter(model, resolver, tensor_arena, kTensorArenaSize);
    interpreter = &static_interpreter;

    TfLiteStatus allocate_status = interpreter->AllocateTensors();
    if (allocate_status != kTfLiteOk) {
        MicroPrintf("AllocateTensors() failed");
        return;
    }

    input = interpreter->input(0);
    output = interpreter->output(0);
}

void loop() {
    uint8_t* image = lcd_frame_buf[handle_buf_index];
    if(image != NULL){
        input->data.uint8 = image;

        const int input_dims[] = {1, 640, 480, 3}; // batch size, height, width, channels

        input->dims = (TfLiteIntArray*)input_dims;

        TfLiteStatus invoke_status = interpreter->Invoke();
        if (invoke_status != kTfLiteOk) {
            MicroPrintf("Invoke failed");
            return;
        }

        float landmarks[21][2];
        uint16_t num_landmarks = output->dims->data[0] / 3; // each landmark has 3 values (x, y, z)

        // iterate over all landmarks
        for (int i = 0; i < num_landmarks; ++i) {
            // read the landmark's x and y coordinates from the output tensor,
            // assuming the coordinates are stored contiguously
            float x = output->data.f[i * 2];
            float y = output->data.f[i * 2 + 1];

            // store only the x and y coordinates in the landmarks array
            landmarks[i][0] = x;
            landmarks[i][1] = y;
        }

        // assume a single pixel colour in RGB565 format
        uint32_t pixel_color = 0xF800; // red, 16-bit RGB565

        for (int i = 0; i < 21; ++i) {
            uint16_t screen_x = landmarks[i][0] * 640;
            uint16_t screen_y = landmarks[i][1] * 480;
        }

        handle_buf_index = (handle_buf_index + 1) == CONFIG_BSP_LCD_RGB_BUFFER_NUMS ? 0 : (handle_buf_index + 1);

        // draw the decoded image to the LCD panel
        esp_lcd_panel_draw_bitmap(lcd_panel, 0, 0, 640, 480, lcd_frame_buf[handle_buf_index]);
    }
}

The error is `Didn't find op for builtin opcode "SHAPE"`. I looked into the relevant library code and found that the error is raised when the resolver walks its list of registered operators and does not find SHAPE, which makes `interpreter->AllocateTensors();` fail.

TfLiteStatus GetRegistrationFromOpCode(const OperatorCode* opcode,
                                       const MicroOpResolver& op_resolver,
                                       const TFLMRegistration** registration) {
  TfLiteStatus status = kTfLiteOk;
  *registration = nullptr;
  auto builtin_code = GetBuiltinCode(opcode);

  if (builtin_code > BuiltinOperator_MAX) {
    MicroPrintf("Op builtin_code out of range: %d.", builtin_code);
    status = kTfLiteError;
  } else if (builtin_code != BuiltinOperator_CUSTOM) {
    *registration = op_resolver.FindOp(builtin_code);
    if (*registration == nullptr) {
      MicroPrintf("Didn't find op for builtin opcode '%s'",
                  EnumNameBuiltinOperator(builtin_code));
      status = kTfLiteError;
    }
    // ... (rest of the function omitted)

The error is hit on that last line; FindOp is the lookup into the operator registration table. I tried adding the Shape op with AddShape at this point below (this function really does exist in the library files):

static tflite::MicroMutableOpResolver<1> resolver;
if (resolver.AddFullyConnected() != kTfLiteOk)
    return;

But the resulting error says it treats the op as a custom operation.

Criminal-9527 commented 4 months ago

Sorry, I haven't used GitHub much and am not familiar with the basic workflow, so the problem description may be a bit awkward to read. Thanks for the help.

vikramdattu commented 4 months ago

Hi @Criminal-9527,

The Shape OP is supported. In fact, all the OPs supported by upstream tflite-micro are supported.

Please add all the OPs in the last piece of code:

static tflite::MicroMutableOpResolver<1> resolver; // The number inside <> should be equal to the number of OPs being added.
if (resolver.AddFullyConnected() != kTfLiteOk)
    return;
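For example, a resolver that registers several ops could look like the sketch below (the exact set of Add... calls depends on which ops your model actually uses; Shape, FullyConnected and Softmax are only illustrations):

// Sketch only: register every op the model uses and size the template
// parameter <N> to match the number of Add... calls.
static tflite::MicroMutableOpResolver<3> resolver;
if (resolver.AddShape() != kTfLiteOk)
    return;
if (resolver.AddFullyConnected() != kTfLiteOk)
    return;
if (resolver.AddSoftmax() != kTfLiteOk)
    return;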

If you have confirmed all of this, can you share the logs you see? Sharing the smallest possible example that reproduces the issue would be nice. That should include a few layers of the model embedded as C as well.

Criminal-9527 commented 4 months ago

@vikramdattu I tried writing a minimal example; I set the arena size to 20 * 1024.

#include "tensorflow/lite/c/common.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_log.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensor_Model.h"
#include "model.h"

namespace {
    const tflite::Model* model = nullptr;
    tflite::MicroInterpreter* interpreter = nullptr;
    TfLiteTensor* input = nullptr;
    TfLiteTensor* output = nullptr;

    constexpr int kTensorArenaSize = 20 * 1024;
    uint8_t tensor_arena[kTensorArenaSize];
}

void setup() {
    model = tflite::GetModel(g_model);
    if (model->version() != TFLITE_SCHEMA_VERSION) {
        MicroPrintf("Model provided is schema version %d not equal to supported "
                    "version %d.", model->version(), TFLITE_SCHEMA_VERSION);
        return;
    }

    static tflite::MicroMutableOpResolver<1> resolver;
    if (resolver.AddFullyConnected() != kTfLiteOk)
        return;

    static tflite::MicroInterpreter static_interpreter(model, resolver, tensor_arena, kTensorArenaSize);
    interpreter = &static_interpreter;

    TfLiteStatus allocate_status = interpreter->AllocateTensors();
    if (allocate_status != kTfLiteOk) {
        MicroPrintf("AllocateTensors() failed");
        return;
    }

    input = interpreter->input(0);
    output = interpreter->output(0);
}

void loop() {

}

I use the setup/loop structure because I will need to port the code to Arduino later in the project; I already call them from the main function, so don't worry about that. The log is below:

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3820,len:0x1918
load:0x403c9700,len:0x4
load:0x403c9704,len:0xe5c
load:0x403cc700,len:0x3028
entry 0x403c993c
I (27) boot: ESP-IDF v5.2.2-dirty 2nd stage bootloader
I (27) boot: compile time Jul  8 2024 14:37:31
I (27) boot: Multicore bootloader
I (30) boot: chip revision: v0.2
I (34) qio_mode: Enabling QIO for flash chip GD
I (39) boot.esp32s3: Boot SPI Speed : 80MHz
I (44) boot.esp32s3: SPI Mode       : QIO
I (49) boot.esp32s3: SPI Flash Size : 2MB
I (54) boot: Enabling RNG early entropy source...
I (59) boot: Partition Table:
I (63) boot: ## Label            Usage          Type ST Offset   Length
I (70) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (77) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (85) boot:  2 factory          factory app      00 00 00010000 00100000
I (92) boot: End of partition table
I (97) esp_image: segment 0: paddr=00010020 vaddr=3c020020 size=1ae20h (110112) map
I (122) esp_image: segment 1: paddr=0002ae48 vaddr=3fc91700 size=02808h ( 10248) load
I (124) esp_image: segment 2: paddr=0002d658 vaddr=40374000 size=029c0h ( 10688) load
I (130) esp_image: segment 3: paddr=00030020 vaddr=42000020 size=1f698h (128664) map
I (155) esp_image: segment 4: paddr=0004f6c0 vaddr=403769c0 size=0ace0h ( 44256) load
I (170) boot: Loaded app from partition at offset 0x10000
I (170) boot: Disabling RNG early entropy source...
I (181) cpu_start: Multicore app
I (191) cpu_start: Pro cpu start user code
I (191) cpu_start: cpu freq: 160000000 Hz
I (191) cpu_start: Application information:
I (194) cpu_start: Project name:     tflite_no_micro
I (199) cpu_start: App version:      v5.2.2-dirty
I (205) cpu_start: Compile time:     Jul  8 2024 14:50:48
I (211) cpu_start: ELF file SHA256:  7aa892087...
I (216) cpu_start: ESP-IDF:          v5.2.2-dirty
I (221) cpu_start: Min chip rev:     v0.0
I (226) cpu_start: Max chip rev:     v0.99
I (231) cpu_start: Chip rev:         v0.2
I (236) heap_init: Initializing. RAM available for dynamic allocation:
I (243) heap_init: At 3FC99968 len 0004FDA8 (319 KiB): RAM
I (249) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (255) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (261) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (268) spi_flash: detected chip: gd
I (272) spi_flash: flash io: qio
W (276) spi_flash: Detected size(16384k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (289) sleep: Configure to isolate all GPIO pins in sleep state
I (296) sleep: Enable automatic switching of GPIO sleep configuration
I (303) main_task: Started on CPU0
I (313) main_task: Calling app_main()
Didn't find op for builtin opcode 'SHAPE'
Failed to get registration from op code SHAPE

AllocateTensors() failed

One more thing, about this part:

static tflite::MicroMutableOpResolver<1> resolver; // The number inside <> should be equal to the number of OPs being added.
if (resolver.AddFullyConnected() != kTfLiteOk)
    return;

I have already tried many combinations, including making the number inside <> very large and adding the Shape op manually, but none of them worked. Among them I tried the following:

static tflite::MicroMutableOpResolver<2> resolver;
if (resolver.AddShape() != kTfLiteOk)
    return;
if (resolver.AddFullyConnected() != kTfLiteOk)
    return;

The error was: Failed to get registration from op code CUSTOM, followed by AllocateTensors() failed.

vikramdattu commented 4 months ago

Okay, looks like this is working. The CUSTOM issue appears because your model is using some OP that is not supported by tflite-micro. Can you make sure that particular OP is not used?
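A quick way to see which builtin and custom op codes the model actually references is to walk the flatbuffer's operator_codes table. A minimal sketch, assuming g_model is the model array from model.h as in the examples above:

#include "tensorflow/lite/micro/micro_log.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/schema/schema_utils.h"
#include "model.h"

// Print every operator code referenced by the model so unsupported ones
// (e.g. Flex/custom ops) can be spotted before AllocateTensors() runs.
void print_model_ops() {
    const tflite::Model* m = tflite::GetModel(g_model);
    const auto* op_codes = m->operator_codes();
    if (op_codes == nullptr) return;
    for (uint32_t i = 0; i < op_codes->size(); ++i) {
        const tflite::OperatorCode* oc = op_codes->Get(i);
        tflite::BuiltinOperator builtin = tflite::GetBuiltinCode(oc);
        if (builtin == tflite::BuiltinOperator_CUSTOM) {
            // Custom and Flex ops carry their name in custom_code.
            MicroPrintf("op %d: CUSTOM (%s)", (int)i,
                        oc->custom_code() ? oc->custom_code()->c_str() : "?");
        } else {
            MicroPrintf("op %d: %s", (int)i, tflite::EnumNameBuiltinOperator(builtin));
        }
    }
}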

Criminal-9527 commented 4 months ago

@vikramdattu I inspected the model structure with netron; part of it is shown in the attached screenshot. Can I remove this operation (I'm not sure myself)? And if the SHAPE operation must be used, could I use standard tflite instead of tflite-micro? (My board is the esp32-s3-lcd-ev-board2 v1.5, which only has 32 MB of memory.)

vikramdattu commented 4 months ago

Hi @Criminal-9527, the FlexTensorListReverse and FlexTensorListStack ops among these are not supported by tflite-micro.
Did you quantize the model and convert it to an int8 model?

If you want to use these OPs, you will need to implement them yourself and register them. reference
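For illustration only, a rough sketch of how a self-implemented op could be registered with the resolver. Register_TENSOR_LIST_REVERSE() is a hypothetical function you would write yourself following the custom-op reference above, and the name string must match the custom_code stored in the model:

// Hypothetical: Register_TENSOR_LIST_REVERSE() returns a TFLMRegistration*
// that you implement yourself; "FlexTensorListReverse" must match the
// custom_code of that op in the flatbuffer.
static tflite::MicroMutableOpResolver<3> resolver;
if (resolver.AddShape() != kTfLiteOk)
    return;
if (resolver.AddFullyConnected() != kTfLiteOk)
    return;
if (resolver.AddCustom("FlexTensorListReverse",
                       Register_TENSOR_LIST_REVERSE()) != kTfLiteOk)
    return;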