espressif / esp-tflite-micro

TensorFlow Lite Micro for Espressif Chipsets
Apache License 2.0

Calling `interpreter->AllocateTensors()` fails #30

Open jiwenfei opened 1 year ago

jiwenfei commented 1 year ago

I want to run the MoveNet model on my ESP32-S3-Korvo-2 board.

First, I converted the pretrained .tflite file to a .cc file using this code:

MODEL_TFLITE='/content/model.tflite'
MODEL_TFLITE_MICRO='/content/movenetmodel.cc'
# Install xxd if it is not available
# !apt-get update && apt-get -qq install xxd
# Convert to a C source file, i.e, a TensorFlow Lite for Microcontrollers model
!xxd -i {MODEL_TFLITE} > {MODEL_TFLITE_MICRO}
# Update variable names
REPLACE_TEXT = MODEL_TFLITE.replace('/', '_').replace('.', '_')
!sed -i 's/'{REPLACE_TEXT}'/g_model/g' {MODEL_TFLITE_MICRO}
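
For readers without `xxd`, the same conversion can be sketched in plain Python. This is only an illustration: the function name `bytes_to_c_array` is hypothetical, and the `alignas(16) const` qualifiers are an assumption based on how tflite-micro example model files are usually declared, not something `xxd -i` emits.

```python
def bytes_to_c_array(data: bytes, var_name: str = "g_model", per_line: int = 12) -> str:
    """Render raw bytes as C++ source text, similar in spirit to `xxd -i`."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    # `alignas(16) const` is an assumption, not part of xxd's output.
    return (
        f"alignas(16) const unsigned char {var_name}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned int {var_name}_len = {len(data)};\n"
    )


# Example with a few dummy bytes instead of a real .tflite file
source = bytes_to_c_array(b"\x1c\x00\x00\x00TFL3")
print(source)
```

For a real model, read the file with `open(path, "rb").read()` and write the returned string to the `.cc` file.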

My code is:

extern "C" void app_main() {
  printf("start !!\n");

  const tflite::Model* model = ::tflite::GetModel(g_modelerererer);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    printf("load model error!\n");
    return;  // stop here: the model schema version does not match
  }
  printf("model load success !!\n");
  tflite::AllOpsResolver resolver;
  const int tensor_arena_size = 2028784;
  static uint8_t *tensor_arena;
  if (tensor_arena == NULL) {
    // Allocate the tensor arena from external PSRAM
    tensor_arena = (uint8_t *) heap_caps_malloc(tensor_arena_size, MALLOC_CAP_SPIRAM);
  }
  if (tensor_arena == NULL) {
    printf("Couldn't allocate memory of %d bytes\n", tensor_arena_size);
    return;
  }
  printf("array init success !!\n");
  tflite::MicroInterpreter* interpreter = nullptr;
  static tflite::MicroInterpreter static_interpreter(model, resolver, tensor_arena,
                                     tensor_arena_size);
  interpreter = &static_interpreter;
  printf("interpreter init success !!\n");
  TfLiteStatus allocate_status = interpreter->AllocateTensors();
  if (allocate_status != kTfLiteOk) {
    MicroPrintf("AllocateTensors() failed");
    return;
  }
  printf("tensor init success !!\n");
  TfLiteTensor* input = interpreter->input(0);
  printf("input->dims->size=%d,input->dims->data={%d,%d}\n",input->dims->size,
        input->dims->data[0],input->dims->data[1]);
  printf("check success!\n");

}

and the monitor shows:

I (0) cpu_start: App cpu up.
I (1050) spiram: SPI SRAM memory test OK
I (1059) cpu_start: Pro cpu start user code
I (1059) cpu_start: cpu freq: 240000000
I (1059) cpu_start: Application information:
I (1062) cpu_start: Project name:     person_detection
I (1068) cpu_start: App version:      72f2c99-dirty
I (1073) cpu_start: Compile time:     Jan 31 2023 10:52:09
I (1079) cpu_start: ELF file SHA256:  d6d49cefd89cf54f...
I (1086) cpu_start: ESP-IDF:          v4.4.3-dirty
I (1091) heap_init: Initializing. RAM available for dynamic allocation:
I (1098) heap_init: At 3FC94220 len 000554F0 (341 KiB): D/IRAM
I (1105) heap_init: At 3FCE9710 len 00005724 (21 KiB): STACK/DRAM
I (1112) heap_init: At 600FE000 len 00002000 (8 KiB): RTCRAM
I (1118) spiram: Adding pool of 8192K of external SPI memory to heap allocator
I (1126) spi_flash: detected chip: gd
I (1130) spi_flash: flash io: qio
I (1135) sleep: Configure to isolate all GPIO pins in sleep state
I (1141) sleep: Enable automatic switching of GPIO sleep configuration
I (1148) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
start !!
model load success !!
array init success !!
interpreter init success !!
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x8 (TG1WDT_SYS_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x42002620
0x42002620: panic_handler at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/port/panic_handler.c:148 (discriminator 3)

SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3808,len:0x17c0
load:0x403c9700,len:0xe64
load:0x403cc700,len:0x2fec
entry 0x403c9980
I (29) boot: ESP-IDF v4.4.3-dirty 2nd stage bootloader
I (29) boot: compile time 10:56:11
I (29) boot: chip revision: 0
I (31) qio_mode: Enabling default flash chip QIO
I (37) boot.esp32s3: Boot SPI Speed : 80MHz
I (41) boot.esp32s3: SPI Mode       : QIO
I (46) boot.esp32s3: SPI Flash Size : 16MB
W (51) boot.esp32s3: PRO CPU has been reset by WDT.
W (56) boot.esp32s3: APP CPU has been reset by WDT.
I (62) boot: Enabling RNG early entropy source...
I (67) boot: Partition Table:
I (71) boot: ## Label            Usage          Type ST Offset   Length
I (78) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (86) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (93) boot:  2 factory          factory app      00 00 00010000 003f3c50
I (101) boot: End of partition table
I (105) esp_image: segment 0: paddr=00010020 vaddr=3c080020 size=2d61f0h (2974192) map
I (564) esp_image: segment 1: paddr=002e6218 vaddr=3fc90c90 size=029e8h ( 10728) load
I (566) esp_image: segment 2: paddr=002e8c08 vaddr=40374000 size=07410h ( 29712) load
I (576) esp_image: segment 3: paddr=002f0020 vaddr=42000020 size=7a8b4h (501940) map
I (654) esp_image: segment 4: paddr=0036a8dc vaddr=4037b410 size=05878h ( 22648) load
I (659) esp_image: segment 5: paddr=0037015c vaddr=50000000 size=00010h (    16) load
I (665) boot: Loaded app from partition at offset 0x10000
I (666) boot: Disabling RNG early entropy source...
I (683) opi psram: vendor id : 0x0d (AP)
I (683) opi psram: dev id    : 0x02 (generation 3)
I (683) opi psram: density   : 0x03 (64 Mbit)
I (687) opi psram: good-die  : 0x01 (Pass)
I (692) opi psram: Latency   : 0x01 (Fixed)
I (697) opi psram: VCC       : 0x01 (3V)
I (701) opi psram: SRF       : 0x01 (Fast Refresh)
I (707) opi psram: BurstType : 0x01 (Hybrid Wrap)
I (712) opi psram: BurstLen  : 0x01 (32 Byte)
I (717) opi psram: Readlatency  : 0x02 (10 cycles@Fixed)
I (723) opi psram: DriveStrength: 0x00 (1/1)
W (728) PSRAM: DO NOT USE FOR MASS PRODUCTION! Timing parameters will be updated in future IDF version.
I (739) spiram: Found 64MBit SPI RAM device
I (743) spiram: SPI RAM mode: sram 80m
I (747) spiram: PSRAM initialized, cache is in normal (1-core) mode.
I (754) cpu_start: Pro cpu up.
I (758) cpu_start: Starting app cpu, entry point is 0x4037527c
0x4037527c: call_start_cpu1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/port/cpu_start.c:148

Thanks!

vikramdattu commented 1 year ago

Hi @jiwenfei

  1. Do the existing examples run fine for you?
  2. Can you please enable the gdb stub and share the backtrace output? We need to know exactly where the crash happens.
  3. Have you set up your op_resolver properly, with its size equal to the number of ops? E.g.,

https://github.com/espressif/tflite-micro-esp-examples/blob/72f2c9914561497cbd6ab5d951a83e2462840596/examples/person_detection/main/main_functions.cc#L85-L90

jiwenfei commented 1 year ago

@vikramdattu The existing examples run fine. For the op resolver, I used `tflite::AllOpsResolver resolver;` to register the ops.
vikramdattu commented 1 year ago

Hi @jiwenfei can you try the following:

  1. Increase the main task stack size in menuconfig to some large number (by default it is 3K).
  2. Keep the gdb stub enabled to catch the reset point.
  3. Alternatively, you can offload the function to a separate task with a larger stack, like this: https://github.com/espressif/tflite-micro-esp-examples/blob/72f2c9914561497cbd6ab5d951a83e2462840596/examples/person_detection/main/main.cc#L42
jiwenfei commented 1 year ago

@vikramdattu the gdb stub output is:

model load success !!
array init success !!
interpreter init success !!
Guru Meditation Error: Core  / panic'ed (Cache disabled but cached memory region accessed).
Write back error occurred while dcache tries to write back to flash

Core  0 register dump:
PC      : 0x4205f3f8  PS      : 0x00060134  A0      : 0x820563ab  A1      : 0x3fc96450
0x4205f3f8: unsigned short flatbuffers::ReadScalar<unsigned short>(void const*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/third_party/flatbuffers/include/flatbuffers/base.h:428
 (inlined by) flatbuffers::Table::GetOptionalFieldOffset(unsigned short) const at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/third_party/flatbuffers/include/flatbuffers/table.h:42
 (inlined by) flatbuffers::Vector<flatbuffers::Offset<tflite::Tensor> > const* flatbuffers::Table::GetPointer<flatbuffers::Vector<flatbuffers::Offset<tflite::Tensor> > const*>(unsigned short) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/third_party/flatbuffers/include/flatbuffers/table.h:51
 (inlined by) flatbuffers::Vector<flatbuffers::Offset<tflite::Tensor> > const* flatbuffers::Table::GetPointer<flatbuffers::Vector<flatbuffers::Offset<tflite::Tensor> > const*>(unsigned short) const at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/third_party/flatbuffers/include/flatbuffers/table.h:57
 (inlined by) tflite::SubGraph::tensors() const at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/schema/schema_generated.h:12720
 (inlined by) tflite::AllocationInfoBuilder::InitializeAllocationInfo(int const*, tflite::SubgraphAllocations*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_allocation_info.cc:192

A2      : 0x3c2c9298  A3      : 0x3d802afc  A4      : 0xffffffff  A5      : 0x00000014
A6      : 0x3d9efcfc  A7      : 0x0000013b  A8      : 0x3c2c928a  A9      : 0x00000000
A10     : 0x0000000e  A11     : 0x0000000e  A12     : 0x00000012  A13     : 0x00000017
A14     : 0x00000001  A15     : 0x0000000e  SAR     : 0x00000001  EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000  LBEG    : 0x420561e9  LEND    : 0x420561f0  LCOUNT  : 0x00000000
0x420561e9: tflite::TfLiteEvalTensorByteLength(TfLiteEvalTensor const*, unsigned int*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/memory_helpers.cc:128 (discriminator 2)

0x420561f0: tflite::TfLiteEvalTensorByteLength(TfLiteEvalTensor const*, unsigned int*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/memory_helpers.cc:132

Backtrace: 0x4205f3f5:0x3fc96450 0x420563a8:0x3fc964a0 0x42056ea6:0x3fc96500 0x420091e7:0x3fc96530 0x420062c6:0x3fc96560 0x42078642:0x3fc97ba0
0x4205f3f5: tflite::AllocationInfoBuilder::InitializeAllocationInfo(int const*, tflite::SubgraphAllocations*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_allocation_info.cc:199

0x420563a8: tflite::MicroAllocator::CommitStaticMemoryPlan(tflite::Model const*, tflite::SubgraphAllocations*, tflite::ScratchBufferHandle*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_allocator.cc:822

0x42056ea6: tflite::MicroAllocator::FinishModelAllocation(tflite::Model const*, tflite::SubgraphAllocations*, tflite::ScratchBufferHandle**) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_allocator.cc:503

0x420091e7: tflite::MicroInterpreter::AllocateTensors() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:217
 (inlined by) tflite::MicroInterpreter::AllocateTensors() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:182

0x420062c6: app_main at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:241301

0x42078642: main_task at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/port_common.c:141
jiwenfei commented 1 year ago

and the core dump is

Initiating core dump!
I (1241) esp_core_dump_uart: Press Enter to print core dump to UART...
I (1248) esp_core_dump_uart: Print core dump to uart...
Core dump started (further output muted)
Received  17 kB...
Core dump finished!
espcoredump.py v0.4-dev
===============================================================
==================== ESP32 CORE DUMP START ====================

Crashed task handle: 0x3fceb088, name: 'tf_main', GDB name: 'process 1070510216'

================== CURRENT THREAD REGISTERS ===================
exccause       0x1d (StoreProhibitedCause)
excvaddr       0x3780d
epc1           0x420675ef
epc2           0x0
epc3           0x0
epc4           0x0
epc5           0x0
epc6           0x0
eps2           0x0
eps3           0x0
eps4           0x0
eps5           0x0
eps6           0x0
pc             0x4037d70a          0x4037d70a <tlsf_malloc+270>
lbeg           0x400556d5          1074091733
lend           0x400556e5          1074091749
lcount         0xfffffffd          4294967293
sar            0x9                 9
ps             0x60e23             396835
threadptr      <unavailable>
br             <unavailable>
scompare1      <unavailable>
acclo          <unavailable>
acchi          <unavailable>
m0             <unavailable>
m1             <unavailable>
m2             <unavailable>
m3             <unavailable>
expstate       <unavailable>
f64r_lo        <unavailable>
f64r_hi        <unavailable>
f64s           <unavailable>
fcr            <unavailable>
fsr            <unavailable>
a0             0x8037df1f          -2143822049
a1             0x3fce9610          1070503440
a2             0x3fc90f58          1070141272
a3             0x37801             227329
a4             0x1e                30
a5             0x80                128
a6             0xfffffff8          -8
a7             0x1                 1
a8             0x3c350b9c          1010109340
a9             0x28000             163840
a10            0x206b6c62          543911010
a11            0x5c                92
a12            0x28002             163842
a13            0x3fc90fd0          1070141392
a14            0x0                 0
a15            0x3fc90f50          1070141264

==================== CURRENT THREAD STACK =====================
#0  remove_free_block (sl=30, fl=0, block=0x3c350b9c <OCODE>, control=0x3fc90f58 <lock_init_spinlock>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:213
#1  block_locate_free (size=92, control=0x3fc90f58 <lock_init_spinlock>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:448
#2  tlsf_malloc (tlsf=0x3fc90f58 <lock_init_spinlock>, size=<optimized out>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:779
#3  0x4037df1f in multi_heap_malloc_impl (heap=0x3fce9710, size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/multi_heap.c:197
#4  0x40375a05 in heap_caps_malloc_base (size=92, caps=6144) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:154
#5  0x40375a3d in heap_caps_malloc_base (caps=6144, size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:177
#6  heap_caps_malloc (size=92, caps=6144) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:174
#7  0x40375a6c in heap_caps_malloc_default (size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:199
#8  0x4037e378 in malloc (size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/heap.c:24
#9  0x403797ef in xQueueGenericCreate (uxQueueLength=1, uxItemSize=0, ucQueueType=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/queue.c:447
#10 0x40379ad7 in xQueueCreateMutex (ucQueueType=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/queue.c:564
#11 0x403768c7 in lock_init_generic (lock=0x3fc90444 <s_context+12>, mutex_type=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/locks.c:73
#12 0x403768f3 in lock_acquire_generic (lock=0x3fc90444 <s_context+12>, delay=4294967295, mutex_type=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/locks.c:127
#13 0x40376a38 in _lock_acquire_recursive (lock=0x3fc90444 <s_context+12>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/locks.c:167
#14 0x42004726 in uart_write (fd=0, data=0x3fc958f8, size=9) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/vfs/vfs_uart.c:200
#15 0x42003548 in console_write (fd=<optimized out>, data=0x3fc958f8, size=9) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/vfs/vfs_console.c:73
#16 0x42003100 in esp_vfs_write (r=<optimized out>, fd=<optimized out>, data=0x3fc958f8, size=9) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/vfs/vfs.c:431
#17 0x4206b5e4 in __swrite (ptr=0x3fceb0f4, cookie=0x3fc9569c, buf=0x3fc958f8 <error: Cannot access memory at address 0x3fc958f8>, n=9) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/stdio.c:94
#18 0x420700d9 in __sflush_r (ptr=0x3fceb0f4, fp=0x3fc9569c) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/fflush.c:224
#19 0x42070161 in _fflush_r (ptr=0x3fceb0f4, fp=0x3fc9569c) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/fflush.c:278
#20 0x4206b0f0 in __sfvwrite_r (ptr=0x3fceb0f4, fp=0x3fc9569c, uio=0x3fce98b0) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/fvwrite.c:251
#21 0x4206b311 in _puts_r (ptr=0x3fceb0f4, s=<optimized out>) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/puts.c:91
#22 0x4206b346 in puts (s=0x3c083168 \"start !!\") at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/puts.c:129
#23 0x42006219 in tf_main () at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:241273

======================== THREADS INFO =========================
  Id   Target Id          Frame
* 1    process 1070510216 remove_free_block (sl=30, fl=0, block=0x3c350b9c <OCODE>, control=0x3fc90f58 <lock_init_spinlock>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:213
  2    process 1070169360 0x400559e0 in ?? ()
  3    process 1070171260 prvIdleTask (pvParameters=0x0) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/tasks.c:3928
  4    process 1070164820 0x400559e0 in ?? ()

==================== THREAD 1 (TCB: 0x3fceb088, name: 'tf_main') =====================
#0  remove_free_block (sl=30, fl=0, block=0x3c350b9c <OCODE>, control=0x3fc90f58 <lock_init_spinlock>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:213
#1  block_locate_free (size=92, control=0x3fc90f58 <lock_init_spinlock>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:448
#2  tlsf_malloc (tlsf=0x3fc90f58 <lock_init_spinlock>, size=<optimized out>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_tlsf.c:779
#3  0x4037df1f in multi_heap_malloc_impl (heap=0x3fce9710, size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/multi_heap.c:197
#4  0x40375a05 in heap_caps_malloc_base (size=92, caps=6144) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:154
#5  0x40375a3d in heap_caps_malloc_base (caps=6144, size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:177
#6  heap_caps_malloc (size=92, caps=6144) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:174
#7  0x40375a6c in heap_caps_malloc_default (size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/heap/heap_caps.c:199
#8  0x4037e378 in malloc (size=92) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/heap.c:24
#9  0x403797ef in xQueueGenericCreate (uxQueueLength=1, uxItemSize=0, ucQueueType=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/queue.c:447
#10 0x40379ad7 in xQueueCreateMutex (ucQueueType=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/queue.c:564
#11 0x403768c7 in lock_init_generic (lock=0x3fc90444 <s_context+12>, mutex_type=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/locks.c:73
#12 0x403768f3 in lock_acquire_generic (lock=0x3fc90444 <s_context+12>, delay=4294967295, mutex_type=4 '\\004') at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/locks.c:127
#13 0x40376a38 in _lock_acquire_recursive (lock=0x3fc90444 <s_context+12>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/newlib/locks.c:167
#14 0x42004726 in uart_write (fd=0, data=0x3fc958f8, size=9) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/vfs/vfs_uart.c:200
#15 0x42003548 in console_write (fd=<optimized out>, data=0x3fc958f8, size=9) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/vfs/vfs_console.c:73
#16 0x42003100 in esp_vfs_write (r=<optimized out>, fd=<optimized out>, data=0x3fc958f8, size=9) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/vfs/vfs.c:431
#17 0x4206b5e4 in __swrite (ptr=0x3fceb0f4, cookie=0x3fc9569c, buf=0x3fc958f8 <error: Cannot access memory at address 0x3fc958f8>, n=9) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/stdio.c:94
#18 0x420700d9 in __sflush_r (ptr=0x3fceb0f4, fp=0x3fc9569c) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/fflush.c:224
#19 0x42070161 in _fflush_r (ptr=0x3fceb0f4, fp=0x3fc9569c) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/fflush.c:278
#20 0x4206b0f0 in __sfvwrite_r (ptr=0x3fceb0f4, fp=0x3fc9569c, uio=0x3fce98b0) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/fvwrite.c:251
#21 0x4206b311 in _puts_r (ptr=0x3fceb0f4, s=<optimized out>) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/puts.c:91
#22 0x4206b346 in puts (s=0x3c083168 \"start !!\") at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s3-elf/src/newlib/newlib/libc/stdio/puts.c:129
#23 0x42006219 in tf_main () at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:241273
Exception in thread Thread-1:
Traceback (most recent call last):
  File "threading.py", line 932, in _bootstrap_inner
  File "threading.py", line 870, in run
  File "subprocess.py", line 1366, in _readerthread
OSError: [Errno 22] Invalid argument
WARNING: Attempt to terminate the GDB process failed, because it is already terminated. Skip

==================== THREAD 2 (TCB: 0x3fc97d10, name: 'main') =====================
#0  0x400559e0 in ?? ()
#1  0x4037b92d in vPortClearInterruptMaskFromISR (prev_level=<optimized out>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/include/freertos/portmacro.h:571
#2  vPortExitCritical (mux=0x3fc90f50 <xTaskQueueMutex>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/port.c:319
#3  0x4037a2da in prvAddNewTaskToReadyList (pxNewTCB=0x3fceb088, xCoreID=2147483647, pxTaskCode=0x42006210 <tf_main()>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/tasks.c:1312
#4  0x4037a3e6 in xTaskCreatePinnedToCore (pvTaskCode=0x42006210 <tf_main()>, pcName=0x3c083270 \"tf_main\", usStackDepth=4096, pvParameters=0x0, uxPriority=8, pvCreatedTask=0x0, xCoreID=xCoreID@entry=2147483647) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/tasks.c:900
#5  0x42006321 in xTaskCreate (pxCreatedTask=0x0, uxPriority=8, pvParameters=0x0, usStackDepth=4096, pcName=0x3c083270 \"tf_main\", pvTaskCode=0x42006210 <tf_main()>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/include/freertos/task.h:450
#6  app_main () at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:44704
#7  0x42078675 in main_task (args=0x0) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/port_common.c:141

==================== THREAD 3 (TCB: 0x3fc9847c, name: 'IDLE') =====================
#0  prvIdleTask (pvParameters=0x0) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/tasks.c:3928
#1  0x40000000 in ?? ()

==================== THREAD 4 (TCB: 0x3fc96b54, name: 'esp_timer') =====================
#0  0x400559e0 in ?? ()
#1  0x4037b92d in vPortClearInterruptMaskFromISR (prev_level=<optimized out>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/include/freertos/portmacro.h:571
#2  vPortExitCritical (mux=0x3fc90f50 <xTaskQueueMutex>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/port.c:319
#3  0x4037b241 in ulTaskGenericNotifyTake (uxIndexToWait=<optimized out>, xClearCountOnExit=1, xTicksToWait=4294967295) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/tasks.c:5401
#4  0x420059c9 in timer_task (arg=<optimized out>) at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_timer/src/esp_timer.c:384

======================= ALL MEMORY REGIONS ========================
Name   Address   Size   Attrs
.rtc.text 0x600fe000 0x0 RW
.rtc.dummy 0x600fe000 0x0 RW
.rtc.force_fast 0x600fe000 0x0 RW
.rtc.data 0x50000000 0x10 RW A
.rtc_noinit 0x50000010 0x0 RW
.rtc.force_slow 0x50000010 0x0 RW
.iram0.vectors 0x40374000 0x403 R XA
.iram0.text 0x40374404 0xbf0b R XA
.dram0.data 0x3fc90310 0x32cc RW A
.noinit 0x3fc935dc 0x0 RW
.flash.text 0x42000020 0x787eb R XA
.flash.appdesc 0x3c080020 0x100 R  A
.flash.rodata 0x3c080120 0x2d59fc RW A
.flash.rodata_noload 0x3c355b1c 0x0 RW
.iram0.text_end 0x4038030f 0x0 RW
.iram0.bss 0x40380310 0x0 RW
.dram0.heap_start 0x3fc949c0 0x0 RW
.coredump.tasks.data 0x3fceb088 0x164 RW
.coredump.tasks.data 0x3fce9550 0x1b30 RW
.coredump.tasks.data 0x3fc97d10 0x164 RW
.coredump.tasks.data 0x3fc97a20 0x2e8 RW
.coredump.tasks.data 0x3fc9847c 0x164 RW
.coredump.tasks.data 0x3fc98270 0x204 RW
.coredump.tasks.data 0x3fc96b54 0x164 RW
.coredump.tasks.data 0x3fc968d0 0x27c RW

===================== ESP32 CORE DUMP END =====================
===============================================================
Done!
jiwenfei commented 1 year ago

@vikramdattu I made the changes you suggested, but nothing changed.

vikramdattu commented 1 year ago

Hi @jiwenfei, I converted the above model and tried to run it. I am getting similar, if not identical, issues.

There are a few issues:

  1. The lines below are illegal because they try to modify data in a const array (the model data): https://github.com/espressif/tflite-micro-esp-examples/blob/388c558a2174deb7d2c83efd244e2b4d30636edf/components/tflite-lib/tensorflow/lite/micro/kernels/gather_nd.cc#L93-L100

Please comment those out before running the code.

  2. After this the code will run, but when you invoke the model you get "type not supported" errors:
Input type UINT8 (3) not supported.

Type 'INT32' is not supported by FLOOR_DIV.
Node FLOOR_DIV (number 90) failed to invoke with status 1

The model is not compatible with tflite-micro and must be outdated!

What you can do is quantise the model yourself with the following options and try again:

import tensorflow as tf

def representative_dataset():
  # Yield 100 random samples shaped like the model input for calibration
  for _ in range(100):
    yield [tf.random.uniform(shape=[1,1,192,192], minval=.0, maxval=1.)]

# make a converter object from the saved TensorFlow model
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model('person_det_opt.pb/')
# tell converter which type of optimization techniques to use
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# to view the best option for optimization read documentation of tflite about optimization
# go to this link https://www.tensorflow.org/lite/guide/get_started#4_optimize_your_model_optional

# A representative dataset is a must when converting with `TFLITE_BUILTINS_INT8`
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.SELECT_TF_OPS]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
jiwenfei commented 1 year ago

@vikramdattu I converted the TensorFlow MoveNet model to a TFLite Micro file following your code, like this:

import tensorflow as tf

def representative_dataset():
  for _ in range(100):
    # tf.cast(tf.squeeze(tf.random_uniform((1, 1)))*2, dtype=tf.int32)
    yield [tf.cast(tf.random.uniform(shape=[1,192,192,3], minval=.0, maxval=1.)*255,dtype=tf.int32)]

# make a converter object from the saved TensorFlow model
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model('/content/4/')
# tell converter which type of optimization techniques to use
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# to view the best option for optimization read documentation of tflite about optimization
# go to this link https://www.tensorflow.org/lite/guide/get_started#4_optimize_your_model_optional

# A representative dataset is a must when converting with `TFLITE_BUILTINS_INT8`
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.SELECT_TF_OPS]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
model_tflite = converter.convert()

# Save the model to disk
with open('model.tflite', "wb") as f:
  f.write(model_tflite)

But it still showed errors, as in this image.

vikramdattu commented 1 year ago

Hi @jiwenfei

https://github.com/espressif/tflite-micro-esp-examples/blob/388c558a2174deb7d2c83efd244e2b4d30636edf/components/tflite-lib/tensorflow/lite/micro/kernels/floor_div.cc#L112-L122

This particular op is not implemented by tflite-micro for other data types. Probably nobody has run into this yet.

Simply calling it with data type int32_t may work. Please try:

  switch (input1->type) {
    case kTfLiteFloat32: {
      return EvalFloorDiv<float>(context, input1, input2, output);
    }
    case kTfLiteInt32: {
      return EvalFloorDiv<int32_t>(context, input1, input2, output);
    }
    default: {
      MicroPrintf("Type '%s' is not supported by FLOOR_DIV.",
                  TfLiteTypeGetName(input1->type));
      return kTfLiteError;
    }
  }

If it works, you may want to contribute this back to the upstream repo: https://github.com/tensorflow/tflite-micro

For the stack overflow error, please increase the task stack size and check again.
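
One thing worth sanity-checking when enabling an integer path is rounding. FLOOR_DIV rounds the quotient toward negative infinity, whereas plain C/C++ integer `/` truncates toward zero, so the two differ when the operands have opposite signs. The Python sketch below only illustrates that difference; it makes no claim about how `EvalFloorDiv<T>` is implemented, and the helper names are hypothetical.

```python
def floor_div(a: int, b: int) -> int:
    """Quotient rounded toward negative infinity (FLOOR_DIV semantics)."""
    return a // b


def trunc_div(a: int, b: int) -> int:
    """Quotient rounded toward zero, as C/C++ integer `/` does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


# The two agree for same-sign operands and differ for mixed signs.
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(f"a={a:3d} b={b:3d} floor={floor_div(a, b):3d} trunc={trunc_div(a, b):3d}")
```

Comparing a few mixed-sign cases like these against the output of the device kernel is a cheap way to confirm the int32 path behaves as expected.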

jiwenfei commented 1 year ago

@vikramdattu Thanks! It now reports that the SUB op does not support INT32. I will change it the same way you described above.
image

jiwenfei commented 1 year ago

@vikramdattu It works well after I updated the SUB and FLOOR_DIV op implementations as you suggested.

vikramdattu commented 1 year ago

Hi @jiwenfei, does that mean you are getting the expected outputs as well? What FPS do you get with this solution?

jiwenfei commented 1 year ago

@vikramdattu Yes. Model init time: 105 ms; model invoke time: 3007 ms.

So I want to connect two ESP32 modules in series. The first one detects a good person image, and the second one detects the body keypoints.
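
For reference, the invoke time reported in this comment can be turned into a frame rate with simple arithmetic. This sketch counts only the invoke time and ignores capture and pre/post-processing, which is an assumption; the helper name is hypothetical.

```python
def fps_from_latency_ms(invoke_ms: float) -> float:
    """Frames per second if each frame costs `invoke_ms` milliseconds."""
    return 1000.0 / invoke_ms


invoke_ms = 3007  # model invoke time reported in this thread
fps = fps_from_latency_ms(invoke_ms)
print(f"{fps:.2f} FPS")
```

That is roughly one frame every three seconds.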

jiwenfei commented 1 year ago

@vikramdattu Can you give me some suggestions for reducing the invoke time of this MoveNet model?

vikramdattu commented 1 year ago

Hi @jiwenfei, as far as I can see, esp-nn has already added about a 10x improvement. The flag below can be removed to see the raw performance: https://github.com/espressif/tflite-micro-esp-examples/blob/388c558a2174deb7d2c83efd244e2b4d30636edf/components/tflite-lib/CMakeLists.txt#L92

A few things:

  1. I have seen that the memory and flash requirements for this model are quite large (~2 MB each). If you could modify the model and use a lighter one, it would help with both flash usage and performance. (I would recommend a model of around 1 MB or less for smooth operation. Even that should be good enough for a microcontroller.)
  2. I have not checked whether more performance could be squeezed out "for this specific model" (for example, which specific layer takes most of the time), since I believe the optimisations are already top notch and the exercise might not lead to huge improvements, if any. I might be wrong.

Also, I haven't tried to quantise the float model from here myself; I have used the quantized int8 model provided by Google (which might have been optimised a long time ago). Maybe manually quantising it now would result in a better-optimised model?

Hope this helps. Let me know if I can help you in the process.

jiwenfei commented 1 year ago

OK, I will think about it and try.

vikramdattu commented 1 year ago

@jiwenfei in addition to the above comment, can you please check:

  1. Is your module running in Octal mode, if that's available, for flash as well as RAM?
  2. Is the data cache set to 64 KB and the instruction cache set to 32 KB?

Basically, let's check whether the chip is running in full performance mode.

jiwenfei commented 1 year ago
  1. Your module is running in Octal mode if that's available? For flash as well as RAM?

NOT Available while it is Octal mode

jiwenfei commented 1 year ago

Hi @vikramdattu, I now have some new questions.

1. When the input data comes from the camera on the board, the function interpreter->Invoke() sometimes throws errors like this:

E (222810) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (222810) task_wdt:  - IDLE (CPU 0)
E (222810) task_wdt: Tasks currently running:
E (222810) task_wdt: CPU 0: tf_main
E (222810) task_wdt: CPU 1: IDLE
E (222810) task_wdt: Print CPU 0 (current core) backtrace

Backtrace: 0x42079CD2:0x3FC9C850 0x4037B051:0x3FC9C880 0x420700FB:0x3FCC99C0 0x4206E306:0x3FCC9A60 0x4205A9FE:0x3FCC9B70 0x42064746:0x3FCC9F10 0x4201537A:0x3FCC9F40 0x4200808D:0x3FCC9F60 0x42007D63:0x3FCC9FA0
0x42079cd2: task_wdt_isr at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/task_wdt.c:183 (discriminator 3)

0x4037b051: _xt_lowint1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x420700fb: .out_ch_loop at E:/esp32work/tflite-micro-esp-examples/components/esp-nn/src/convolution/esp_nn_conv_s8_mult8_1x1_esp32s3.S:178

0x4206e306: esp_nn_conv_s8_esp32s3 at E:/esp32work/tflite-micro-esp-examples/components/esp-nn/src/convolution/esp_nn_conv_esp32s3.c:423

0x4205a9fe: tflite::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:231
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:294

0x42064746: tflite::MicroGraph::InvokeSubgraph(int) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:172

0x4201537a: tflite::MicroInterpreter::Invoke() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:284

0x4200808d: loop at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main_functions.cc:151

0x42007d63: tf_main() at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:31 (discriminator 1)

E (222810) task_wdt: Print CPU 1 backtrace

Backtrace: 0x4037D2CD:0x3FC9CE60 0x4037B051:0x3FC9CE80 0x400559DD:0x3FCB1650 |<-CORRUPTED
0x4037d2cd: esp_crosscore_isr at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/crosscore_int.c:92

0x4037b051: _xt_lowint1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/xtensa_vectors.S:1111

And my whole code is

void loop() {
  // Get image from provider.
  long long start_time = esp_timer_get_time();
  camera_fb_t* fb = esp_camera_fb_get();
  for (int i = 0; i < kNumRows; i++) {
    for (int j = 0; j < kNumCols; j++) {
      int idx= (i * kNumCols + j)*3;
      uint16_t pixel = ((uint16_t *) (fb->buf))[idx];

      // for inference
      uint8_t hb = pixel & 0xFF;
      uint8_t lb = pixel >> 8;
      // uint8_t r = (lb & 0x1F) << 3;
      // uint8_t g = ((hb & 0x07) << 5) | ((lb & 0xE0) >> 3);
      // uint8_t b = (hb & 0xF8);

      input->data.int8[idx] =(int8_t) ((lb & 0x1F) << 3);
      input->data.int8[idx+1] =(int8_t)(((hb & 0x07) << 5) | ((lb & 0xE0) >> 3));
      input->data.int8[idx+2] =(int8_t)(hb & 0xF8);
    }
 }
  esp_camera_fb_return(fb);
  // printf("img Loop unfold time is %lld\n",esp_timer_get_time() - start_time);

  if (kTfLiteOk != interpreter->Invoke()) {
    MicroPrintf("Invoke failed.");
    // return ;
  }
  printf("until now,model launch time is %lld\n",esp_timer_get_time() - start_time);

  TfLiteTensor* output = interpreter->output(0); 

  float tmpbb= output->params.zero_point*output->params.scale;
  printf("delta=%4f,scale=%4f\n",tmpbb,output->params.scale);
  // // Dequantize the output from integer to floating-point
  for(uint8_t i=0;i<17*3;i=i+3){
      float x = output->data.int8[i]*output->params.scale-tmpbb;
      float y = output->data.int8[i+1]*output->params.scale-tmpbb;
      float score = output->data.int8[i+2]*output->params.scale-tmpbb;
          // void     fb_gfx_fillRect     (camera_fb_t *fb, int32_t x, int32_t y, int32_t w, int32_t h, uint32_t color);
    printf("point[%d],x=%d,y=%d,score=%f\n",i, (int32_t)(x*192), (int32_t)(y*192),score);
    fb_gfx_fillRect(fb,(int32_t)(y*192-2),(int32_t)( x*192-2),5,5,0x1f << 6);
    // fb_gfx_fillRect(fb,10,10,100,100,0x1f << 6);
  }

  xQueueSend(outHandleQueue, &fb, portMAX_DELAY);

  vTaskDelay(1); // to avoid watchdog trigger
}
vikramdattu commented 1 year ago

@jiwenfei it seems one iteration of the loop function takes longer than the configured WDT timeout, which causes the issue. Can you please increase this value to accommodate the whole run time? (idf.py menuconfig > component config > freeRTOS > Task watchdog timeout). Also, are you using the display? If not, you may want to remove the display code.
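
A side note on the `loop()` posted above: the same `idx`, already multiplied by 3, indexes both the 16-bit camera buffer and the 3-channel input tensor. If the frame holds exactly one RGB565 pixel per model pixel, the source would normally advance by one 16-bit pixel while the destination advances by three values. The sketch below shows that indexing on the host. It assumes each pixel is stored high byte first with red in the top bits; the names are hypothetical, the byte and channel order should be checked against the actual camera driver, and the int8 offset for a quantized input is left out.

```cpp
#include <cstddef>
#include <cstdint>

struct Rgb888 {
  uint8_t r, g, b;
};

// Unpacks one RGB565 pixel given as two bytes, high byte first (assumed).
Rgb888 UnpackRgb565(uint8_t hi, uint8_t lo) {
  const uint16_t v = static_cast<uint16_t>((hi << 8) | lo);
  Rgb888 out;
  out.r = static_cast<uint8_t>(((v >> 11) & 0x1F) << 3);  // 5 bits of red
  out.g = static_cast<uint8_t>(((v >> 5) & 0x3F) << 2);   // 6 bits of green
  out.b = static_cast<uint8_t>((v & 0x1F) << 3);          // 5 bits of blue
  return out;
}

// Converts a whole frame: the source advances 2 bytes per pixel,
// the destination advances 3 bytes per pixel.
void Rgb565ToRgb888(const uint8_t* src, size_t num_pixels, uint8_t* dst) {
  for (size_t p = 0; p < num_pixels; ++p) {
    const Rgb888 px = UnpackRgb565(src[2 * p], src[2 * p + 1]);
    dst[3 * p + 0] = px.r;
    dst[3 * p + 1] = px.g;
    dst[3 * p + 2] = px.b;
  }
}
```

Keeping the source and destination indices separate makes it harder to read past the end of the camera buffer.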

jiwenfei commented 1 year ago

@vikramdattu After I set a new watchdog timeout value, it reports that TensorFlow Lite's unpack op fails. The terminal message is:

Guru Meditation Error: Core  0 panic'ed (StoreProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x42057941  PS      : 0x00060e30  A0      : 0x82064749  A1      : 0x3fcc9ea0
0x42057941: tflite::ops::micro::unpack::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:70
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:92

A2      : 0xc007c007  A3      : 0x00000001  A4      : 0x3d800980  A5      : 0x00000001
A6      : 0x3d800981  A7      : 0x00000001  A8      : 0xc007c007  A9      : 0x00000001
A10     : 0x00000000  A11     : 0x00000001  A12     : 0x3d800982  A13     : 0xc007c007
A14     : 0x000000e9  A15     : 0x3d800981  SAR     : 0x00000007  EXCCAUSE: 0x0000001d
EXCVADDR: 0xc007c007  LBEG    : 0x420577c4  LEND    : 0x420577cb  LCOUNT  : 0x00000000
0x420577c4: tflite::ops::micro::unpack::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:49
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:92

0x420577cb: tflite::ops::micro::unpack::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:48
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:92

Backtrace: 0x4205793e:0x3fcc9ea0 0x42064746:0x3fcc9f00 0x4201537a:0x3fcc9f30 0x4200808d:0x3fcc9f50 0x42007d63:0x3fcc9f90
0x4205793e: tflite::ops::micro::unpack::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:70
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/unpack.cc:92

0x42064746: tflite::MicroGraph::InvokeSubgraph(int) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:172

0x4201537a: tflite::MicroInterpreter::Invoke() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:284

0x4200808d: loop at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main_functions.cc:151

0x42007d63: tf_main() at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:31 (discriminator 1)

ELF file SHA256: b7537bd942474c98
vikramdattu commented 1 year ago

Hi @jiwenfei I missed your last message!

Do you still face the issues?

By the way, the gather_nd issue is now fixed on the latest sync. So, you may want to update to it by doing git pull --rebase and continue your development.

jiwenfei commented 1 year ago

@vikramdattu No, but there is a new issue:


Core  0 register dump:
PC      : 0x4205ae0b  PS      : 0x00060e30  A0      : 0x8200b19d  A1      : 0x3fcd9770
0x4205ae0b: tflite::MicroGraph::InvokeSubgraph(int) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:178

A2      : 0x00000000  A3      : 0x00000000  A4      : 0x00000000  A5      : 0x3fcd8360
A6      : 0x00000000  A7      : 0x3fca2320  A8      : 0x8205ae05  A9      : 0x073f4bc0
A10     : 0x3dad5500  A11     : 0xe227f4f1  A12     : 0x3d8d8980  A13     : 0x00000001
A14     : 0x00000000  A15     : 0x00000000  SAR     : 0x00000003  EXCCAUSE: 0x0000001c
EXCVADDR: 0x073f4bcc  LBEG    : 0x4200e92e  LEND    : 0x4200e93a  LCOUNT  : 0x00000000
0x4200e92e: tflite::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:50
 (inlined by) transform<int const*, float*, tflite::(anonymous namespace)::copyCast(const FromT*, ToT*, int) [with FromT = int; ToT = float]::<lambda(int)> > at d:\espressif\tools\xtensa-esp32s3-elf\esp-2021r2-patch5-8.4.0\xtensa-esp32s3-elf\xtensa-esp32s3-elf\include\c++\8.4.0\bits/stl_algo.h:4304
 (inlined by) copyCast<int, float> at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:49
 (inlined by) copyToTensor<int> at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:67
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:92

0x4200e93a: tflite::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at d:\espressif\tools\xtensa-esp32s3-elf\esp-2021r2-patch5-8.4.0\xtensa-esp32s3-elf\xtensa-esp32s3-elf\include\c++\8.4.0\bits/stl_algo.h:4303
 (inlined by) copyCast<int, float> at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:49
 (inlined by) copyToTensor<int> at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:67
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/cast.cc:92

Backtrace: 0x4205ae08:0x3fcd9770 0x4200b19a:0x3fcd97a0 0x420078dd:0x3fcd97c0 0x420075b3:0x3fcd9800
0x4205ae08: tflite::MicroGraph::InvokeSubgraph(int) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:178

0x4200b19a: tflite::MicroInterpreter::Invoke() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:284

0x420078dd: loop at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main_functions.cc:151

0x420075b3: tf_main() at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:31 (discriminator 1)


It happened after the model had been invoked 3 times.
vikramdattu commented 1 year ago

Hi @jiwenfei, I have made changes in the following patch. Please check whether you have done something similar: add_unsupported_ops.patch

The code was running fine for multiple iterations. I am using IDF branch release/v4.4. Also, if the issue persists, would you try the model on tflite-micro - x86?

The crash above seems to be something you should raise on the tflite-micro site directly, as it looks to be coming from common code. Would you please raise the issue there? That may yield a quick fix.

jiwenfei commented 1 year ago

@vikramdattu The last problem was caused by the drawing call, i.e. fb_gfx_fillRect(camera_fb_t *fb, int32_t x, int32_t y, int32_t w, int32_t h, uint32_t color);

vikramdattu commented 1 year ago

@jiwenfei that could mean that the camera_fb frame is smaller than the width and height provided to the function, so it is trying to access memory outside the buffer boundary. Have you fixed this now? Please check.
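
One way to guard against this kind of overrun is to clip each rectangle to the frame before drawing. The sketch below is a host-side illustration with hypothetical names (`Rect`, `ClipToFrame`); it assumes the drawing routine does not bounds-check on its own, which the thread does not confirm.

```cpp
#include <algorithm>
#include <cstdint>

struct Rect {
  int32_t x, y, w, h;
};

// Clips a rectangle to a frame of frame_w x frame_h pixels.
// The result has w == 0 or h == 0 when nothing is visible.
Rect ClipToFrame(Rect r, int32_t frame_w, int32_t frame_h) {
  const int32_t x0 = std::max<int32_t>(r.x, 0);
  const int32_t y0 = std::max<int32_t>(r.y, 0);
  const int32_t x1 = std::min<int32_t>(r.x + r.w, frame_w);
  const int32_t y1 = std::min<int32_t>(r.y + r.h, frame_h);
  return Rect{x0, y0, std::max<int32_t>(x1 - x0, 0),
              std::max<int32_t>(y1 - y0, 0)};
}
```

A keypoint near the top-left corner, such as a 5x5 marker at (-2, -2), is reduced to a 3x3 rectangle at (0, 0), and a marker entirely outside the frame can be skipped when the clipped width or height is zero.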

jiwenfei commented 1 year ago

The result from the Python code running on Colab is different from the result on the ESP32-S3. I ran the code on the ESP32-S3 as follows:

  1. Convert an image to an .hpp file:

import cv2
import numpy
import tensorflow as tf
image_path='/content/data/tang/1678084611419.jpg'
image = tf.io.read_file(image_path)
image = tf.compat.v1.image.decode_jpeg(image )
image=image.numpy()
h,w,c=image.shape
with open('images.hpp', 'w') as file:
        file.write('#pragma once\n'
                   '#include <stdint.h>\n\n'
                   f'#define IMAGE_HEIGHT {h}\n'
                   f'#define IMAGE_WIDTH {w}\n'
                   f'#define IMAGE_CHANNEL {c}\n\n'
                   'const static uint8_t IMAGE_ELEMENT[] = {\n')

        image = numpy.reshape(image, (-1,))
        for i, element in enumerate(image[:-1], 1):
            if i == 1:
                file.write('    ')

            file.write(f'{element}, ')

            if i % 32 == 0:
                file.write('\n    ')
        file.write(f'{image[-1]}')
        file.write('};\n')
  2. Run TF Lite inference on the ESP32 using the image .hpp:

    input = interpreter->input(0);
    printf("input->dims->size=%d,input->dims->data={%d,%d}\n",input->dims->size,
        input->dims->data[0],input->dims->data[1]);
    printf("check success!\n");
    
    for (int i = 0; i < 192 * 192*3; i++) {
    input->data.i32[i] = IMAGE_ELEMENT[i];
    }
    printf("input data init success\n");
    start_time = esp_timer_get_time();
    if (kTfLiteOk != interpreter->Invoke()) {
    printf("Invoke failed.\n");
    
    }
    total_time = (esp_timer_get_time() - start_time);
    printf("Invoke success.time  %lld\n",total_time/1000);
    TfLiteTensor* output = interpreter->output(0);
    printf("output->dims->size=%d,output->dims->data={%d,%d}\n",input->dims->size,
        output->dims->data[0],output->dims->data[1]);
    
    for(uint8_t i=0;i<17*3;i+=3){
    int x = output->data.int8[i];
    int y = output->data.int8[i+1];
    printf("point[%d],x=%d,y=%d,score=%d\n",i/3,(x),(y),( output->data.int8[i+2]));
    
    }

The Python code is:

    import tensorflow as tf
    import cv2
    import matplotlib.pyplot as plt
    interpreter = tf.lite.Interpreter(model_path="/content/model.tflite")
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    image_path='/content/data/tang/1678084611419.jpg'
    image = tf.io.read_file(image_path)
    image = tf.compat.v1.image.decode_jpeg(image)

    image = tf.expand_dims(image, axis=0)

    # Resize and pad the image to keep the aspect ratio and fit the expected size.
    image = tf.image.resize_with_pad(image, 192, 192)

    # TF Lite format expects tensor type of float32.
    input_image = tf.cast(image, dtype=tf.int32)
    abcT=input_image.numpy()

    interpreter.set_tensor(input_details[0]['index'],abcT)
    interpreter.invoke()
    keypoints_with_scores = interpreter.get_tensor(output_details[0]['index'])
    print(keypoints_with_scores)

  3. The result on the ESP32 is:

input data init success
Invoke success.time 2866
output->dims->size=4,output->dims->data={1,1}
point[0],x=-26,y=-50,score=-63
point[1],x=-33,y=-51,score=-54
point[2],x=-31,y=-50,score=-45
point[3],x=-34,y=-48,score=-54
point[4],x=-35,y=-47,score=-45
point[5],x=-27,y=-39,score=-45
point[6],x=-32,y=-29,score=-96
point[7],x=12,y=-37,score=-45
point[8],x=6,y=-34,score=-54
point[9],x=19,y=-52,score=-96
point[10],x=22,y=-29,score=-96
point[11],x=3,y=21,score=-23
point[12],x=-2,y=31,score=-34
point[13],x=12,y=75,score=-1
point[14],x=32,y=43,score=-85
point[15],x=57,y=64,score=-72
point[16],x=53,y=59,score=-54

and the result on Colab is:

[[[[-32 -49 -54]
   [-33 -51 -34]
   [-36 -51 -45]
   [-36 -49 -63]
   [-36 -46 -45]
   [-24 -37 -54]
   [-33 -27 -45]
   [ 12 -44 -45]
   [  5 -29 -54]
   [ 35 -26 -23]
   [ 28 -30 -91]
   [  8  13 -45]
   [ -4  28 -23]
   [ 42  49 -54]
   [  9  62 -23]
   [ 57  63 -79]
   [ 52  61 -45]]]]
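
To put a number on the mismatch, the two outputs can be compared element by element. The sketch below assumes both listings are raw quantized int8 values (the thread does not say so explicitly for the Colab output) and uses hypothetical helper names. `Dequantize` applies the usual `(q - zero_point) * scale` mapping, which is algebraically the same as the `q * scale - zero_point * scale` form used in the earlier `loop()` code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Converts a quantized int8 value back to a real value.
float Dequantize(int8_t q, float scale, int32_t zero_point) {
  return static_cast<float>(static_cast<int32_t>(q) - zero_point) * scale;
}

// Largest absolute element-wise difference between two int8 buffers.
int MaxAbsDiff(const int8_t* a, const int8_t* b, size_t n) {
  int worst = 0;
  for (size_t i = 0; i < n; ++i) {
    const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
    if (d > worst) {
      worst = d;
    }
  }
  return worst;
}
```

For the first keypoint above, (-26, -50, -63) on the ESP32 against (-32, -49, -54) on Colab, the largest difference is 9 quantized steps, in the score channel.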

vikramdattu commented 1 year ago

Hi @jiwenfei

  1. Does disabling esp_nn give the same results? (You can do this via menuconfig > ESP_NN.)
  2. Can you confirm the model input is the same on both Colab and the ESP32? The step below looks wrong to me:
    for (int i = 0; i < 192 * 192*3; i++) {
    input->data.i32[i] = IMAGE_ELEMENT[i];
    }

    I think the input is expected to be fed as int8? Also, please check whether you need quantisation on the input. (If yes, something like input->data.int8[i] = IMAGE_ELEMENT[i] - 128; will fix it.)
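
The `- 128` suggestion is a special case of the general affine quantization rule `q = round(x / scale) + zero_point`. The sketch below is a host-side illustration with a hypothetical helper name; on the device, the scale and zero point would be read from the input tensor's `params` after checking the tensor's `type`, and whether this model needs int8 input at all is an open question in the thread.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Quantizes a real value to int8 using q = round(x / scale) + zero_point,
// saturating to the int8 range.
int8_t QuantizeToInt8(float x, float scale, int32_t zero_point) {
  const int32_t q = static_cast<int32_t>(std::lround(x / scale)) + zero_point;
  const int32_t clamped = std::min<int32_t>(std::max<int32_t>(q, -128), 127);
  return static_cast<int8_t>(clamped);
}
```

With `scale = 1.0f` and `zero_point = -128`, this reduces to `pixel - 128`: a pixel value of 0 maps to -128 and 255 maps to 127.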

jiwenfei commented 1 year ago

@jiwenfei that could mean that the camera_fb frame is smaller than the width and height provided to the function, so it is trying to access memory outside the buffer boundary. Have you fixed this now? Please check.

Yes, I have fixed that code. It was caused by an out-of-bounds access.

jiwenfei commented 1 year ago

When I disable esp-nn, some errors occur:

 (1441) camera_httpd: Starting stream server on port: '81'
model load success !!
array init success !!
interpreter init success !!
tensor init success, time is 101

input->dims->size=4,input->dims->data={1,192}
check success!
2
input data init success
E (6271) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (6271) task_wdt:  - IDLE (CPU 0)
E (6271) task_wdt: Tasks currently running:
E (6271) task_wdt: CPU 0: tf_main
E (6271) task_wdt: CPU 1: IDLE
E (6271) task_wdt: Print CPU 0 (current core) backtrace

Backtrace: 0x4206C3E2:0x3FC9ADC0 0x4037AE15:0x3FC9ADF0 0x42063216:0x3FCDB980 0x420506FE:0x3FCDBA30 0x4205A41E:0x3FCDBDD0 0x4200AD36:0x3FCDBE00 0x42007613:0x3FCDBE20 0x4200741C:0x3FCDD460
0x4206c3e2: task_wdt_isr at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/task_wdt.c:183 (discriminator 3)

0x4037ae15: _xt_lowint1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x42063216: esp_nn_conv_s8_ansi at E:/esp32work/tflite-micro-esp-examples/components/esp-nn/src/convolution/esp_nn_conv_ansi.c:163 (discriminator 3)

0x420506fe: tflite::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:231
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:294

0x4205a41e: tflite::MicroGraph::InvokeSubgraph(int) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:172

0x4200ad36: tflite::MicroInterpreter::Invoke() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:284

0x42007613: setup at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main_functions.cc:101

0x4200741c: tf_main() at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:29

E (6271) task_wdt: Print CPU 1 backtrace

Backtrace: 0x4037CA75:0x3FC9B3D0 0x4037AE15:0x3FC9B3F0 0x400559DD:0x3FCBCB80 |<-CORRUPTED
0x4037ca75: esp_crosscore_isr at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/crosscore_int.c:92

0x4037ae15: _xt_lowint1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/xtensa_vectors.S:1111

E (11271) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (11271) task_wdt:  - IDLE (CPU 0)
E (11271) task_wdt: Tasks currently running:
E (11271) task_wdt: CPU 0: tf_main
E (11271) task_wdt: CPU 1: IDLE
E (11271) task_wdt: Print CPU 0 (current core) backtrace

Backtrace: 0x4206C3E2:0x3FC9ADC0 0x4037AE15:0x3FC9ADF0 0x4206321B:0x3FCDB980 0x420506FE:0x3FCDBA30 0x4205A41E:0x3FCDBDD0 0x4200AD36:0x3FCDBE00 0x42007613:0x3FCDBE20 0x4200741C:0x3FCDD460
0x4206c3e2: task_wdt_isr at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/task_wdt.c:183 (discriminator 3)

0x4037ae15: _xt_lowint1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x4206321b: esp_nn_conv_s8_ansi at E:/esp32work/tflite-micro-esp-examples/components/esp-nn/src/convolution/esp_nn_conv_ansi.c:164 (discriminator 3)

0x420506fe: tflite::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:231
 (inlined by) Eval at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:294

0x4205a41e: tflite::MicroGraph::InvokeSubgraph(int) at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:172

0x4200ad36: tflite::MicroInterpreter::Invoke() at E:/esp32work/tflite-micro-esp-examples/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:284

0x42007613: setup at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main_functions.cc:101

0x4200741c: tf_main() at E:/esp32work/tflite-micro-esp-examples/examples/person_detection/main/main.cc:29

E (11271) task_wdt: Print CPU 1 backtrace

Backtrace: 0x4037CA75:0x3FC9B3D0 0x4037AE15:0x3FC9B3F0 0x400559DD:0x3FCBCB80 |<-CORRUPTED
0x4037ca75: esp_crosscore_isr at D:/Espressif/frameworks/esp-idf-v4.4.3/components/esp_system/crosscore_int.c:92

0x4037ae15: _xt_lowint1 at D:/Espressif/frameworks/esp-idf-v4.4.3/components/freertos/port/xtensa/xtensa_vectors.S:1111
vikramdattu commented 1 year ago

Hey @jiwenfei, this is because the unoptimised kernels take a long time to run, so the task watchdog timer, which is set to 5 seconds by default, gets triggered. You can either ignore that or increase the FreeRTOS task watchdog timeout to a higher number, say 15 seconds.

vikramdattu commented 1 year ago

Hello @jiwenfei, are you still looking into the issue, or are you willing to test it? Please apply the patch from the following comment on ESP-NN and check whether the mismatch issue goes away: https://github.com/espressif/esp-nn/issues/5#issuecomment-1589070024

jiwenfei commented 1 year ago

Because of the low image quality, and the amount of work needed to correct it, we switched to a new solution using an RV1126 board.

vikramdattu commented 1 year ago

@jiwenfei thanks for the heads up. For my reference, can you please share which camera you used with the ESP32-S3 and in which mode? (Resolution and the JPEG quality used, if images were captured as JPEG.)

jiwenfei commented 1 year ago

OV2640, 192x192, RGB565