espressif / esp-adf

Espressif Audio Development Framework

ESP32S3 does not work for TTS data file size more than 3MB (AUD-4957) #1094

Open mike-2020 opened 9 months ago

mike-2020 commented 9 months ago

Using the pipeline_tts_stream example, I tried all four versions of the TTS voice data files. Only the files smaller than 3 MB work. With the files larger than 3 MB, the app keeps rebooting and reports a panic without a detailed error message.

I am using an ESP32-S3 with 8 MB PSRAM and 16 MB flash, built against the master branches of esp-idf and esp-adf from the last few days.

The TTS code runs in a thread with a 6 KB stack.

How can I investigate this issue?

mike-2020 commented 8 months ago

Here is the boot log when using esp_tts_voice_data_xiaoxin.dat. The system seems to be reset by the WDT after the first four words are spoken.

I (50) boot: ESP-IDF v5.1.1-dirty 2nd stage bootloader
I (50) boot: compile time Oct 28 2023 17:00:43
I (50) boot: Multicore bootloader
I (53) boot: chip revision: v0.2
I (57) qio_mode: Enabling QIO for flash chip WinBond
I (63) boot.esp32s3: Boot SPI Speed : 80MHz
I (68) boot.esp32s3: SPI Mode : QIO
I (72) boot.esp32s3: SPI Flash Size : 16MB
W (77) boot.esp32s3: PRO CPU has been reset by WDT.
W (83) boot.esp32s3: APP CPU has been reset by WDT.
I (88) boot: Enabling RNG early entropy source...
I (94) boot: Partition Table:
I (97) boot: ## Label Usage Type ST Offset Length
I (105) boot: 0 nvs WiFi data 01 02 00009000 00004000
I (112) boot: 1 otadata OTA data 01 00 0000d000 00002000
I (120) boot: 2 phy_init RF data 01 01 0000f000 00001000
I (127) boot: 3 factory factory app 00 00 00010000 00400000
I (135) boot: 4 voice_data Unknown data 01 81 00410000 00400000
I (142) boot: 5 model Unknown data 01 82 00810000 00500000
I (150) boot: 6 disk Unknown data 01 81 00d10000 00200000
I (158) boot: End of partition table
I (162) boot: Defaulting to factory image
I (166) esp_image: segment 0: paddr=00010020 vaddr=3c040020 size=1acf8h (109816) map
I (186) esp_image: segment 1: paddr=0002ad20 vaddr=3fc99c00 size=049b0h ( 18864) load
I (189) esp_image: segment 2: paddr=0002f6d8 vaddr=40378000 size=00940h ( 2368) load
I (192) esp_image: segment 3: paddr=00030020 vaddr=42000020 size=3aaa8h (240296) map
I (225) esp_image: segment 4: paddr=0006aad0 vaddr=40378940 size=1127ch ( 70268) load
I (243) boot: Loaded app from partition at offset 0x10000
I (243) boot: Disabling RNG early entropy source...
I (255) cpu_start: Multicore app
I (255) octal_psram: vendor id : 0x0d (AP)
I (255) octal_psram: dev id : 0x02 (generation 3)
I (258) octal_psram: density : 0x03 (64 Mbit)
I (264) octal_psram: good-die : 0x01 (Pass)
I (269) octal_psram: Latency : 0x01 (Fixed)
I (274) octal_psram: VCC : 0x01 (3V)
I (279) octal_psram: SRF : 0x01 (Fast Refresh)
I (285) octal_psram: BurstType : 0x01 (Hybrid Wrap)
I (291) octal_psram: BurstLen : 0x01 (32 Byte)
I (297) octal_psram: Readlatency : 0x02 (10 cycles@Fixed)
I (303) octal_psram: DriveStrength: 0x00 (1/1)
I (308) MSPI Timing: PSRAM timing tuning index: 5
I (313) esp_psram: Found 8MB PSRAM device
I (318) esp_psram: Speed: 80MHz
I (339) mmu_psram: Instructions copied and mapped to SPIRAM
I (347) mmu_psram: Read only data copied and mapped to SPIRAM
I (347) cpu_start: Pro cpu up.
I (347) cpu_start: Starting app cpu, entry point is 0x40379744
0x40379744: call_start_cpu1 at E:/SDK_Lib/esp-idf/components/esp_system/port/cpu_start.c:154

I (0) cpu_start: App cpu up.
I (658) esp_psram: SPI SRAM memory test OK
I (668) cpu_start: Pro cpu start user code
I (668) cpu_start: cpu freq: 240000000 Hz
I (668) cpu_start: Application information:
I (671) cpu_start: Project name: SmartVehicle-ESP32S3
I (677) cpu_start: App version: 1.0.O
I (682) cpu_start: Compile time: Oct 28 2023 20:09:12
I (688) cpu_start: ELF file SHA256: 9c748e4f7a3b0457...
I (694) cpu_start: ESP-IDF: v5.1.1-dirty
I (699) cpu_start: Min chip rev: v0.0
I (704) cpu_start: Max chip rev: v0.99
I (709) cpu_start: Chip rev: v0.2
I (713) heap_init: Initializing. RAM available for dynamic allocation:
I (721) heap_init: At 3FC9F890 len 00049E80 (295 KiB): DRAM
I (727) heap_init: At 3FCE9710 len 00005724 (21 KiB): STACK/DRAM
I (734) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (740) esp_psram: Adding pool of 7808K of PSRAM memory to heap allocator
I (748) spi_flash: detected chip: winbond
I (752) spi_flash: flash io: qio
W (756) i2s(legacy): legacy i2s driver is deprecated, please migrate to use driver/i2s_std.h, driver/i2s_pdm.h or driver/i2s_tdm.h
I (768) sleep: Configure to isolate all GPIO pins in sleep state
I (775) sleep: Enable automatic switching of GPIO sleep configuration
I (782) app_start: Starting scheduler on CPU0
I (787) app_start: Starting scheduler on CPU1
I (787) main_task: Started on CPU0
I (797) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (807) main_task: Calling app_main()
I (817) AUDIO_MGR: [1.0] Init Peripheral Set
I (817) AUDIO_MGR: [2.0] Start codec chip
I (827) ES8388_DRIVER: ----- ES8388 settings -----
I (827) gpio: GPIO[17]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (837) ES8388_DRIVER: init,out:00, in:03
I (847) AUDIO_HAL: Codec mode is 3, Ctrl:1
W (847) ES8388_DRIVER: es8388_start default is mode:3
I (847) TTS_MAIN: [3.0] Create audio pipeline for playback
I (857) TTS_MAIN: [3.1] Create tts stream to read data from chinese strings
init voice set:template
ESP Chinese TTS v1.7 (Sep 22 2022 14:35:13, 1)
I (867) TTS_MAIN: [3.2] Create i2s stream to write data to codec chip
I (877) TTS_MAIN: [3.4] Register all elements to audio pipeline
I (887) TTS_MAIN: [3.5] Link it together [strings]-->tts_stream-->filter-->i2s_stream-->[codec_chip]
I (897) AUDIO_PIPELINE: link el->rb, el:0x3c062778, tag:tts, rb:0x3c0631dc
I (897) AUDIO_PIPELINE: link el->rb, el:0x3c062bf0, tag:filter, rb:0x3c065240
I (907) TTS_MAIN: [3.6] Set up uri (tts as tts_stream, and directly output is i2s)
I (917) TTS_MAIN: [4.0] Set up event listener
I (917) TTS_MAIN: [4.1] Listening event from all elements of pipeline
I (927) AUDIO_THREAD: The tts task allocate stack on internal memory
I (937) AUDIO_ELEMENT: [tts-0x3c062778] Element task created
I (937) AUDIO_THREAD: The filter task allocate stack on external memory
I (947) AUDIO_ELEMENT: [filter-0x3c062bf0] Element task created
I (957) AUDIO_THREAD: The i2s task allocate stack on internal memory
I (967) AUDIO_ELEMENT: [i2s-0x3c062a48] Element task created
I (967) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8213036 Bytes, Inter:277271 Bytes, Dram:277271 Bytes

I (977) AUDIO_ELEMENT: [tts] AEL_MSG_CMD_RESUME,state:1
I (987) tts_parser: unicode:0x6b22 -> huan1
I (987) tts_parser: unicode:0x8fce -> ying2
I (997) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1
I (1007) RSP_FILTER: sample rate of source data : 16000, channel of source data : 1, sample rate of destination data : 16000, channel of destination data : 2
I (1017) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1
I (1027) I2S_STREAM: AUDIO_STREAM_WRITER
I (1027) AUDIO_PIPELINE: Pipeline started
I (1037) TTS_MAIN: [6.0] Listen for all pipeline events
I (1037) tts_parser: unicode:0x4f7f -> shi3
I (1047) tts_parser: unicode:0x7528 -> yong4
I (1047) tts_parser: unicode:0x4e50 -> le4
I (1057) tts_parser: unicode:0x946b -> xin1
I (1057) tts_parser: unicode:0x8bed -> yu3
I (1067) tts_parser: unicode:0x97f3 -> yin1
I (1067) tts_parser: unicode:0x5f00 -> kai1
I (1077) tts_parser: unicode:0x6e90 -> yuan2
I (1077) tts_parser: unicode:0x6846 -> kuang4
I (1087) tts_parser: unicode:0x67b6 -> jia4
W (1087) TTS_STREAM: 欢迎使用乐鑫语音开源框架
I (1207) APP_MAIN: ESP_WIFI_MODE_STA
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x8 (TG1WDT_SYS_RST),boot:0x28 (SPI_FAST_FLASH_BOOT)
Saved PC:0x4200377b
0x4200377b: panic_handler at E:/SDK_Lib/esp-idf/components/esp_system/port/panic_handler.c:145 (discriminator 3)
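
One thing the log allows checking by hand is whether the partition table itself is consistent with the 16 MB flash. The host-side C sketch below copies the offsets and lengths from the boot log and tests that the entries do not overlap and end within the flash. The names `part_row_t`, `boot_log_table` and `table_fits` are hypothetical and not part of ESP-IDF or ESP-ADF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of one row of the partition table in the log. */
typedef struct {
    const char *label;
    uint32_t offset;
    uint32_t length;
} part_row_t;

/* Offsets and lengths copied from the boot log above. */
static const part_row_t boot_log_table[] = {
    {"nvs",        0x00009000u, 0x00004000u},
    {"otadata",    0x0000d000u, 0x00002000u},
    {"phy_init",   0x0000f000u, 0x00001000u},
    {"factory",    0x00010000u, 0x00400000u},
    {"voice_data", 0x00410000u, 0x00400000u},
    {"model",      0x00810000u, 0x00500000u},
    {"disk",       0x00d10000u, 0x00200000u},
};

/* Returns 1 if the rows are in ascending order, do not overlap,
 * and every row ends at or before flash_size; returns 0 otherwise. */
static int table_fits(const part_row_t *rows, size_t n, uint32_t flash_size)
{
    assert(rows != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t end = (uint64_t)rows[i].offset + rows[i].length;
        if (end > flash_size) {
            return 0;  /* row runs past the end of flash */
        }
        if (i + 1 < n && end > rows[i + 1].offset) {
            return 0;  /* row overlaps the next one */
        }
    }
    return 1;
}
```

With these values, `table_fits(boot_log_table, 7, 16u * 1024u * 1024u)` returns 1, and the `voice_data` partition is 0x400000 bytes (4 MiB). The layout alone therefore does not seem to explain why files above 3 MB fail.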

yyjdelete commented 5 months ago
  1. In menuconfig, change the flash size to 16 MB.
  2. In the partition table (partitions.csv), increase the size of the voice_data partition to 3890K (or to the actual size of the file?).
  3. In tts_stream.c, replace the hard-coded 3 * 1024 * 1024 with part->size: https://github.com/espressif/esp-adf/blob/7a7c66e63dade6fa7003b9d0d77b64c17bfeb064/components/audio_stream/tts_stream.c#L182
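
The arithmetic behind step 3 can be sketched on the host. A fixed 3 * 1024 * 1024 mapping covers 3,145,728 bytes, while a 3890K file occupies 3,983,360 bytes, so the tail of the file would lie outside the mapped range. The helper names below are hypothetical and only model the sizes. They do not call any ESP-IDF API, and whether the unmapped tail is what triggers the watchdog reset is an assumption, not something the log confirms.

```c
#include <assert.h>
#include <stdint.h>

/* The constant named in step 3 (3 MiB). */
#define HARDCODED_MAP_LEN (3u * 1024u * 1024u)

/* Hypothetical helper: does a mapping of map_len bytes cover the whole file? */
static int mapping_covers_file(uint32_t map_len, uint32_t file_size)
{
    return file_size <= map_len;
}

/* Hypothetical helper: number of file bytes that fall outside the mapping. */
static uint32_t unmapped_bytes(uint32_t map_len, uint32_t file_size)
{
    return file_size > map_len ? file_size - map_len : 0u;
}

/* Hypothetical helper: mapping the whole partition (the part->size idea)
 * covers any file that fits in that partition. */
static uint32_t map_len_for_partition(uint32_t part_size, uint32_t file_size)
{
    assert(file_size <= part_size);  /* the file must fit in the partition */
    return part_size;
}
```

For a 3890K file, `unmapped_bytes(HARDCODED_MAP_LEN, 3890u * 1024u)` is 837,632 bytes, whereas a mapping sized to a 4 MiB partition covers the file completely.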

References:
https://github.com/espressif/esp-sr/blob/cc806480c093f1bf866068a70cd6255da62e8275/test_apps/esp-tts/partitions.csv
https://github.com/espressif/esp-sr/blob/cc806480c093f1bf866068a70cd6255da62e8275/test_apps/esp-tts/main/test_chinese_tts.c#L72-L76