espressif / esp-adf

Espressif Audio Development Framework

Problem recording after speech command recognition (AUD-4680) #1023

Open troedssl opened 1 year ago

troedssl commented 1 year ago

Environment

Problem Description

Hey, I want to combine the WWE speech recognition example with the recording_to_sdcard example, and I ran into a problem with the recording part. My idea is that once the wake word and a speech command (command 32) are recognized, the recording example should start and save a 10-second recording to the SD card. Everything runs and the recording is saved to the SD card, but when I play the audio back on my laptop it is distorted.
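
The intended trigger logic can be sketched roughly as below. This is only an illustration in plain C: the event enum, the state struct and `sketch_on_event` are hypothetical names, not ESP-ADF types, and the actual code is in the gist linked further down. The sketch only decides when a recording should start; starting the pipeline itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes for illustration only. The real recorder
 * events seen in the log (e.g. REC_EVENT_WAKEUP_START) have their own types. */
typedef enum {
    EV_WAKEUP_START,
    EV_WAKEUP_END,
    EV_COMMAND_DETECTED
} sketch_event_t;

#define RECORD_COMMAND_ID 32 /* "command 32" from the serial log */

typedef struct {
    bool awake;            /* wake word heard, session still open */
    bool record_requested; /* set once a recording has been requested */
} sketch_state_t;

/* Feed one event; returns true exactly when a recording should be started. */
static bool sketch_on_event(sketch_state_t *s, sketch_event_t ev, int command_id)
{
    assert(s != NULL);
    switch (ev) {
    case EV_WAKEUP_START:
        s->awake = true;
        return false;
    case EV_WAKEUP_END:
        s->awake = false;
        return false;
    case EV_COMMAND_DETECTED:
        /* Only react to command 32, only after the wake word,
         * and only once per request. */
        if (s->awake && command_id == RECORD_COMMAND_ID && !s->record_requested) {
            s->record_requested = true;
            return true;
        }
        return false;
    }
    return false;
}
```

Returning a flag instead of recording directly inside the callback is a design choice of this sketch, not something the examples require.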

Expected Behavior

To record audio in normal quality

Actual Behavior

The recorded audio is distorted

Code to Reproduce This Issue

Source and header files here: https://gist.github.com/troedssl/6c7cb0666c518d3cc39ccedf5a58f2a1

Output in the serial monitor

I tested a few things to see whether I could find the error myself, and found that the AFE_SR: ERROR! rb_out slow!!! warning doesn't influence the recording quality.

I (25) boot: ESP-IDF v4.4.4-448-gfd5e03b221-dirty 2nd stage bootloader
I (25) boot: compile time 12:40:35
I (25) boot: chip revision: v0.1
I (29) qio_mode: Enabling default flash chip QIO
I (34) boot.esp32s3: Boot SPI Speed : 80MHz
I (39) boot.esp32s3: SPI Mode       : QIO
I (43) boot.esp32s3: SPI Flash Size : 8MB
I (48) boot: Enabling RNG early entropy source...
I (54) boot: Partition Table:
I (57) boot: ## Label            Usage          Type ST Offset   Length
I (64) boot:  0 nvs              WiFi data        01 02 00009000 00004000
I (72) boot:  1 otadata          OTA data         01 00 0000d000 00002000
I (79) boot:  2 phy_init         RF data          01 01 0000f000 00001000
I (87) boot:  3 ota_0            OTA app          00 10 00010000 00290000
I (94) boot:  4 model            Unknown data     01 82 002a0000 0040e000
I (102) boot:  5 flash_tone       Unknown data     01 27 006ae000 00032000
I (109) boot: End of partition table
I (114) boot: No factory image, trying OTA 0
I (119) esp_image: segment 0: paddr=00010020 vaddr=3c0b0020 size=40910h (264464) map
I (167) esp_image: segment 1: paddr=00050938 vaddr=3fca4520 size=04da0h ( 19872) load
I (171) esp_image: segment 2: paddr=000556e0 vaddr=40378000 size=0a938h ( 43320) load
I (181) esp_image: segment 3: paddr=00060020 vaddr=42000020 size=a4998h (674200) map
I (284) esp_image: segment 4: paddr=001049c0 vaddr=40382938 size=11be8h ( 72680) load
I (309) boot: Loaded app from partition at offset 0x10000
I (339) boot: Set actual ota_seq=1 in otadata[0]
I (339) boot: Disabling RNG early entropy source...
I (350) opi psram: vendor id : 0x0d (AP)
I (350) opi psram: dev id    : 0x02 (generation 3)
I (350) opi psram: density   : 0x03 (64 Mbit)
I (353) opi psram: good-die  : 0x01 (Pass)
I (358) opi psram: Latency   : 0x01 (Fixed)
I (363) opi psram: VCC       : 0x01 (3V)
I (368) opi psram: SRF       : 0x01 (Fast Refresh)
I (373) opi psram: BurstType : 0x01 (Hybrid Wrap)
I (378) opi psram: BurstLen  : 0x01 (32 Byte)
I (383) opi psram: Readlatency  : 0x02 (10 cycles@Fixed)
I (389) opi psram: DriveStrength: 0x00 (1/1)
I (395) spiram: Found 64MBit SPI RAM device
I (399) spiram: SPI RAM mode: sram 80m
I (404) spiram: PSRAM initialized, cache is in normal (1-core) mode.
I (411) cpu_start: Pro cpu up.
I (414) cpu_start: Starting app cpu, entry point is 0x403796e8
0x403796e8: call_start_cpu1 at C:/Users/troed/esp/esp-idf-new/esp-idf/components/esp_system/port/cpu_start.c:147

I (0) cpu_start: App cpu up.
I (706) spiram: SPI SRAM memory test OK
I (715) cpu_start: Pro cpu start user code
I (715) cpu_start: cpu freq: 240000000
I (715) cpu_start: Application information:
I (718) cpu_start: Project name:     example_wwe
I (723) cpu_start: App version:      v2.5-38-g57282dd-dirty
I (730) cpu_start: Compile time:     Jun 20 2023 12:40:15
I (736) cpu_start: ELF file SHA256:  9242db0898e34651...
I (742) cpu_start: ESP-IDF:          v4.4.4-448-gfd5e03b221-dirty
I (748) cpu_start: Min chip rev:     v0.0
I (753) cpu_start: Max chip rev:     v0.99
I (758) cpu_start: Chip rev:         v0.1
I (763) heap_init: Initializing. RAM available for dynamic allocation:
I (770) heap_init: At 3FCAB550 len 0003E1C0 (248 KiB): D/IRAM
I (776) heap_init: At 3FCE9710 len 00005724 (21 KiB): STACK/DIRAM
I (783) heap_init: At 600FE000 len 00002000 (8 KiB): RTCRAM
I (789) spiram: Adding pool of 8192K of external SPI memory to heap allocator
I (798) spi_flash: detected chip: gd
I (802) spi_flash: flash io: qio
W (805) spi_flash: Detected size(16384k) larger than the size in the binary image header(8192k). Using the size in the binary image header.
I (819) sleep: Configure to isolate all GPIO pins in sleep state
I (825) sleep: Enable automatic switching of GPIO sleep configuration
I (833) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I (853) spiram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (863) DRV8311: ES8311 in Slave mode
I (873) gpio: GPIO[48]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (883) ES7210: ES7210 in Slave mode
I (893) ES7210: Enable ES7210_INPUT_MIC1
I (893) ES7210: Enable ES7210_INPUT_MIC2
I (893) ES7210: Enable ES7210_INPUT_MIC3
W (903) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (903) ES7210: config fmt 60
W (913) AUDIO_BOARD: The board has already been initialized!

----------------------------- ESP Audio Platform -----------------------------
|                                                                            |
|                       ESP_AUDIO-v1.7.1-4717e99-42c05f5                     |
|                     Compile date: Aug 24 2022-06:32:45                     |
------------------------------------------------------------------------------
I (953) ESP32_S3_KORVO_2: I2S0, MCLK output by GPIO0
I (953) wwe_example: Func:setup_player, Line:144, MEM Total:8595107 Bytes, Inter:236891 Bytes, Dram:236891 Bytes

I (963) wwe_example: esp_audio instance is:0x3d801af4

E (973) I2S: register I2S object to platform failed
I (973) ESP32_S3_KORVO_2: I2S0, MCLK output by GPIO0
I (983) wwe_example: Recorder has been created
I (983) MODEL_LOADER:
Initializing models from SPIFFS, partition label: model

I (1243) MODEL_LOADER: Partition size: total: 3900791, used: 2490171

I (1453) AFE_SR: afe interface for speech recognition

I (1453) AFE_SR: AFE version: SR_V220727

I (1453) AFE_SR: Initial auido front-end, total channel: 3, mic num: 2, ref num: 1

I (1453) AFE_SR: aec_init: 1, se_init: 1, vad_init: 1

I (1463) AFE_SR: wakenet_init: 1

model_name: wn9_alexa model_data: /srmodel/wn9_alexa/wn9_data
, tigger:v3, mode:2, p:0, (Nov  3 2022 11:49:16).625_0.645
I (2333) AFE_SR: wake num: 3, mode: 0, (Nov  3 2022 11:49:26)

Quantized8 Multinet5: MN5Q8_v2_english_8_0.9_0.90, (Nov  3 2022 11:49:19)
esp_mn_commands_update_from_sdkconfig
I (5493) MN_COMMAND: ---------------------SPEECH COMMANDS---------------------
I (5503) MN_COMMAND: Command ID0, phrase ID0: TfL Mm c qbK
I (5513) MN_COMMAND: Command ID1, phrase ID1: Sgl c Sel
I (5513) MN_COMMAND: Command ID2, phrase ID2: PLd NoZ paNcL
I (5523) MN_COMMAND: Command ID3, phrase ID3: TkN nN Mi StNDBnKS
I (5533) MN_COMMAND: Command ID4, phrase ID4: TkN eF Mi StNDBnKS
I (5533) MN_COMMAND: Command ID5, phrase ID5: hicST VnLYoM
I (5543) MN_COMMAND: Command ID6, phrase ID6: LbcST VnLYoM
I (5543) MN_COMMAND: Command ID7, phrase ID7: gNKRmS jc VnLYoM
I (5553) MN_COMMAND: Command ID8, phrase ID8: DgKRmS jc VnLYoM
I (5563) MN_COMMAND: Command ID9, phrase ID9: TkN nN jc TmVm
I (5563) MN_COMMAND: Command ID10, phrase ID10: TkN eF jc TmVm
I (5573) MN_COMMAND: Command ID11, phrase ID11: MdK Mm c Tm
I (5583) MN_COMMAND: Command ID12, phrase ID12: MdK Mm c KnFm
I (5583) MN_COMMAND: Command ID13, phrase ID13: TkN nN jc LiT
I (5593) MN_COMMAND: Command ID14, phrase ID14: TkN eF jc LiT
I (5603) MN_COMMAND: Command ID15, phrase ID15: pdNq jc KcLk To RfD
I (5603) MN_COMMAND: Command ID16, phrase ID16: pdNq jc KcLk To GRmN
I (5613) MN_COMMAND: Command ID17, phrase ID17: TkN nN eL jc LiTS
I (5623) MN_COMMAND: Command ID18, phrase ID18: TkN eF eL jc LiTS
I (5623) MN_COMMAND: Command ID19, phrase ID19: TkN nN jc fR KcNDgscNk
I (5633) MN_COMMAND: Command ID20, phrase ID20: TkN eF jc fR KcNDgscNk
I (5643) MN_COMMAND: Command ID21, phrase ID21: SfT jc TfMPRcpk To SgKSTmN DgGRmZ
I (5653) MN_COMMAND: Command ID22, phrase ID22: SfT jc TfMPRcpk To SfVcNTmN DgGRmZ
I (5653) MN_COMMAND: Command ID23, phrase ID23: SfT jc TfMPRcpk To dTmN DgGRmZ
I (5663) MN_COMMAND: Command ID24, phrase ID24: SfT jc TfMPRcpk To NiNTmN DgGRmZ
I (5673) MN_COMMAND: Command ID25, phrase ID25: SfT jc TfMPRcpk To TWfNTm DgGRmZ
I (5683) MN_COMMAND: Command ID26, phrase ID26: SfT jc TfMPRcpk To TWfNTm WcN DgGRmZ
I (5693) MN_COMMAND: Command ID27, phrase ID27: SfT jc TfMPRcpk To TWfNTm To DgGRmZ
I (5693) MN_COMMAND: Command ID28, phrase ID28: SfT jc TfMPRcpk To TWfNTm vRm DgGRmZ
I (5703) MN_COMMAND: Command ID29, phrase ID29: SfT jc TfMPRcpk To TWfNTm FeR DgGRmZ
I (5713) MN_COMMAND: Command ID30, phrase ID30: SfT jc TfMPRcpk To TWfNTm FiV DgGRmZ
I (5723) MN_COMMAND: Command ID31, phrase ID31: SfT jc TfMPRcpk To TWfNTm SgKS DgGRmZ
I (5733) MN_COMMAND: Command ID32, phrase ID32: i haV c KWfSpcN
I (5733) MN_COMMAND: ---------------------------------------------------------

I (145433) wwe_example: rec_engine_cb - REC_EVENT_WAKEUP_START
I (146333) wwe_example: rec_engine_cb - REC_EVENT_VAD_START
W (146333) wwe_example: voice read begin
I (147213) wwe_example: rec_engine_cb - AUDIO_REC_COMMAND_DECT
W (147213) wwe_example: command 32
I (147213) RECORD_TO_SDCARD: [ 1 ] Mount sdcard
E (147213) gpio: gpio_install_isr_service(449): GPIO isr service already installed
I (147723) RECORD_TO_SDCARD: [ 2 ] Start codec chip
W (147723) AUDIO_BOARD: The board has already been initialized!
I (147733) RECORD_TO_SDCARD: [3.0] Create audio pipeline for recording
I (147733) RECORD_TO_SDCARD: [3.1] Create fatfs stream to write data to sdcard
I (147743) RECORD_TO_SDCARD: [3.2] Create i2s stream to read audio data from codec chip
E (147753) I2S: register I2S object to platform failed
I (147753) RECORD_TO_SDCARD: [3.3] Create audio encoder to handle data
WAV ENCODER
I (147763) RECORD_TO_SDCARD: [3.4] Register all elements to audio pipeline
I (147773) RECORD_TO_SDCARD: [3.5] Link it together [codec_chip]-->i2s_stream-->audio_encoder-->fatfs_stream-->[sdcard]
I (147783) RECORD_TO_SDCARD: [3.6] Set music info to fatfs
I (147783) RECORD_TO_SDCARD: [ * ] Save the recording info to the fatfs stream writer, sample_rates=44100, bits=16, ch=2
I (147803) RECORD_TO_SDCARD: [3.7] Set up  uri
I (147803) RECORD_TO_SDCARD: [ 4 ] Set up  event listener
I (147813) RECORD_TO_SDCARD: [4.1] Listening event from pipeline
I (147813) RECORD_TO_SDCARD: [4.2] Listening event from peripherals
I (147823) RECORD_TO_SDCARD: [ 5 ] Start audio_pipeline
I (147833) RECORD_TO_SDCARD: [ 6 ] Listen for all pipeline events, record for 10 Seconds
I (148873) RECORD_TO_SDCARD: [ * ] Recording ... 1
I (149873) RECORD_TO_SDCARD: [ * ] Recording ... 2
I (150873) RECORD_TO_SDCARD: [ * ] Recording ... 3
I (151873) RECORD_TO_SDCARD: [ * ] Recording ... 4
I (152873) RECORD_TO_SDCARD: [ * ] Recording ... 5
W (152973) AFE_SR: ERROR! rb_out slow!!!

W (153103) AFE_SR: ERROR! rb_out slow!!!

W (153233) AFE_SR: ERROR! rb_out slow!!!

W (153353) AFE_SR: ERROR! rb_out slow!!!

W (153483) AFE_SR: ERROR! rb_out slow!!!

W (153623) AFE_SR: ERROR! rb_out slow!!!

W (153753) AFE_SR: ERROR! rb_out slow!!!

I (153873) RECORD_TO_SDCARD: [ * ] Recording ... 6
W (153883) AFE_SR: ERROR! rb_out slow!!!

W (154003) AFE_SR: ERROR! rb_out slow!!!

W (154133) AFE_SR: ERROR! rb_out slow!!!

W (154263) AFE_SR: ERROR! rb_out slow!!!

W (154383) AFE_SR: ERROR! rb_out slow!!!

W (154513) AFE_SR: ERROR! rb_out slow!!!

W (154643) AFE_SR: ERROR! rb_out slow!!!

W (154773) AFE_SR: ERROR! rb_out slow!!!

I (154873) RECORD_TO_SDCARD: [ * ] Recording ... 7
W (154903) AFE_SR: ERROR! rb_out slow!!!

W (155023) AFE_SR: ERROR! rb_out slow!!!

W (155143) AFE_SR: ERROR! rb_out slow!!!

W (155273) AFE_SR: ERROR! rb_out slow!!!

W (155403) AFE_SR: ERROR! rb_out slow!!!

W (155533) AFE_SR: ERROR! rb_out slow!!!

W (155653) AFE_SR: ERROR! rb_out slow!!!

W (155783) AFE_SR: ERROR! rb_out slow!!!

I (155873) RECORD_TO_SDCARD: [ * ] Recording ... 8
W (155903) AFE_SR: ERROR! rb_out slow!!!

W (156033) AFE_SR: ERROR! rb_out slow!!!

W (156153) AFE_SR: ERROR! rb_out slow!!!

W (156283) AFE_SR: ERROR! rb_out slow!!!

W (156403) AFE_SR: ERROR! rb_out slow!!!

W (156533) AFE_SR: ERROR! rb_out slow!!!

W (156663) AFE_SR: ERROR! rb_out slow!!!

W (156783) AFE_SR: ERROR! rb_out slow!!!

I (156873) RECORD_TO_SDCARD: [ * ] Recording ... 9
W (156913) AFE_SR: ERROR! rb_out slow!!!

W (157033) AFE_SR: ERROR! rb_out slow!!!

W (157173) AFE_SR: ERROR! rb_out slow!!!

W (157293) AFE_SR: ERROR! rb_out slow!!!

W (157423) AFE_SR: ERROR! rb_out slow!!!

W (157553) AFE_SR: ERROR! rb_out slow!!!

W (157673) AFE_SR: ERROR! rb_out slow!!!

W (157803) AFE_SR: ERROR! rb_out slow!!!

I (157873) RECORD_TO_SDCARD: [ * ] Recording ... 10
W (157893) RECORD_TO_SDCARD: [ * ] Stop event received
I (157893) RECORD_TO_SDCARD: [ 7 ] Stop audio_pipeline
E (157893) AUDIO_ELEMENT: [wav] Element already stopped
E (157903) AUDIO_ELEMENT: [file] Element already stopped
W (157903) AUDIO_PIPELINE: There are no listener registered
W (157913) AUDIO_ELEMENT: [file] Element has not create when AUDIO_ELEMENT_TERMINATE
W (157923) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE
W (157933) AUDIO_ELEMENT: [wav] Element has not create when AUDIO_ELEMENT_TERMINATE
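
A quick sanity check that may help narrow this down: the log reports sample_rates=44100, bits=16, ch=2 for the WAV file and a 10 second recording, so the size of the saved file can be compared with what that format should produce. If the I2S peripheral was actually running with a different configuration (an assumption; the log does not confirm it), the sizes would not match and the header would describe the data incorrectly. The helpers below are hypothetical and not part of ESP-ADF; they only do the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not ESP-ADF API): PCM payload size in bytes that a
 * recording of `seconds` should have for the given format. */
static uint32_t expected_pcm_bytes(uint32_t sample_rate, uint32_t bits_per_sample,
                                   uint32_t channels, uint32_t seconds)
{
    assert(bits_per_sample % 8 == 0);
    uint32_t bytes_per_frame = (bits_per_sample / 8) * channels;
    return sample_rate * bytes_per_frame * seconds;
}

/* Hypothetical helper: sample rate implied by an observed payload size,
 * assuming the bit depth and channel count in the header are correct. */
static uint32_t implied_sample_rate(uint32_t data_bytes, uint32_t bits_per_sample,
                                    uint32_t channels, uint32_t seconds)
{
    assert(bits_per_sample % 8 == 0);
    uint32_t bytes_per_frame = (bits_per_sample / 8) * channels;
    assert(bytes_per_frame > 0 && seconds > 0);
    return data_bytes / (bytes_per_frame * seconds);
}
```

For the format in the log, `expected_pcm_bytes(44100, 16, 2, 10)` gives 1764000 bytes, plus a small WAV header (typically 44 bytes). A file that is much smaller or larger than that would suggest the data was not captured at the rate written to the header.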

Other Items If Possible

Any help would be appreciated!

jason-mao commented 10 months ago

Please check this similar issue: https://github.com/espressif/esp-adf/issues/870