espressif / esp-adf

Espressif Audio Development Framework
Other
1.54k stars 676 forks source link

pipeline_raw_http example silence with ESP32-LyraT-Mini (AUD-4932) #1089

Closed patrickkeenan closed 2 months ago

patrickkeenan commented 12 months ago

Environment

Problem Description

Building the pipeline_raw_http example doesn't work. Things I've tried

  1. Applying the patch for xtask
  2. Setting the memory variables SPIRAM_BOOT_INIT and the other one mentioned
  3. Trying a different board (I have 2)
  4. Trying with and without headphones

Perhaps this is an issue with the out dated I2S library, but it's quite hard to know. I would assume these examples should work with the latest builds.

Expected Behavior

The wav file should have audio in it.

Actual Behavior

Python server receives the http request and its the right length, but its just silence.

Steps to Reproduce

  1. Get the same IDF and ADF versions
  2. Use the Lyra T Mini 1.2
  3. Try to build the pipeline_raw_http example

Debug Logs

=a60e0h (680160) map
ets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:7100
ho 0 tail 12 room 4
load:0x40078000,len:15604
load:0x40080400,len:4
load:0x40080404,len:3876
entry 0x4008064c
I (31) boot: ESP-IDF v5.1.1-439-gcb174b0fe1-dirty 2nd stage bootloader
I (31) boot: compile time Oct  9 2023 00:46:34
I (33) boot: Multicore bootloader
I (37) boot: chip revision: v3.0
I (41) boot.esp32: SPI Speed      : 80MHz
I (46) boot.esp32: SPI Mode       : DIO
I (50) boot.esp32: SPI Flash Size : 4MB
I (55) boot: Enabling RNG early entropy source...
I (60) boot: Partition Table:
I (64) boot: ## Label            Usage          Type ST Offset   Length
I (71) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (79) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (86) boot:  2 factory          factory app      00 00 00010000 00200000
I (94) boot: End of partition table
I (98) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=2c230h (180784) map
I (161) esp_image: segment 1: paddr=0003c258 vaddr=3ffb0000 size=0379ch ( 14236) load
I (166) esp_image: segment 2: paddr=0003f9fc vaddr=40080000 size=0061ch (  1564) load
I (167) esp_image: segment 3: paddr=00040020 vaddr=400d0020 size=a60e0h (680160) map
I (380) esp_image: segment 4: paddr=000e6108 vaddr=4008061c size=18968h (100712) load
I (428) boot: Loaded app from partition at offset 0x10000
I (428) boot: Disabling RNG early entropy source...
I (440) cpu_start: Multicore app
I (440) quad_psram: This chip is ESP32-D0WD
I (440) esp_psram: Found 8MB PSRAM device
I (442) esp_psram: Speed: 40MHz
I (446) esp_psram: PSRAM initialized, cache is in low/high (2-core) mode.
W (453) esp_psram: Virtual address not enough for PSRAM, map as much as we can. 4MB is mapped
I (463) cpu_start: Pro cpu up.
I (466) cpu_start: Starting app cpu, entry point is 0x400815fc
I (0) cpu_start: App cpu up.
I (1351) esp_psram: SPI SRAM memory test OK
I (1359) cpu_start: Pro cpu start user code
I (1359) cpu_start: cpu freq: 160000000 Hz
I (1359) cpu_start: Application information:
I (1362) cpu_start: Project name:     record_raw_http
I (1368) cpu_start: App version:      v2.6-23-gef058da
I (1373) cpu_start: Compile time:     Oct  9 2023 00:48:50
I (1380) cpu_start: ELF file SHA256:  4bc15b83f25328e6...
I (1386) cpu_start: ESP-IDF:          v5.1.1-439-gcb174b0fe1-dirty
I (1393) cpu_start: Min chip rev:     v0.0
I (1397) cpu_start: Max chip rev:     v3.99 
I (1402) cpu_start: Chip rev:         v3.0
I (1407) heap_init: Initializing. RAM available for dynamic allocation:
I (1414) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (1421) heap_init: At 3FFB8128 len 00027ED8 (159 KiB): DRAM
I (1427) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (1433) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1440) heap_init: At 40098F84 len 0000707C (28 KiB): IRAM
I (1446) esp_psram: Adding pool of 4096K of PSRAM memory to heap allocator
I (1455) spi_flash: detected chip: generic
I (1458) spi_flash: flash io: dio
W (1462) spi_flash: Detected size(8192k) larger than the size in the binary image header(4096k). Using the size in the binary image header.
W (1476) i2s(legacy): legacy i2s driver is deprecated, please migrate to use driver/i2s_std.h, driver/i2s_pdm.h or driver/i2s_tdm.h
W (1488) ADC: legacy driver is deprecated, please migrate to `esp_adc/adc_oneshot.h`
I (1497) app_start: Starting scheduler on CPU0
I (1502) app_start: Starting scheduler on CPU1
I (1502) main_task: Started on CPU0
I (1512) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (1512) main_task: Calling app_main()
I (1562) REC_RAW_HTTP: [ 1 ] Initialize Button Peripheral & Connect to wifi network
W (1962) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (3462) REC_RAW_HTTP: [ 2 ] Start codec chip
W (3492) I2C_BUS: I2C bus has been already created, [port:0]
I (3502) REC_RAW_HTTP: [3.0] Create audio pipeline for recording
I (3502) REC_RAW_HTTP: [3.1] Create http stream to post data to server
I (3502) REC_RAW_HTTP: [3.2] Create i2s stream to read audio data from codec chip
I (3512) REC_RAW_HTTP: [3.3] Register all elements to audio pipeline
I (3512) REC_RAW_HTTP: [3.4] Link it together [codec_chip]-->i2s_stream->http_stream-->[http_server]
I (3532) REC_RAW_HTTP: [ 4 ] Press [Rec] button to record, Press [Mode] to exit
E (8102) REC_RAW_HTTP: [ * ] [Rec] input key event, resuming pipeline ...
W (8102) AUDIO_PIPELINE: Without stop, st:1
W (8102) AUDIO_PIPELINE: Without wait stop, st:1
W (8102) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE
W (8112) AUDIO_ELEMENT: [http] Element has not create when AUDIO_ELEMENT_TERMINATE
I (8122) REC_RAW_HTTP: [ + ] HTTP client HTTP_STREAM_PRE_REQUEST, lenght=0

Total bytes written: 4096

Total bytes written: 8192

Total bytes written: 12288

Total bytes written: 16384

Total bytes written: 20480

Total bytes written: 24576

Total bytes written: 28672

Total bytes written: 32768

Total bytes written: 36864

Total bytes written: 40960

Total bytes written: 45056

Total bytes written: 49152

Total bytes written: 53248
E (9802) REC_RAW_HTTP: [ * ] [Rec] key released, stop pipeline ...

Total bytes written: 54000
I (9802) REC_RAW_HTTP: [ + ] HTTP client HTTP_STREAM_POST_REQUEST, write end chunked marker
I (9832) REC_RAW_HTTP: [ + ] HTTP client HTTP_STREAM_FINISH_REQUEST
W (9832) HTTP_CLIENT: esp_transport_read returned:-1 and errno:128 
I (9832) REC_RAW_HTTP: Got HTTP Response = File 20231009T044938Z_16000_16_1.wav was written, size 54000
E (20602) REC_RAW_HTTP: [ * ] [Rec] input key event, resuming pipeline ...
W (20602) AUDIO_ELEMENT: [i2s] Element already stopped
W (20602) AUDIO_ELEMENT: [http] Element already stopped
I (20612) REC_RAW_HTTP: [ + ] HTTP client HTTP_STREAM_PRE_REQUEST, lenght=0

Total bytes written: 4096

Total bytes written: 8192

Total bytes written: 12288

Total bytes written: 16384

Total bytes written: 20480

Total bytes written: 24576

Total bytes written: 28672

Total bytes written: 32768

Total bytes written: 36864

Total bytes written: 40960

Total bytes written: 45056

Total bytes written: 49152

Total bytes written: 53248

Total bytes written: 57344
E (22582) REC_RAW_HTTP: [ * ] [Rec] key released, stop pipeline ...

Total bytes written: 61200

Total bytes written: 64800
I (22642) REC_RAW_HTTP: [ + ] HTTP client HTTP_STREAM_POST_REQUEST, write end chunked marker
I (22662) REC_RAW_HTTP: [ + ] HTTP client HTTP_STREAM_FINISH_REQUEST
W (22662) HTTP_CLIENT: esp_transport_read returned:-1 and errno:128 
I (22662) REC_RAW_HTTP: Got HTTP Response = File 20231009T044951Z_16000_16_1.wav was written, size 64800
E (78282) REC_R
shootao commented 12 months ago

HI @patrickkeenan Based on the logs you provided, it appears that the issue is caused by the socket not being connected (errno is 128). To use this example, you need to ensure two things:

  1. You need to run server.py first.
  2. Ensure that the client device and the server are on the same local network.
patrickkeenan commented 12 months ago

Yes, I've ensured both of those things, also, the file is written on the server, so its connecting. Here are the server logs:

Serving HTTP on 192.168.86.22 port 8000 Do Post...... Audio information, sample rates: 16000, bits: 16, channel(s): 1 Total bytes received: 82800 192.168.86.59 - - [09/Oct/2023 09:46:07] "POST /upload HTTP/1.1" 200 -

shootao commented 11 months ago

Hi @patrickkeenan I'm using the same ADF and IDF for testing, and indeed, there is a “_HTTP_CLIENT: esp_transportread returned:-1 and errno:128” log out in the IDF. The correct approach in IDF should be as follows, but this is just an erroneous log. The server-side does have the correct audio recording data.

diff --git a/components/esp_http_client/esp_http_client.c b/components/esp_http_client/esp_http_client.c
index 59273915ec..b4e8a74c96 100644
--- a/components/esp_http_client/esp_http_client.c
+++ b/components/esp_http_client/esp_http_client.c
@@ -1160,8 +1160,9 @@ int esp_http_client_read(esp_http_client_handle_t client, char *buffer, int len)
                     http_parser_execute(client->parser, client->parser_settings, res_buffer->data, 0);
                     /* ...and lowering the message severity, as closed connection from server side is expected in chunked transport */
                     sev = ESP_LOG_DEBUG;
+                } else if (rlen != ERR_TCP_TRANSPORT_CONNECTION_CLOSED_BY_FIN) {
+                    ESP_LOG_LEVEL(sev, TAG, "esp_transport_read returned:%d and errno:%d ", rlen, errno);
                 }
-                ESP_LOG_LEVEL(sev, TAG, "esp_transport_read returned:%d and errno:%d ", rlen, errno);
             }

             if (rlen == ERR_TCP_TRANSPORT_CONNECTION_TIMEOUT) {
PrathamG commented 8 months ago

Hi! I'm having the same issue with the same board. I'm not sure but it might be something to do with the board itself? Even the google speech example returns an empty response, so it could be that the i2s_read is simply not sending proper audio data for LyraT-Mini-1.2

EDIT: I'm getting one more line of error when running the example.

E (29102) transport_base: poll_read select error 104, errno = Connection reset by peer, fd = 54 E (29102) HTTP_CLIENT: transport_read: error - 57347 | ERROR I (29112) REC_RAW_HTTP: Got HTTP Response = File 20240202T090357Z_16000_16_1.wav was written, size 126000

What does this error mean? I'm not able to find any documentation for it anywhere. Thanks

PrathamG commented 8 months ago

Okay I was able to solve the issue for the google cloud speech example. I'm fairly certain it's the same issue in this case too. Has to do with LyraT-mini's I2S configuration being different.

See here (https://github.com/espressif/esp-adf/issues/934#issuecomment-1923532342)