espressif / esp-hosted

Hosted Solution (Linux/MCU) with ESP32 (Wi-Fi + BT + BLE)
Other
658 stars 149 forks source link

NG sdio driver is not stable #362

Open hustzju opened 4 months ago

hustzju commented 4 months ago

We have successfully implemented the SDIO solution On my imx6ul board, and test is OK. but when produce first batch of circuit boards with new Wi-Fi mode, A new problem arises. esp32 driver could be loaded correctly,and connecting wifi works well,but When transmitting large amounts of data, SDIO wifi driver is not stable, sdio send data error and time out always occur,and then we receive the log "mmc1: card 0001 removed".

host log: host_log.txt esp32 log: esp32_output.log.txt please give me some advice, thanks.

mantriyogesh commented 4 months ago

It is very important that you use exact same code (git commit) at both ESP and host. Can you please confirm the git commit at both, ESP and Host?

hustzju commented 4 months ago

![Uploading git.jpg…]() Even though I download the GitHub asset ESP-Hosted-NG Release 1.0.2 again today, and don't modify any code. It is still the same problem..

mantriyogesh commented 4 months ago

The image attachment is failed, somehow.

In your https://github.com/espressif/esp-hosted/issues/362#issue-2233168094, the ESP log shows 1.0.1 and host log shows ESP was detected to be 1.0.2, which seems incorrect. Can you try both ESP and host on master?

mantriyogesh commented 4 months ago

In the earlier logs the system state looks not even reached to stable data path state

hustzju commented 4 months ago

Both 1.0.2,logs like this: host log: 102host.txt esp32 log: 102esp32.txt

hustzju commented 4 months ago

As we can see, it was normal at the beginning, then send data error, and sdio card removed

hustzju commented 4 months ago

In the earlier logs the system state looks not even reached to stable data path state

So what am I missing with porting to make system stable??

mantriyogesh commented 4 months ago

To seperate, there are three blocks in system,

  1. Transport (sdio)
  2. Wpa supplicant
  3. Data path

Is this a new yocto port you had been trying?

Transport related issues could be ruled out when running Transport test in tx and rx direction.

hustzju commented 4 months ago

I am trying iMX6UL yocto, and I think others have succeed to port on it in #137 . I run test process iperf3, Is there any difference between running Transport test and running it?

mantriyogesh commented 4 months ago

Yes iperf involves TCP/IP, wifi, and transport.

With raw throughout test above we would get early indication if there is any wrong before wifi & tcp-ip actually coming into picture.

Raw throughout also would let us know the max capability of transport as such (iperf max ceiling).

If this is new port on imx you might also would need to find the dependencies of wpa supplicant. Those recipies also would be needed to include. For example rfkill, etc.

I can dig which software we needed for imx8mm work on yocto, only if this is your new port.

If this is not a new port and it has already worked at least one time with all iperf rx and tx, we would need to see.this from different angle then.

Yocto is complicated system and getting softwares needed to be running is pre-requisite for ESP-Hosted to be running. But just in your case we also have imx8mm running fine. As it might be similar to imx6 we can just provide extra suggestions as we are able to, to get your yocto port up and running fast.

Also can you limit the sdio clock to say 5M? https://github.com/espressif/esp-hosted/blob/3c3fc1029b00ac46d98ab98495b7a817ddb5bc2d/esp_hosted_fg/host/linux/host_driver/esp32/sdio/esp_sdio.c#L744-L748

hustzju commented 4 months ago

I have tried to modify DTS, lower sdio clock to 5M, like this:

&usdhc2 {

address-cells = <1>;

#size-cells = <0>;                                                                                                                             
pinctrl-names = "default";                                                                                                          
pinctrl-0 = <&pinctrl_usdhc2_alt &pinctrl_wifi>;                                                                                                      
//non-removable;                                                                                                                    
keep-power-in-suspend;                                                                                                              
wakeup-source;        
no-1-8-v;
broken-cd;     
max-frequency = <5000000>;    
bus-width = <4>;                                                                                                     
status = "okay";     
esp32: esp32@1 {
reg = <1>;
compatible = "esp32_sdio";
gpios = <&gpio1 10 GPIO_ACTIVE_LOW>;
};                                                                                                               

};

Is it the same effect as modifying esp_sdio.c

mantriyogesh commented 4 months ago

It may work. You can confirm by running

sudo cat /sys/kernel/debug/mmc1/ios

With this change, raw throughout if works fine, we can move to higher layer.

On another note, I see the AP provided IP address, 192.168.5.4 But you are trying to connect to 192.168.1.16.

This is network address change (first one is 192.168.5.x other is 192.168.1.x). Can you please explain your network topology / routing used? I am trying to understand why the network addresses could be differing.

mantriyogesh commented 4 months ago

Also, is it possible to test with master at ESP and host?

@kapilkedawat , Does https://github.com/espressif/esp-idf/issues/6920 has any relation to

I (131936) FW_CMD: STA mic fail
hustzju commented 4 months ago

before I lower clock max-frequency to 5M, ios debug info : 5aa34486697e61227a2eb2214af0e8e after lower to 5M, clock debug into changed to 5M

hustzju commented 4 months ago

192.168.1.x, 192.168.5.x, They belong to the front and back level routers, ignore this.

hustzju commented 4 months ago

I have cross compiled the sdio driver with enable TEST_RAW_TP, but how to start raw test after loading drive by exec "insmod esp32_sdio.ko resetpin=44"?

mantriyogesh commented 4 months ago

on latest master, see https://github.com/espressif/esp-hosted/blob/3c3fc1029b00ac46d98ab98495b7a817ddb5bc2d/esp_hosted_ng/host/rpi_init.sh#L66

$RAW_TP_MODE would be 1 for host ---> ESP test $RAW_TP_MODE would be 2 for host <--- ESP test

hustzju commented 4 months ago

root@myd-y6ull14x14:~# insmod esp32_sdio.ko resetpin=44 raw_tp_mode=1 [ 31.270478] esp32_sdio: loading out-of-tree module taints kernel. root@myd-y6ull14x14:~# [ 32.860969] mmc1: queuing unknown CIS tuple 0x01 (3 bytes) [ 32.879513] mmc1: queuing unknown CIS tuple 0x1a (5 bytes) [ 32.889040] mmc1: queuing unknown CIS tuple 0x1b (8 bytes) [ 32.900116] mmc1: queuing unknown CIS tuple 0x80 (1 bytes) [ 32.905993] mmc1: queuing unknown CIS tuple 0x81 (1 bytes) [ 32.911664] mmc1: queuing unknown CIS tuple 0x82 (1 bytes) [ 32.923084] mmc1: queuing unknown CIS tuple 0x80 (1 bytes) [ 32.930165] mmc1: queuing unknown CIS tuple 0x81 (1 bytes) [ 32.937193] mmc1: queuing unknown CIS tuple 0x82 (1 bytes) [ 32.944260] mmc1: new SDIO card at address 0001 [ 32.960345] esp32_sdio: esp_probe: ESP network device detected [ 32.968825] esp32_sdio: get_firmware_data: Rx Pre ====== 0 [ 32.974565] esp32_sdio: get_firmware_data: Rx Pos ====== 0 [ 32.983565] esp32_sdio: get_firmware_data: Tx Pre ====== 0 [ 32.990973] esp32_sdio: get_firmware_data: Tx Pos ====== 10 [ 33.003967] esp32_sdio: probe of mmc1:0001:2 failed with error -22 [ 33.725320] esp32_sdio: process_esp_bootup_event: Received ESP bootup event [ 33.732368] esp32_sdio: process_event_esp_bootup: Bootup Event tag: 3 [ 33.743091] esp32_sdio: esp_validate_chipset: Chipset=ESP32 ID=00 detected over SDIO [ 33.751271] esp32_sdio: process_event_esp_bootup: Bootup Event tag: 0 [ 33.758036] esp32_sdio: process_event_esp_bootup: Bootup Event tag: 4 [ 33.764528] esp32_sdio: process_event_esp_bootup: Unsupported tag=4 in bootup event [ 33.772348] esp32_sdio: process_event_esp_bootup: Bootup Event tag: 1 [ 33.778931] esp32_sdio: process_fw_data: ESP chipset's last reset cause: [ 33.785809] esp32_sdio: print_reset_reason: POWERON_RESET [ 33.791255] esp32_sdio: check_esp_version: ESP Firmware version: 1.0.2 [ 33.818538] esp32_sdio: esp_reg_notifier: Driver init is ongoing [ 34.094904] esp32_sdio: init_bt: ESP Bluetooth init [ 34.106340] esp32_sdio: process_test_capabilities: start testing of Host->ESP raw throughput [ 35.045508] can-3v3: disabling [ 35.124915] esp32_sdio: log_raw_tp_stats_timer_cb: 0-1 sec 0 kbits/sec [ 35.124915] [ 36.164944] esp32_sdio: log_raw_tp_stats_timer_cb: 1-2 sec 0 kbits/sec [ 36.164944] [ 37.050608] esp32_sdio: tx_process: Failed to send data: -84 1536 1536 [ 37.085051] esp32_sdio: tx_process: Failed to send data: -84 1536 1536 [ 37.105071] esp32_sdio: tx_process: Failed to send data: -110 1536 1536 [ 37.111938] esp32_sdio: tx_process: Failed to send data: -110 1536 1536 [ 37.135167] esp32_sdio: tx_process: Failed to send data: -110 1536 1536 [ 37.155054] esp32_sdio: tx_process: Failed to send data: -110 1536 1536 [ 37.161922] esp32_sdio: tx_process: Failed to send data: -110 1536 1536 [ 37.185181] esp32_sdio: tx_process: Failed to send data: -110 1536 1536 [ 37.204959] esp32_sdio: log_raw_tp_stats_timer_cb: 2-3 sec 41735 kbits/sec [ 37.204959] [ 37.486033] mmc1: card 0001 removed [ 38.244851] esp32_sdio: log_raw_tp_stats_timer_cb: 3-4 sec 0 kbits/sec [ 38.244851] [ 39.124916] esp32_sdio: wait_and_decode_cmd_resp: Command[0x14] timed out [ 39.140475] esp32_sdio: cmd_init_raw_tp_task_timer: wait_and_decode_cmd_resp(priv, cmd_node) failure, ret: -22 [ 39.153221] esp32_sdio: esp_init_raw_tp: cmd_init_raw_tp_task_timer(adapter->priv[ESP_STA_NW_IF]) failure, ret: -22 [ 39.163945] esp32_sdio: print_capabilities: Capabilities: 0x1d. Features supported are: [ 39.172071] esp32_sdio: print_capabilities: WLAN on SDIO [ 39.177826] esp32_sdio: print_capabilities: BT/BLE [ 39.182977] esp32_sdio: print_capabilities: - HCI over SDIO [ 39.189635] esp32_sdio: print_capabilities: - BT/BLE dual mode [ 39.284849] esp32_sdio: log_raw_tp_stats_timer_cb: 4-5 sec 0 kbits/sec [ 39.284849] [ 40.324862] esp32_sdio: log_raw_tp_stats_timer_cb: 5-6 sec 0 kbits/sec [ 40.324862] [ 41.364861] esp32_sdio: log_raw_tp_stats_timer_cb: 6-7 sec 0 kbits/sec [ 41.364861] [ 42.404862] esp32_sdio: log_raw_tp_stats_timer_cb: 7-8 sec 0 kbits/sec [ 42.404862] [ 43.444863] esp32_sdio: log_raw_tp_stats_timer_cb: 8-9 sec 0 kbits/sec [ 43.444863] [ 44.484865] esp32_sdio: log_raw_tp_stats_timer_cb: 9-10 sec 0 kbits/sec [ 44.484865] [ 45.524863] esp32_sdio: log_raw_tp_stats_timer_cb: 10-11 sec 0 kbits/sec [ 45.524863] [ 46.564858] esp32_sdio: log_raw_tp_stats_timer_cb: 11-12 sec 0 kbits/sec [ 46.564858] [ 47.604863] esp32_sdio: log_raw_tp_stats_timer_cb: 12-13 sec 0 kbits/sec [ 47.604863] [ 48.644857] esp32_sdio: log_raw_tp_stats_timer_cb: 13-14 sec 0 kbits/sec [ 48.644857] [ 49.684858] esp32_sdio: log_raw_tp_stats_timer_cb: 14-15 sec 0 kbits/sec [ 49.684858] [ 50.724889] esp32_sdio: log_raw_tp_stats_timer_cb: 15-16 sec 0 kbits/sec [ 50.724889] [ 51.764876] esp32_sdio: log_raw_tp_stats_timer_cb: 16-17 sec 0 kbits/sec [ 51.764876] [ 52.804859] esp32_sdio: log_raw_tp_stats_timer_cb: 17-18 sec 0 kbits/sec [ 52.804859] [ 53.844871] esp32_sdio: log_raw_tp_stats_timer_cb: 18-19 sec 0 kbits/sec [ 53.844871] [ 54.884858] esp32_sdio: log_raw_tp_stats_timer_cb: 19-20 sec 0 kbits/sec [ 54.884858] [ 55.924863] esp32_sdio: log_raw_tp_stats_timer_cb: 20-21 sec 0 kbits/sec [ 55.924863] [ 56.964876] esp32_sdio: log_raw_tp_stats_timer_cb: 21-22 sec 0 kbits/sec [ 56.964876] [ 58.004864] esp32_sdio: log_raw_tp_stats_timer_cb: 22-23 sec 0 kbits/sec [ 58.004864] [ 59.044856] esp32_sdio: log_raw_tp_stats_timer_cb: 23-24 sec 0 kbits/sec [ 59.044856] [ 60.084860] esp32_sdio: log_raw_tp_stats_timer_cb: 24-25 sec 0 kbits/sec [ 60.084860] [ 61.124864] esp32_sdio: log_raw_tp_stats_timer_cb: 25-26 sec 0 kbits/sec [ 61.124864] [ 62.164856] esp32_sdio: log_raw_tp_stats_timer_cb: 26-27 sec 0 kbits/sec [ 62.164856] [ 63.204858] esp32_sdio: log_raw_tp_stats_timer_cb: 27-28 sec 0 kbits/sec [ 63.204858] [ 64.244866] esp32_sdio: log_raw_tp_stats_timer_cb: 28-29 sec 0 kbits/sec

hustzju commented 4 months ago

rst:0x1 (POWERON_RESET),boot:0x3f (SPI_FAST_FLASH_BOOT) configsip: 0, SPIWP:0xee clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00 mode:DIO, clock div:2 load:0x3fff0030,len:6680 load:0x40078000,len:14848 load:0x40080400,len:3792 entry 0x40080694 I (27) boot: ESP-IDF v4.4.1-584-g91dc99a3ce-dirty 2nd stage bootloader I (27) boot: compile time 12:09:04 I (28) boot: chip revision: 1 I (32) boot_comm: chip revision: 1, min. bootloader chip revision: 0 I (39) boot.esp32: SPI Speed : 40MHz I (43) boot.esp32: SPI Mode : DIO I (48) boot.esp32: SPI Flash Size : 4MB I (52) boot: Enabling RNG early entropy source... I (58) boot: Partition Table: I (61) boot: ## Label Usage Type ST Offset Length I (69) boot: 0 nvs WiFi data 01 02 00009000 00004000 I (76) boot: 1 otadata OTA data 01 00 0000d000 00002000 I (84) boot: 2 phy_init RF data 01 01 0000f000 00001000 I (91) boot: 3 factory factory app 00 00 00010000 00100000 I (99) boot: 4 ota_0 OTA app 00 10 00110000 00100000 I (106) boot: 5 ota_1 OTA app 00 11 00210000 00100000 I (114) boot: End of partition table I (118) boot: Defaulting to factory image I (123) boot_comm: chip revision: 1, min. application chip revision: 0 I (130) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=17950h ( 96592) map I (173) esp_image: segment 1: paddr=00027978 vaddr=3ffbdb60 size=05574h ( 21876) load I (182) esp_image: segment 2: paddr=0002cef4 vaddr=40080000 size=03124h ( 12580) load I (188) esp_image: segment 3: paddr=00030020 vaddr=400d0020 size=88d8ch (560524) map I (393) esp_image: segment 4: paddr=000b8db4 vaddr=40083124 size=1a43ch (107580) load I (437) esp_image: segment 5: paddr=000d31f8 vaddr=50000000 size=00010h ( 16) load I (452) boot: Loaded app from partition at offset 0x10000 I (452) boot: Disabling RNG early entropy source... I (464) cpu_start: Pro cpu up. I (464) cpu_start: Starting app cpu, entry point is 0x40081288 I (0) cpu_start: App cpu up. I (480) cpu_start: Pro cpu start user code I (480) cpu_start: cpu freq: 240000000 I (480) cpu_start: Application information: I (484) cpu_start: Project name: network_adapter I (490) cpu_start: App version: 2027593 I (495) cpu_start: Compile time: Dec 19 2022 12:08:55 I (501) cpu_start: ELF file SHA256: 76d958924cb99ea3... I (507) cpu_start: ESP-IDF: v4.4.1-584-g91dc99a3ce-dirty I (514) heap_init: Initializing. RAM available for dynamic allocation: I (521) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM I (527) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM I (533) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM I (539) heap_init: At 3FFD1A40 len 0000E5C0 (57 KiB): DRAM I (546) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM I (552) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM I (558) heap_init: At 4009D560 len 00002AA0 (10 KiB): IRAM I (566) spi_flash: detected chip: generic I (569) spi_flash: flash io: dio I (574) cpu_start: Starting scheduler on PRO CPU. I (0) cpu_start: Starting scheduler on APP CPU. I (588) stats: I (594) stats: ESP-Hosted Firmware version :: 1.0.2
I (602) stats: Transport used :: SDIO only
I (608) stats:
I (614) FW_MAIN: Supported features are: I (618) FW_MAIN: - WLAN over SDIO I (622) FW_BT: - BT/BLE I (624) FW_BT: - HCI Over SDIO I (626) FW_BT: - BT/BLE dual mode I (630) FW_MAIN: Capabilities: 0x1d I (644) wifi:wifi driver task: 3ffd4bc4, prio:23, stack:6656, core=1 I (644) system_api: Base MAC address is not set I (644) system_api: read default base MAC address from EFUSE I (650) wifi:wifi firmware version: 9f5e450 I (652) wifi:wifi certification version: v7.0 I (656) wifi:config NVS flash: disabled I (658) wifi:config nano formating: disabled I (662) wifi:Init data frame dynamic rx buffer num: 32 I (666) wifi:Init management frame dynamic rx buffer num: 32 I (670) wifi:Init management short buffer num: 32 I (674) wifi:Init dynamic tx buffer num: 64 I (676) wifi:Init static rx buffer size: 1600 I (680) wifi:Init static rx buffer num: 16 I (682) wifi:Init dynamic rx buffer num: 32 I (686) wifi_init: rx ba win: 6 I (690) wifi_init: tcpip mbox: 32 I (692) wifi_init: udp mbox: 6 I (696) wifi_init: tcp mbox: 6 I (698) wifi_init: tcp tx win: 5744 I (702) wifi_init: tcp rx win: 5744 I (704) wifi_init: tcp mss: 1440 I (708) wifi_init: WiFi IRAM OP enabled I (712) wifi_init: WiFi RX IRAM OP enabled I (716) BTDM_INIT: BT controller compile version [425d342] I (724) phy_init: phy_version 4670,719f9f6,Feb 18 2021,17:07:07 I (1048) FW_MAIN: ESP Bluetooth MAC addr: 94-b5-55- b-c2-92

I (1050) FW_SDIO_SLAVE: Using SDIO interface I (1050) gpio: GPIO[5]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 I (1486) FW_MAIN: Start Data Path I (2058) FW_SDIO_SLAVE: last reset cause: 1x I (2060) FW_MAIN: Initial set up done I (2164) FW_MAIN: INIT Interface command

I (2166) wifi:mode : sta (94:b5:55:0b:c2:90) I (2168) wifi:enable tsf I (2168) FW_CMD: Wifi Sta mode set

I (2188) FW_MAIN: Get MAC command

I (2406) FW_MAIN: Unsupported cmd[0x14] received

hustzju commented 4 months ago

image after lower to 5M, test is running, But I'm not sure if it can always maintain stability.

hustzju commented 4 months ago

image By the way, the only code I have modified is this sentence. if not, esp32 will not enabled. I don't think it is a matter?

mantriyogesh commented 4 months ago

https://github.com/espressif/esp-hosted/issues/362#issuecomment-2048891194

This is okay as long as ESP is reset consistently

hustzju commented 4 months ago

Raw transport test at 50M HZ ,it always run error. By lower the clock, it may be OK, but when running iperf test, it also go wrong. Is this problem related to sdio communication ?

mantriyogesh commented 4 months ago

https://github.com/espressif/esp-hosted/issues/362#issuecomment-2048888489

This looks to be fine. Is the ESP log okay? Longer duration should also should just work.

hustzju commented 4 months ago

https://github.com/espressif/esp-hosted/issues/362#issuecomment-2048877648 https://github.com/espressif/esp-hosted/issues/362#issuecomment-2048878060 please watch the host log at 50M, it goes wrong, esp log seems to be good.

mantriyogesh commented 4 months ago
  1. Have you followed Portting_guide? Check full doc, wrt SDIO. Also make a note of pull-ups requirement, already mentioned in porting guide. Please check if you have 'Burn the flash voltage selection eFuses' in this doc.

  2. By any means are you using jumper cables for the SDIO connection?

hustzju commented 4 months ago

yes, I have followed guide, and I use PCB, not jumper cables. image

mantriyogesh commented 4 months ago

Great! Thanks for the picture.

  1. Pull-ups in place? What is the value?

  2. Also Burnt the efuse?

hustzju commented 4 months ago
  1. we have pull-ups, 10k
  2. burned efuse our esp32 board is pin insert type, like this: image
mantriyogesh commented 4 months ago

Hello @hustzju ,

So does the lower SDIO clock frequency, like 5M, works fine for iperf3 etc (basically, iperf-ttcp-wifi-transport) in rx and tx?

If no, stop here and let us know that first!

Only if 5M works without issues,

Can you please try below and get both ESP and Host logs with higher frequency

  1. Can you please remove three lines from: https://github.com/espressif/esp-hosted/blob/3c3fc1029b00ac46d98ab98495b7a817ddb5bc2d/esp_hosted_ng/esp/esp_driver/network_adapter/main/sdio_slave_api.c#L128-L130 \ This will default the speed flag to 0, which is basically, SDIO_SLAVE_FLAG_HIGH_SPEED

  2. Change the SDIO timing, https://github.com/espressif/esp-hosted/blob/3c3fc1029b00ac46d98ab98495b7a817ddb5bc2d/esp_hosted_ng/esp/esp_driver/network_adapter/main/sdio_slave_api.c#L124-L126 to

    ///// #if defined(CONFIG_IDF_TARGET_ESP32C6)
        .timing             = SDIO_SLAVE_TIMING_NSEND_PSAMPLE,
    ///// #endif
hustzju commented 4 months ago

It also goes wrong at 5M clock when running iperf3, only run longer then 50M. 50M go wrong immediately. and I have tried to uncomment "config.flags |= SDIO_SLAVE_FLAGDEFAULT" and change SDIO timing once, don't work well.

mantriyogesh commented 4 months ago

I have tried to uncomment "config.flags |= SDIO_SLAVE_FLAG_DEFAULT"

I assume, you meant 'comment' instead of 'uncomment' ?

Can you please make sure if SDIO GPIOs are not getting re-used by any other peripherals in your system? Also, all GPIOs have correct voltages populated?

Worst case, can you please try 1-bit sdio mode? You might need to see 'bus_width' to 1 in device tree (if your SoC supports device tree config)

mantriyogesh commented 4 months ago

It also goes wrong at 5M clock when running iperf3

What is the data rate you see when 5M iperf3 is run with TCP? If your raw throughput was around 14Mbps, it could reach 8-12Mbps? Does iperf stays for some ttime or immediately after 1-2 packets goes off?

I am trying to understand if & why sdio lines are behaving unstable.

You can also enable the SDIO checksum at both sides, to check the validity of the packets.

mantriyogesh commented 1 month ago

any updates?