espressif / esp-hosted

Hosted Solution (Linux/MCU) with ESP32 (Wi-Fi + BT + BLE)

Uninstalling esp32_spi.ko and reloading it, the host side did not receive any registration message for esp32. #436

Closed linchanghe123 closed 1 month ago

linchanghe123 commented 2 months ago

Hi mantriyogesh,

The first time the esp32_spi.ko driver is loaded, registration and creation of the WLAN device complete. But after unloading esp32_spi.ko and reloading it, the host side does not receive any registration message from the ESP32. What could be the cause?

[screenshots attached]

mantriyogesh commented 2 months ago

This is not an issue.

The SPI setup at the host is retained (by the Linux SPI driver). It wouldn't trigger the same log again, as it is done only once. However, you should see the bootup event every time the ESP boots up.

Also make sure you hook up the correct 'resetpin' to reliably reset the ESP on every reload of the kernel driver.

linchanghe123 commented 2 months ago

In actual testing, the SPI settings are not retained on the host.

After the esp32_spi driver is unloaded, the spi_exit function is called to remove wlan0. Reloading the esp32_spi driver resets the ESP32-C3 chip and should restart the registration and device creation process. Currently, it seems that we never get back to the device creation step.

[screenshots attached]

mantriyogesh commented 2 months ago

Textual logs from both sides would be helpful to understand the scenario. Also, please check that resetpin works fine.

Where is sudo insmod esp32_spi.ko resetpin=XX ?

[screenshot attached]

Please check porting guide.

Unless the expected GPIOs (Handshake, Data_Ready, Reset) are connected and tested, ESP32 communication cannot work.

linchanghe123 commented 2 months ago

The first time the esp32_spi driver is loaded, it functions normally. After reloading the driver, I looked at the driver code and found that the reset, handshake, and data ready pins are released and requested again. The ESP32-C3 logs also show that the module restarted.

mantriyogesh commented 2 months ago

resetpin is module param, to be passed.

How are you making sure the reset pin works?

mantriyogesh commented 2 months ago

also same with handshake and data ready pins.

Please check porting guide first.

linchanghe123 commented 2 months ago

resetpin is module param, to be passed.

How are you making sure the reset pin works?

We have defined a reset pin in the driver. [screenshot attached]

The serial output of the ESP32-C3 confirms that the module was reset and restarted.

linchanghe123 commented 2 months ago

also same with handshake and data ready pins.

Please check porting guide first.

Okay, I'll take a look at the guidance first.

mantriyogesh commented 2 months ago

Sure. Assigning it in the driver is also fine.

  1. With C3, if you comment out https://github.com/espressif/esp-hosted/blob/a00a99ba35a805a5166bac8aea6fd4741db6a4e5/esp_hosted_ng/host/spi/esp_spi.c#L598, does it work?

  2. Can you please send full textual logs:
    • ESP: minicom or idf.py monitor log from start
    • Host: dmesg log from system boot up

Anyway, many commonly faced scenarios are covered in porting guide, if you can have a look.

linchanghe123 commented 2 months ago

First load of the esp32_spi.ko driver:
host log: [screenshot attached]
ESP log: [screenshot attached]

rmmod esp32_spi:
host log: there are no new records
ESP log: [screenshot attached]

Reload of esp32_spi.ko:
host log: one additional record [screenshot attached]
ESP log: [screenshots attached]

mantriyogesh commented 2 months ago

The line / debug trace 'spi clock [30]' is expected to be produced only once. Please state your issue clearly. What you are reporting is not a problem; the driver is expected to behave that way.

Check code:

https://github.com/espressif/esp-hosted/blob/a00a99ba35a805a5166bac8aea6fd4741db6a4e5/esp_hosted_ng/host/spi/esp_spi.c#L446

Code is very simple to browse.

static int __init esp_init(void)
ret = esp_init_interface_layer(adapter, clockspeed); 

https://github.com/espressif/esp-hosted/blob/a00a99ba35a805a5166bac8aea6fd4741db6a4e5/esp_hosted_ng/host/spi/esp_spi.c#L603-L620

https://github.com/espressif/esp-hosted/blob/a00a99ba35a805a5166bac8aea6fd4741db6a4e5/esp_hosted_ng/host/spi/esp_spi.c#L529

https://github.com/espressif/esp-hosted/blob/a00a99ba35a805a5166bac8aea6fd4741db6a4e5/esp_hosted_ng/host/spi/esp_spi.c#L446-L448

This is expected once in driver loading.

Unless you provide the steps you run to load and unload the driver, with the associated textual logs attached from the start, we will not be able to debug further.

We will wait for:

  1. Clear issue description. Whether a log line is visible or not cannot, by itself, be an issue. Please state clearly what you are not able to achieve.
  2. Logs attached as text files. We are not able to check attached photos. We need full logs from bootup for both ESP and host.
    • ESP - minicom
    • Host - dmesg from bootup
    • steps followed
    • code changes used at both places
    • base commit used at both places

linchanghe123 commented 1 month ago

After uninstalling esp32_spi.ko, the bootup can be triggered again by uninstalling and reloading both the spidev and spidev drivers.

In addition, the spi_busnum_to_master function also needs to be modified in order to function properly.

Feedback on a situation where kernel versions such as V6.1 have removed spi_busnum_to_master function.

mantriyogesh commented 1 month ago

by uninstalling and reloading both the spidev and spidev drivers.

Can you please share exact steps that you do for workaround? I am still thinking why would this be needed. spidev should never be enabled on the bus-cs instance we intend to work on.

In addition, the spi_busnum_to_master function also needs to be modified in order to function properly.

What changes did you have to make? Can you please share them? Alternatively, you can submit a PR, which we can merge to master after testing on our side.

Feedback on a situation where kernel versions such as V6.1 have removed spi_busnum_to_master function.

We have already added this function for kernels that do not provide its definition, here: https://github.com/espressif/esp-hosted/blob/6ddb670edcf15108be92d7e8e4fc6f32542ef5e4/esp_hosted_ng/host/spi/esp_spi.c#L368-L390

#if (LINUX_VERSION_CODE >= KERNEL_VERSION(5, 16, 0))
...

    function definition for spi_busnum_to_master() {
     ...
    }
...
#endif

Do you experience any issues while using this function?
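For readers unfamiliar with this kind of guard, here is a minimal userspace sketch of how the version comparison evaluates. The macro body mirrors the packing used by `KERNEL_VERSION` in `<linux/version.h>` on recent kernels; the helper name `needs_local_busnum_to_master` is hypothetical and exists only for illustration, and the 5.16 threshold is taken from the snippet above, not verified against every kernel tree.

```c
#include <assert.h>

/* Same packing as the kernel's KERNEL_VERSION macro; recent kernels
 * clamp the sublevel to 255. Redefined here only so the sketch builds
 * in userspace. */
#define KERNEL_VERSION(a, b, c) \
    (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/* Hypothetical helper: returns 1 when a driver would compile in its own
 * spi_busnum_to_master(), using the 5.16 threshold quoted in the thread. */
static int needs_local_busnum_to_master(int major, int minor, int patch)
{
    assert(major >= 0 && minor >= 0 && patch >= 0);
    return KERNEL_VERSION(major, minor, patch) >= KERNEL_VERSION(5, 16, 0);
}
```

Under this check a 6.1 kernel takes the driver-local definition, while a 5.15 kernel would rely on the definition exported by the kernel itself.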

linchanghe123 commented 1 month ago

--- Edit by @mantriyogesh, Reason: formatting markdown for code --- --- Content is unchanged --

Can you please share exact steps that you do for workaround? I am still thinking why would this be needed. spidev should never be enabled on the bus-cs instance we intend to work on.

The main issue is that spi-stm32.ko needs to be reloaded. Upon retesting, it was found that this is indeed unrelated to spidev.ko. Here is the record of the operation:


/data0 # insmod spi-stm32.ko
/data0 # insmod esp32_spi.ko
/data0 # ifconfig -a
can0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
NOARP  MTU:16  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:10
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
Interrupt:58

eth0      Link encap:Ethernet  HWaddr CA:1B:D4:6A:ED:4E
inet addr:192.168.253.10  Bcast:192.168.253.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:1660 errors:0 dropped:0 overruns:0 frame:0
TX packets:2475 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:117532 (114.7 KiB)  TX bytes:286135 (279.4 KiB)
Interrupt:56 Base address:0x4000

lo        Link encap:Local Loopback
LOOPBACK  MTU:65536  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

wlan0     Link encap:Ethernet  HWaddr 80:65:99:97:92:A8
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

/data0 # rmmod esp32_spi.ko
/data0 # insmod esp32_spi.ko
/data0 # ifconfig -a
can0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
NOARP  MTU:16  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:10
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
Interrupt:58

eth0      Link encap:Ethernet  HWaddr CA:1B:D4:6A:ED:4E
inet addr:192.168.253.10  Bcast:192.168.253.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:1936 errors:0 dropped:0 overruns:0 frame:0
TX packets:2923 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:135946 (132.7 KiB)  TX bytes:337819 (329.9 KiB)
Interrupt:56 Base address:0x4000

lo        Link encap:Local Loopback
LOOPBACK  MTU:65536  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

/data0 # rmmod esp32_spi.ko
/data0 # rmmod spi-stm32.ko
/data0 # insmod spi-stm32.ko
/data0 # insmod esp32_spi.ko
/data0 # ifconfig -a
can0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
NOARP  MTU:16  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:10
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
Interrupt:58

eth0      Link encap:Ethernet  HWaddr CA:1B:D4:6A:ED:4E
inet addr:192.168.253.10  Bcast:192.168.253.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:2228 errors:0 dropped:0 overruns:0 frame:0
TX packets:3345 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:156280 (152.6 KiB)  TX bytes:386211 (377.1 KiB)
Interrupt:56 Base address:0x4000

lo        Link encap:Local Loopback
LOOPBACK  MTU:65536  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

wlan0     Link encap:Ethernet  HWaddr 80:65:99:97:92:A8
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

/data0 #


> Any changes you had to do? Can you please share those? Alternatively, you can also submit PR, which we can merge to master upon testing our side.

The kernel version we are currently using is **older than 5.16**.
The `spi_busnum_to_master` function needs a `put_device` call after the `get_device`; **otherwise, reloading the spi-stm32 driver and calling the `spi_get_gpio_desc` function will report an exception.**

```c
struct spi_controller *spi_busnum_to_master(u16 bus_num)
{
        struct device           *dev;
        struct spi_controller   *ctlr = NULL;

        dev = class_find_device(&spi_master_class, NULL, &bus_num,
                                __spi_controller_match);
        if (dev)
                ctlr = container_of(dev, struct spi_controller, dev);
        put_device(dev);      // This is the added code
        /* reference got in class_find_device */
        return ctlr;
}
```

mantriyogesh commented 1 month ago

Actually, something is not right, as we have not needed this on the multiple Linux devices we tested. On reload, the SPI instance is closed by the driver on removal, and a new instance is added on insert, followed by spi_init and spi_dev_init. So, as you correctly suspected/identified, if the older driver instance was not cleaned up, it would have implications for the next reload of the driver.
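The reference-counting concern in this exchange can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_device` and every `model_*` function are hypothetical stand-ins for `struct device`, `get_device()`, `put_device()`, and a lookup such as `class_find_device()`, which returns the device with an extra reference that the caller is expected to drop eventually. It shows why an unbalanced lookup reference can keep a device from being released when its owner unloads; it does not settle whether the `put_device()` belongs inside `spi_busnum_to_master` or in the caller's cleanup path.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct device; NOT the kernel type. */
struct model_device {
    int  refcount;   /* stand-in for the embedded kobject reference count */
    bool released;   /* set when the last reference is dropped */
};

static void model_init(struct model_device *dev)
{
    dev->refcount = 1;      /* the owning controller driver holds one reference */
    dev->released = false;
}

/* Models get_device(): take an extra reference. */
static struct model_device *model_get(struct model_device *dev)
{
    if (dev)
        dev->refcount++;
    return dev;
}

/* Models put_device(): drop a reference, release on the last one. */
static void model_put(struct model_device *dev)
{
    if (!dev)
        return;
    assert(dev->refcount > 0);
    if (--dev->refcount == 0)
        dev->released = true;
}

/* Models a lookup that hands back the device with an extra reference. */
static struct model_device *model_find(struct model_device *dev)
{
    return model_get(dev);
}

/* Look the device up, optionally drop the lookup reference, then let the
 * owner drop its own reference. Returns true if the device was released. */
static bool model_unload_releases(bool balance_lookup)
{
    struct model_device dev;
    struct model_device *found;

    model_init(&dev);
    found = model_find(&dev);
    if (balance_lookup)
        model_put(found);
    model_put(&dev);        /* owner drops its reference on unload */
    return dev.released;
}
```

In this model the device is released only when the lookup reference is balanced. Note that dropping the reference inside the lookup itself leaves the caller holding a pointer without a reference, so where the `put_device()` is placed remains a design question.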

Additionally, I compared your code with existing code of ESP-Hosted-NG, https://github.com/espressif/esp-hosted/blob/6ddb670edcf15108be92d7e8e4fc6f32542ef5e4/esp_hosted_ng/host/spi/esp_spi.c#L384-L413

I see, that https://github.com/espressif/esp-hosted/blob/6ddb670edcf15108be92d7e8e4fc6f32542ef5e4/esp_hosted_ng/host/spi/esp_spi.c#L408-L410

is different from your function. But I am not sure whether it is a real difference, or whether you intentionally removed it from what you posted for simplification. Anyway, I would request you to confirm this.

Again, we are very happy that you reported this issue, especially your experience on the STM board. We are still interested in getting to the bottom of this issue, if you have time for it.

Could you switch both sides to the master branch without changes (except porting-related changes), flash again, and check whether the scenario loads fine?