Open zachmorr opened 2 months ago
Additional information: connecting MOSI to MISO and running spidev_test works as expected at all speeds, so it doesn't seem to be a hardware or driver issue.
Can you please attach the captures?
Renamed from .sal to .txt to get past the GitHub filter.
I will check on the system tomorrow and share my observations.
In both logs, the MISO looks just fine (for the first transaction). I am not sure why the host did not interpret MISO correctly in the 5 MHz case.
What was the SPI mode in the 5 MHz case? From the sal dumps, it looks like mode 2 was used in both cases.
In both Saleae SPI analyser captures, the bootup event matched exactly.
Once you receive the buffer, what is the interpreted data if you enable these prints?
Host TX: at https://github.com/espressif/esp-hosted/blob/a9d210c2a32683cd577dcc947f66b16ef89e012d/esp_hosted_ng/host/spi/esp_spi.c#L323 change the call to esp_hex_dump("tx: ", trans.tx_buf, 32);
Host RX: at https://github.com/espressif/esp-hosted/blob/a9d210c2a32683cd577dcc947f66b16ef89e012d/esp_hosted_ng/host/spi/esp_spi.c#L351 change the call to esp_hex_dump("Rx: ", trans.rx_buf, 32);
This issue is very weird, can you please test it independently with ESP-IDF once?
Also, do you have any other esp chipset with you? For example, ESP32-C2/C3/C6/S3?
It would also be interesting to know whether the host works reliably at higher frequencies.
Do you have any local code changes on either side?
Here are the logs with the hexdump added 10mhz_hexdump.txt 5mhz_hexdump.txt
It looks like the bits in the 5mhz logs are shifted left by 1... super weird.
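One way to confirm a one-bit shift like this from the hexdumps is to shift the expected bytes left by one bit and compare them with what the host received. The following is only a minimal sketch: the helper names and the sample bytes are hypothetical and are not taken from the attached logs.

```python
def shift_left_one_bit(data: bytes) -> bytes:
    """Shift a whole byte string left by one bit (MSB first), dropping the top bit."""
    if not data:
        return b""
    value = int.from_bytes(data, "big")
    mask = (1 << (8 * len(data))) - 1
    return ((value << 1) & mask).to_bytes(len(data), "big")


def looks_shifted(expected: bytes, received: bytes) -> bool:
    """True if `received` matches `expected` shifted left by one bit.

    The lowest bit of the last byte is unknown after the shift, so it is ignored.
    """
    if len(expected) != len(received) or not expected:
        return False
    shifted = shift_left_one_bit(expected)
    return (shifted[:-1] == received[:-1]
            and (shifted[-1] & 0xFE) == (received[-1] & 0xFE))


# Hypothetical sample bytes, for illustration only.
expected = bytes([0x12, 0x34])
received = bytes([0x24, 0x69])
print(looks_shifted(expected, received))
```

If the comparison holds over a whole 32-byte dump, that points towards a sampling-edge or timing problem rather than random corruption.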
I worked with the Allwinner t507 recently, and luckily I still had the hardware. I found some issues that occur only with the Allwinner t507.
But I had used ESP-Hosted-FG for it. Anyway, at SPI level both FG and NG are just the same. If you could test with it, here is reference code:
It barely works at 10MHz; there are too many signal integrity issues and I can't make the data/clk wires shorter.
There is a module parameter to set the clock rate. You should also be able to change the clock speed after the module is loaded.
@mantriyogesh is there a commit that contains the changes required to get the T507 working? I can try to get the FG version working but that might take a while.
@Mr-Bossman yes that is what I'm using to switch between 10 and 5 MHz. How do you change the speed after the module has already been loaded?
I'm building with Buildroot, here are the patches I'm applying. I wouldn't expect any of these to cause issues
They look good; that's how you change the pins. Are you sure you are using the right SPI number and CS pin?
@Mr-Bossman yes that is what I'm using to switch between 10 and 5 MHz.
Why not try 1MHz?
How do you change the speed after the module has already been loaded?
I don't remember exactly but I used sysfs
@zachmorr I provided full code at https://github.com/espressif/esp-hosted/issues/424#issuecomment-2220290441
I think it is better not to integrate everything at once. Instead, first just check whether basic FG operations work, with the SPI bus, CS, mode, handshake, dataready and resetpin changed in the host kernel driver to match your current NG configuration.
The code includes both the ESP and host sides, and was specifically tested with the t507.
The major changes on the slave side are in spi_slave_api.c.
Once you have run a basic ping test on FG, you will have confidence in the setup and can also compare this file and merge manually.
Attaching the link for complete code both ESP and host side: esp_hosted_umltech_2024.07.09_ota_fix.tgz
@mantriyogesh I decided to try some experiments before giving hosted_fg a try and I think I made some progress. I found this which mentions issues with SPI & DMA on ESP32 when in mode 2, so I changed everything to mode 3 and was able to get farther in the boot process at 5 and 1MHz. Here are the logs:
I'd be interested to find out if this issue also occurs on the RPI or if it's specific to my chip.
Update: I got an ESP32C3 and was able to lower the clock speed to 1MHz in mode 2 without issue (ignoring the BT error) esp32c3_1mhz_log.txt
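For reference on the mode 2 versus mode 3 change discussed here: the SPI mode number encodes the clock polarity (CPOL) and clock phase (CPHA). The sketch below only decodes the standard mode numbering; the function name is made up for illustration and it is not part of the esp-hosted code.

```python
def spi_mode_bits(mode):
    """Return CPOL, CPHA and the clock edge on which data is sampled."""
    if mode not in (0, 1, 2, 3):
        raise ValueError("SPI mode must be 0..3")
    cpol = (mode >> 1) & 1  # idle level of the clock line
    cpha = mode & 1         # 0: sample on leading edge, 1: sample on trailing edge
    # With CPOL=0 the leading edge is rising; with CPOL=1 it is falling.
    sample_edge = "rising" if cpol == cpha else "falling"
    return {"cpol": cpol, "cpha": cpha, "sample_edge": sample_edge}


for m in (2, 3):
    print(m, spi_mode_bits(m))
```

Modes 2 and 3 both idle the clock high but sample on opposite edges, so a host and slave that disagree on the mode can end up reading data half a clock period apart.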
I'd be interested to find out if this issue also occurs on the RPI or if it's specific to my chip.
The driver is tested daily on Raspberry Pi and works fine. If you have a Raspberry Pi, you can also evaluate it there. I still see timeout errors in your log. Instead of running the normal scenario, test the transport alone, using the raw throughput test in rx and tx.
@mantriyogesh I decided to try some experiments before giving hosted_fg a try and I think I made some progress. I found this which mentions issues with SPI & DMA on ESP32 when in mode 2
We really appreciate the debugging effort you are putting in. Honestly, we know about this. It would be worth checking your timing, but I doubt that it is the problem, because in that case it wouldn't have worked at 10MHz either.
I think it should be fairly simple to evaluate FG. All you need to do is build the code I sent in place. There is no need to integrate it into your project code yet. Just copy it to both sides and build. The reason I insist is that we tested this on the t507. The default code did not work there (we couldn't even get it working as far as you have). So I am not saying it will 100% work, but it is worth a try.
I hope you have gone through the porting guide for any simple but commonly overlooked cases.
@mantriyogesh looks like you posted your comment at the same time I edited mine. I added an update about using the ESP32C3
Oh, good that you told me; I missed it.
But the Bluetooth packet still failed to be transmitted. I can't say for certain why it was missed unless both rx and tx logs are enabled on the ESP and the host.
Can you please comment out update_spi_clock() in the host's spi/esp_spi.c? Keep it commented out for all chipsets to narrow the scope further, until you have it working with the SPI frequency you provide.
I think the raw throughput test in either direction would be a better test for any chipset.
Sorry, but I am still interested in https://github.com/espressif/esp-hosted/issues/424#issuecomment-2227521630, if you could give it a chance..
Did you proceed, @zachmorr?
I am working on getting FG version running. In the meantime, here are the results from the throughput test I ran on the ESP32C3. It doesn't look great.
esp2host_esp.txt esp2host_host.txt host2esp_esp.txt host2esp_host.txt
EDIT: Are there instructions somewhere for how to build the FG version? I'm running into a lot of errors.
MISO is fine, MOSI is not; there is clearly some communication issue as per your logs.
So interrupt gpios might not be correctly working. See the porting guide, you need 3 additional gpios. Handshake, data ready and resetpin. Handshake and dataready are incoming gpios for host, resetpin is outgoing gpio for host.
With respect to FG:
Extract the tarball esp_hosted_umltech_2024.07.09_ota_fix.tgz into a directory, say new_fg.
Build FG firmware (3.1 is important). Documentation for the source-based build on FG, for ESP32-C3 building: https://github.com/espressif/esp-hosted/blob/master/esp_hosted_fg/docs/Linux_based_host/SPI_setup.md#222-source-compilation
3.1. There is no IDF in the attached code (it was removed to reduce the size). Go to your existing code at esp_hosted_fg/esp/esp_driver and follow the first step, Set-up ESP-IDF, from there.
3.2. Once you are done with '. ./export.sh', navigate to new_fg/code/esp_hosted_fg/esp/esp_driver/ and follow the next step, i.e. Configure, Build & Flash SPI ESP firmware, from new_fg.
Please make sure that your handshake, dataready and resetpin GPIOs are working correctly, else even this code is expected to fail.
MISO is fine, MOSI is not; there is clearly some communication issue as per your logs.
How did you conclude that? If I'm reading the logs right, it looks like there is data loss in both directions.
Please make sure that your handshake, dataready and resetpin GPIOs are working correctly, else even this code is expected to fail.
They are connected correctly. The throughput test wouldn't have been able to start if they weren't. It wasn't in the logs I sent, but the host was able to process the bootup events and print the capabilities.
3. Build FG firmware
I was able to build and flash the fg firmware for the host and esp and there were no errors when I loaded them, but I am getting errors when I try to build the C demo app. I also tried building with RAW TP mode enabled but that didn't seem to do anything:
# modprobe esp32_spi_fg resetpin=111 raw_tp_mode=1
[ 14.892318] esp32_spi: loading out-of-tree module taints kernel.
[ 14.900471] esp32_spi: unknown parameter 'raw_tp_mode' ignored
[ 14.910837] ESP: SPI host config: GPIOs: Handshake[113] DataReady[112]
[ 14.912569] ESP host driver claiming SPI bus [0],chip select [0] with init SPI Clock [10]
[ 14.928015] esp spi thread created
# [ 15.619418] INIT event rcvd from ESP
[ 15.623137] EVENT: 2
[ 15.625510] EVENT: 1
[ 15.627792] ESP Reconfigure SPI CLK to 30 MHz
[ 15.632172] EVENT: 0
[ 15.634448] EVENT: 3
[ 15.636666] ESP peripheral RAW TP capabilities: 0x0
[ 15.641551] esp32: stop raw throuput test if running
[ 15.651899] ESP peripheral capabilities: 0xe8
I (412) main_task: Calling app_main()
I (412) NETWORK_ADAPTER: *********************************************************************
I (422) NETWORK_ADAPTER: ESP-Hosted-FG Firmware version :: 0.0.5
I (432) NETWORK_ADAPTER: Transport used :: SPI only
I (442) NETWORK_ADAPTER: *********************************************************************
I (452) NETWORK_ADAPTER: Supported features are:
I (462) NETWORK_ADAPTER: - WLAN over SPI
I (462) ESP_BT: - BT/BLE
I (462) ESP_BT: - HCI Over SPI
I (472) ESP_BT: - BLE only
I (472) NETWORK_ADAPTER: capabilities: 0xe8
I (482) BLE_INIT: BT controller compile version [85b425c]
I (482) phy_init: phy_version 970,1856f88,May 10 2023,17:44:12
I (532) BLE_INIT: Bluetooth MAC: 10:91:a8:20:b5:5e
I (532) NETWORK_ADAPTER: ESP Bluetooth MAC addr: 10:91:a8:20:b5:5e
I (532) SPI_DRIVER: Using SPI interface
I (542) gpio: GPIO[3]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (552) gpio: GPIO[4]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (562) SPI_DRIVER: SPI Ctrl:1 mode: 2, InitFreq: 10MHz, ReqFreq: 30MHz
GPIOs: MOSI: 7, MISO: 2, CS: 10, CLK: 6 HS: 3 DR: 4
I (572) SPI_DRIVER: Hosted SPI queue size: Tx:20 Rx:20
I (582) gpio: GPIO[10]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (582) gpio: GPIO[10]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (602) pp: pp rom version: 9387209
I (602) net80211: net80211 rom version: 9387209
I (612) wifi:wifi driver task: 3fcc20a4, prio:23, stack:6656, core=0
I (612) wifi:wifi firmware version: b2f1f86
I (612) wifi:wifi certification version: v7.0
I (622) wifi:config NVS flash: disabled
I (622) wifi:config nano formating: disabled
I (622) wifi:Init data frame dynamic rx buffer num: 32
I (632) wifi:Init management frame dynamic rx buffer num: 32
I (632) wifi:Init management short buffer num: 32
I (642) wifi:Init dynamic tx buffer num: 32
I (642) wifi:Init static tx FG buffer num: 2
I (652) wifi:Init static rx buffer size: 1600
I (652) wifi:Init static rx buffer num: 10
I (662) wifi:Init dynamic rx buffer num: 32
I (662) wifi_init: rx ba win: 6
I (662) wifi_init: tcpip mbox: 32
I (672) wifi_init: udp mbox: 6
I (672) wifi_init: tcp mbox: 6
I (672) wifi_init: tcp tx win: 5744
I (682) wifi_init: tcp rx win: 5744
I (682) wifi_init: tcp mss: 1440
I (692) wifi_init: WiFi IRAM OP enabled
I (692) wifi_init: WiFi RX IRAM OP enabled
I (702) wifi:mode : null
I (702) NETWORK_ADAPTER: Initial set up done
I (702) slave_ctrl: event ESPInit
I (712) main_task: Returned from app_main()
This is using the unmodified version btw. I wanted to make sure I was building everything correctly before trying your modified version.
I'm a little confused. Can you please clarify whether the FG you picked is from esp_hosted_umltech_2024.07.09_ota_fix.tgz?
Raw throughput testing is disabled by default in the firmware to reduce code and data size (after all, it is a debugging method, not expected later in production). With the above code, to run the raw throughput test please refer to: FG-Raw-TP-Test
You might need to enable/load the raw throughput code (based on a debug symbol while building) on both sides. Check the doc, FG-Raw-TP-Test.
Please check that you are on the latest master on both sides to use this.
rpi_init.sh is a helper script that you can use for easy handling: https://github.com/espressif/esp-hosted/blob/20ec7738f0f19b48758f5c0c2539d5091cee1ce7/esp_hosted_fg/host/linux/host_control/rpi_init.sh#L332
You can easily enable/disable the options that you wish to use or discard (preferred way).
As an alternative, if you wish to use the Makefile directly, please enable it at the host: https://github.com/espressif/esp-hosted/blob/20ec7738f0f19b48758f5c0c2539d5091cee1ce7/esp_hosted_fg/host/linux/host_driver/esp32/Makefile#L2
Change the flags as per the documentation, one by one in either direction, and flash the ESP.
They are connected correctly. The throughput test wouldn't have been able to start if they weren't. It wasn't in the logs I sent, but the host was able to process the bootup events and print the capabilities
It is only one transaction. The host by default makes the first transaction, whereas the ESP is already prepared for a transaction. Getting only the first transaction right cannot confirm the handshake, data-ready and reset pin GPIOs. It confirms that three pins, CLK, CS and MISO, behaved fine (at least up to the first transaction).
I am not sure how you came to the conclusion that the rest of the pins are fine. Did you test these pins in any way?
To get complete ESP-Hosted working, you would need to set up these things and verify they are working fine. These details are already available in documentation (how to verify etc) at https://github.com/espressif/esp-hosted/blob/master/esp_hosted_fg/docs/Linux_based_host/porting_guide.md#242-spi
I'm a little confused. Can you please clarify whether the FG you picked is from esp_hosted_umltech_2024.07.09_ota_fix.tgz?
For the previous tests I was using the mainline esp_hosted to make sure I was compiling and loading everything correctly. I tried compiling esp_hosted_umltech today but ran into an issue that I don't know how to solve. Here is the makefile I used and the errors if you're interested.
esp_hosted_umltech_compile_failure.txt Makefile.txt
I can't use the rpi_init.sh script unfortunately, this device is a minimal busybox setup without make or gcc. I have to cross compile it from my PC.
I am not sure how you came to the conclusion that the rest of the pins are fine. Did you test these pins in any way?
I know RST is working because I can tell from the idf.py monitor that the ESP is being reset. If I disconnect DataReady or Handshake the host doesn't receive any events from the ESP which is why I believe they are working. Here are the logs from my test
To be extra sure, I also took a capture with the Saleae when I ran the mainline esp_hosted_fg and didn't see any obvious issues
but ran into an issue that I don't know how to solve
Please try:
To be extra sure, I also took a capture with the Saleae when I ran the mainline esp_hosted_fg and didn't see any obvious issues
Capture observations:
I think I misinterpreted one of your earlier comments, from https://github.com/espressif/esp-hosted/issues/424#issuecomment-2244266007:
but I am getting errors when I try to build the C demo app
Check whether the current c_support/Makefile, with the cross-compilation changes, works for your machine. Please note the 'static' in the Makefile, which is there for glibc issues. Try removing it and building; if that doesn't work, reintroduce it.
I was able to compile with those changes but no luck running it. I tried with SPI mode 2 and 3 but it didn't make a difference
# modprobe esp32_spi_fg resetpin=111
[ 26.013017] esp32_spi: loading out-of-tree module taints kernel.
[ 26.022386] esp32_spi: esp_reset: --- Triggering ESP reset. ----
[ 26.031417] esp32_spi: spi_init: ESP: SPI host config: GPIOs: Handshake[113] DataReady[112]
[ 26.041633] esp32_spi: spi_dev_init: ESP host driver claiming SPI bus [0],chip select [0] ]
[ 26.052598] esp32_spi: spi_dev_init: gpio_to irq: 37 36
[ 26.058062] esp32_spi: esp_spi_thread: esp spi thread created
#
I (411) main_task: Calling app_main()
I (411) NETWORK_ADAPTER: *********************************************************************
I (421) NETWORK_ADAPTER: ESP-Hosted-FG Firmware version :: 0.0.5
I (431) NETWORK_ADAPTER: Transport used :: SPI only
I (441) NETWORK_ADAPTER: *********************************************************************
I (451) NETWORK_ADAPTER: Supported features are:
I (461) NETWORK_ADAPTER: - WLAN over SPI
I (461) ESP_BT: - BT/BLE
I (461) ESP_BT: - HCI Over SPI
I (471) ESP_BT: - BLE only
I (471) NETWORK_ADAPTER: capabilities: 0xe8
I (481) BLE_INIT: BT controller compile version [963cad4]
I (481) phy_init: phy_version 970,1856f88,May 10 2023,17:44:12
I (541) BLE_INIT: Bluetooth MAC: 10:91:a8:20:b5:5e
I (541) NETWORK_ADAPTER: ESP Bluetooth MAC addr: 10:91:a8:20:b5:5e
I (541) SPI_DRIVER: Using SPI interface
E (541) gpio: GPIO_PIN mask error
I (551) gpio: GPIO[4]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (561) SPI_DRIVER: SPI Ctrl:1 mode: 3, InitFreq: 10MHz, ReqFreq: 30MHz
GPIOs: MOSI: 7, MISO: 2, CS: 10, CLK: 6 HS: 26 DR: 4
I (571) SPI_DRIVER: Hosted SPI queue size: Tx:20 Rx:20
I (571) gpio: GPIO[10]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (581) gpio: GPIO[10]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
pp rom version: 9387209
net80211 rom version: 9387209
I (611) wifi:wifi driver task: 3fcc230c, prio:23, stack:6656, core=0
I (611) wifi:wifi firmware version: fddc5e5
I (611) wifi:wifi certification version: v7.0
I (611) wifi:config NVS flash: disabled
I (621) wifi:config nano formating: disabled
I (621) wifi:Init data frame dynamic rx buffer num: 32
I (621) wifi:Init management frame dynamic rx buffer num: 32
I (631) wifi:Init management short buffer num: 32
I (631) wifi:Init dynamic tx buffer num: 32
I (641) wifi:Init static tx FG buffer num: 2
I (641) wifi:Init static rx buffer size: 1600
I (651) wifi:Init static rx buffer num: 10
I (651) wifi:Init dynamic rx buffer num: 32
I (651) wifi_init: rx ba win: 6
I (661) wifi_init: tcpip mbox: 32
I (661) wifi_init: udp mbox: 6
I (671) wifi_init: tcp mbox: 6
I (671) wifi_init: tcp tx win: 5744
I (671) wifi_init: tcp rx win: 5744
I (681) wifi_init: tcp mss: 1440
I (681) wifi_init: WiFi IRAM OP enabled
I (691) wifi_init: WiFi RX IRAM OP enabled
I (691) wifi:mode : null
I (691) NETWORK_ADAPTER: Initial set up done
I (701) slave_ctrl: event ESPInit
I (30701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
I (60701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
I (90701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
I (120701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
I (150701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
I (180701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
Capture shows all sorts of problems umtech_fg.txt
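To check whether the handshake and data-ready statistics in the ESP log above ever change, the periodic NETWORK_ADAPTER lines can be parsed and compared. This is a rough sketch that assumes the bracketed values are running counters; that interpretation and the function names are assumptions, not something stated by the firmware documentation.

```python
import re

# Matches lines such as:
# I (30701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
STAT_RE = re.compile(
    r"I \((\d+)\) NETWORK_ADAPTER: pos_rx_drp\[(\d+)\] "
    r"HS: off\[(\d+)\] on\[(\d+)\] DR off\[(\d+)\] on\[(\d+)\]"
)


def parse_stats(log_text):
    """Extract one dict per statistics line found in the ESP log text."""
    keys = ("ms", "rx_drop", "hs_off", "hs_on", "dr_off", "dr_on")
    return [dict(zip(keys, map(int, m.groups())))
            for m in STAT_RE.finditer(log_text)]


def counters_stalled(rows):
    """True if there are at least two samples and no counter changed between them."""
    if len(rows) < 2:
        return False
    strip = lambda r: {k: v for k, v in r.items() if k != "ms"}
    first = strip(rows[0])
    return all(strip(r) == first for r in rows[1:])


sample = """\
I (30701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
I (60701) NETWORK_ADAPTER: pos_rx_drp[0] HS: off[1] on[1] DR off[1] on[2]
"""
rows = parse_stats(sample)
print(len(rows), counters_stalled(rows))
```

Identical values across 30-second intervals, as in the log above, would be consistent with no further transactions taking place after the first one.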
ESP side GPIOs in use: GPIOs: MOSI: 7, MISO: 2, CS: 10, CLK: 6 HS: 26 DR: 4
Are these connected to your expected GPIOs? You might have to change as per your C3.
The same is the case for the host-side GPIOs. You might need to correct the GPIOs to be used, as well as spi_bus, spi_cs, mode and clock freq.
Whoops, looks like HS was wrong. Here are the logs and capture after updating, it is still failing. updated_fg_capture.txt updated_fg_esp.txt updated_fg_host.txt
EDIT: Looks like I had the esp in mode 2 for the capture, but switching it back to mode 3 to match the host did not make a difference
I now observe that HS and DR are not triggered correctly from the ESP. Can you please help to check whether the interrupts are reaching the kernel module?
I changed to mode 3 in the Saleae capture. I am not sure why, but it appears that 1 bit is shifted in MOSI.
Are the SPI modes used in esp and host the same?
What is surprising is that in the consecutive transaction, I can see the tx_pkt_number in MOSI is correct, 0x33. But the RX packet number is 0x02, which is again not correct. It should have been 0x01.
Basically, every alternate MISO is skipped, typically seen on t507.
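A quick way to spot this pattern in a capture is to list the packet numbers seen on the bus and look for gaps. The sketch below is illustrative only: the function name is hypothetical, and the one-byte wrap-around (modulo 256) is an assumption rather than the actual width of the counter in the ESP-Hosted header.

```python
def find_skipped(seq_numbers, modulo=256):
    """Return the sequence numbers missing between consecutive observed values.

    `modulo` is the assumed wrap-around point of the counter.
    """
    skipped = []
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        gap = (cur - prev) % modulo
        # A gap of 1 is the normal case; anything larger means packets were missed.
        for k in range(1, gap):
            skipped.append((prev + k) % modulo)
    return skipped


# Hypothetical observation where every alternate packet is missing.
print(find_skipped([0x00, 0x02, 0x04]))
```

With every alternate packet skipped, the list of missing numbers is as long as the list of gaps between observed packets, which makes the pattern easy to recognise.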
But the code change in esp_hosted_umltech_2024.07.09_ota_fix.tgz should ideally have fixed it.
The original fix in that code was to wait for the CS to de-assert and only then schedule the next transaction on MISO (from the ESP).
Can you please confirm this change is present in your code?
This is still a puzzle. Just as a trial, without much hope: can you try the following change, flash the ESP and retry:
Let us know if change in (6) works, if not, please revert this change.
It's been a bit, but I was able to figure out the problem with the C3 board. I'll post it here in case anyone has the same issue. For whatever reason gpio interrupts weren't being triggered. I added what @mantriyogesh recommended here: https://github.com/espressif/esp-hosted/issues/411#issuecomment-2198265906 and that fixed the issue. I was never able to figure out why I couldn't lower the clock speed on the regular ESP32 but that wasn't what I was planning on using anyway so I probably won't put any more effort into it.
Loading the kernel module with a clock speed less than 7MHz causes it to fail to initialize. Below is the output from loading the module at 10MHz and 5MHz. The logic analyzer captures show the ESP32 responding when the host starts the 5MHz clock, but the software fails to pick up the response. Saleae capture files are available if needed.
Hardware: Custom Allwinner F1C200S Board w/ ESP32 DevkitC V4
ESP NG Firmware version: 1.0.3
Logs: 5mhz_host.txt 5mhz_slave.txt 10mhz_host.txt 10mhz_slave.txt
5 MHz
10 MHz
Hardware Setup