zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Socket echo server sample code not working in Litex Vexriscv cpu (Xilinx AC701 board) #42685

Closed Aravindh-Swaminathan closed 2 years ago

Aravindh-Swaminathan commented 2 years ago

Describe the bug: After building the firmware and flashing it onto the hardware, Zephyr RTOS boots as expected. However, when running telnet 4242 from the host terminal, it is unable to connect to the remote host.

To Reproduce Steps to reproduce the behavior:

1) Build litex: python3 ./xilinx_ac701.py --build --csr-csv test/csr.csv --csr-json test/csr.json

2) Generate dts overlay: litex/litex/tools/litex_json2dts_zephyr.py --dts overlay.dts --config overlay.config csr.json

3) Change the device IP address and the peer IP address in the sample code so that both are on the same subnet as the Linux host, in zephyr/samples/net/sockets/echo/prj.conf

4) Build sample code: west build -b litex_vexriscv samples/sockets/echo --pristine -DDTC_OVERLAY_FILE=/Users/user/project/litex/litex-boards/litex_boards/targets/test/overlay.dts -DCONFIG_UART_LITEUART=y -DCONFIG_LITEX_TIMER=y -DCONFIG_ETH_LITEETH=n -DCONFIG_SPI_LITESPI=n -DCONFIG_I2C_LITEX=n

5) Load kernel onto platform: sudo lxterm /dev/ttyUSB2 --kernel zephyr_eth_net_v12.bin --kernel-adr 0x40000000

6) Load bitstream: sudo openFPGALoader -c digilent xilinx_ac701.bit

7) Once hardware boots up, in the host terminal try telnet 4242
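For step 3, a quick way to confirm that two addresses really share a subnet is Python's `ipaddress` module. This is only a sketch: 192.168.1.99 and 192.168.1.131 are the device and peer addresses that appear in the logs in this issue, and the /24 prefix length is an assumption, since the netmask is not stated anywhere in the report.

```python
import ipaddress

def same_subnet(addr_a: str, addr_b: str, prefix_len: int) -> bool:
    """Return True if both IPv4 addresses fall in the same network."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b

# Addresses taken from this issue; the /24 prefix length is an assumption.
print(same_subnet("192.168.1.99", "192.168.1.131", 24))  # True
```

If this prints False for the real netmask, the addresses in the sample configuration need to be changed before looking any further.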

Expected behavior: The telnet connection to the echo server on port 4242 should succeed. Instead, the host terminal is unable to connect to the remote host.

cobit-01@clr-7793e0c675444c1b82a24479ba6802f6~/lalit/zephyr_rtos/zephyr_network_stack $ telnet 192.168.1.99 4242
Trying 192.168.1.99...
telnet: Unable to connect to remote host: No route to host
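The same check can be scripted from the host. The sketch below uses only the Python standard library; the `probe` helper and its return strings are made up for illustration. On Linux, "No route to host" corresponds to the `EHOSTUNREACH` error, and for an address on the local subnet it usually means that the peer never answered at the link level, rather than that the port is closed.

```python
import errno
import socket

def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and report the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        return "failed: timeout"
    except OSError as exc:
        # EHOSTUNREACH is what telnet reports as "No route to host".
        name = errno.errorcode.get(exc.errno, "unknown")
        return f"failed: {name}"
```

For the board in this report, the call would be `probe("192.168.1.99", 4242)`.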

Impact Showstopper

Logs and console output

        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2022 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Feb  8 2022 11:19:42
 BIOS CRC passed (9bcc684c)

 Migen git sha1: 9a0be7a
 LiteX git sha1: 9482cbc8

--=============== SoC ==================--
CPU:            VexRiscv @ 100MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            8-bit data
ROM:            128KiB
SRAM:           8KiB
L2:             8KiB
SDRAM:          1048576KiB 16-bit @ 800MT/s (CL-7 CWL-5)

--========== Initialization ============--
Ethernet init...
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write latency calibration:
m0:0 m1:0
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00000000000000000000000000000000| delays: -
  m0, b02: |00000000000000000000000000000000| delays: -
  m0, b03: |00111111111111100000000000000000| delays: 08+-06
  m0, b04: |00000000000000000111111111111100| delays: 23+-06
  m0, b05: |00000000000000000000000000000000| delays: -
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b04 delays: 23+-06
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00000000000000000000000000000000| delays: -
  m1, b02: |00000000000000000000000000000000| delays: -
  m1, b03: |11111111111111100000000000000000| delays: 08+-07
  m1, b04: |00000000000000000111111111111110| delays: 24+-07
  m1, b05: |00000000000000000000000000000000| delays: -
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b03 delays: 08+-07
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB
   Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
  Write speed: 27.2MiB/s
   Read speed: 27.3MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
[LXTERM] Received firmware download request from the device.
[LXTERM] Uploading zephyr_eth_net_v12.bin to 0x40000000 (145100 bytes)...
[LXTERM] Upload calibration... (inter-frame: 10.00us, length: 64)
[LXTERM] Upload complete (9.9KB/s).
[LXTERM] Booting the device.
[LXTERM] Done.
Executing booted program at 0x40000000

--============= Liftoff! ===============--
*** Booting Zephyr OS build v2.7.99-2698-gf92542ecf3e4  ***
Single-threaded TCP echo server waits for a connection on port 4242...

Aravindh-Swaminathan commented 2 years ago

Update:

Made the following changes to the prj.conf file:


CONFIG_DEBUG=y

# console
CONFIG_RTT_CONSOLE=n
CONFIG_UART_CONSOLE=y

# shell
CONFIG_SHELL=y
CONFIG_SHELL_BACKENDS=y
CONFIG_SHELL_BACKEND_SERIAL=y
CONFIG_SHELL_PROMPT_UART="uart:~$ "

# logging
CONFIG_LOG=y
CONFIG_LOG_BACKEND_RTT=n
CONFIG_LOG_BACKEND_UART=y
#
CONFIG_CONSOLE=y
CONFIG_UART_CONSOLE=y
CONFIG_PINMUX=y
CONFIG_NETWORKING=y
CONFIG_NET_LOG=y
CONFIG_NET_IF_MCAST_IPV6_ADDR_COUNT=4
CONFIG_NET_DHCPV4=y
CONFIG_NET_SHELL=y

After recompiling the firmware, the hardware logs are as follows.

 BIOS built on Feb  8 2022 11:19:42
 BIOS CRC passed (9bcc684c)

 Migen git sha1: 9a0be7a
 LiteX git sha1: 9482cbc8


--=============== SoC ==================--
CPU:            VexRiscv @ 100MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            8-bit data
ROM:            128KiB
SRAM:           8KiB
L2:             8KiB
SDRAM:          1048576KiB 16-bit @ 800MT/s (CL-7 CWL-5)

--========== Initialization ============--
Ethernet init...
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write latency calibration:
m0:0 m1:0
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00000000000000000000000000000000| delays: -
  m0, b02: |00000000000000000000000000000000| delays: -
  m0, b03: |00111111111111100000000000000000| delays: 08+-06
  m0, b04: |00000000000000000111111111111100| delays: 23+-06
  m0, b05: |00000000000000000000000000000000| delays: -
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b04 delays: 23+-06
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00000000000000000000000000000000| delays: -
  m1, b02: |00000000000000000000000000000000| delays: -
  m1, b03: |11111111111111100000000000000000| delays: 08+-07
  m1, b04: |00000000000000000111111111111110| delays: 24+-07
  m1, b05: |00000000000000000000000000000000| delays: -
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b03 delays: 08+-07
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB
   Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
  Write speed: 27.2MiB/s
   Read speed: 27.3MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
[LXTERM] Received firmware download request from the device.
[LXTERM] Uploading zephyr_eth_net_v12.bin to 0x40000000 (281420 bytes)...
[LXTERM] Upload calibration... (inter-frame: 10.00us, length: 64)
[LXTERM] Upload complete (9.9KB/s).
[LXTERM] Booting the device.
[LXTERM] Done.
Executing booted program at 0x40000000

--============= Liftoff! ===============--
*** Booting Zephyr OS build v2.7.99-2698-gf92542ecf3e4  ***
Single-threaded TCP echo server waits for a connection on port 4242...
[00:00:01.380,000] <inf> CLK_CTRL_LITEX: CLKOUT0: set rate: 100000000 HZ
[00:00:01.700,000] <inf> CLK_CTRL_LITEX: CLKOUT1: updated rate: 100000000 to 100000000 HZ
[00:00:01.960,000] <inf> CLK_CTRL_LITEX: CLKOUT0: set duty: 50%
[00:00:02.460,000] <inf> CLK_CTRL_LITEX: CLKOUT0: set phase: 12885 deg
[00:00:03.580,000] <inf> CLK_CTRL_LITEX: CLKOUT1: set rate: 100000000 HZ
[00:00:03.840,000] <inf> CLK_CTRL_LITEX: CLKOUT1: set duty: 50%
[00:00:04.340,000] <inf> CLK_CTRL_LITEX: CLKOUT1: set phase: 12885 deg
[00:00:04.340,000] <inf> CLK_CTRL_LITEX: LiteX Clock Control driver initialized
[00:00:04.350,000] <inf> net_config: Initializing network
[00:00:04.350,000] <inf> net_config: IPv4 address: 192.168.1.99
[00:00:04.350,000] <inf> net_config: Running dhcpv4 client...

uart:~$ [00:00:10.400,000] <err> eth_liteeth: TX fifo failed
[00:00:10.400,000] <err> eth_liteeth: TX fifo failed
uart:~$ [00:00:15.450,000] <err> eth_liteeth: TX fifo failed
[00:00:15.450,000] <err> eth_liteeth: TX fifo failed
uart:~$ [00:00:24.500,000] <err> eth_liteeth: TX fifo failed
[00:00:24.500,000] <err> eth_liteeth: TX fifo failed
uart:~$ [00:00:41.550,000] <err> eth_liteeth: TX fifo failed
[00:00:41.550,000] <err> eth_liteeth: TX fifo failed
uart:~$
uart:~$
uart:~$ [00:01:14.600,000] <err> eth_liteeth: TX fifo failed
[00:01:14.600,000] <err> eth_liteeth: TX fifo failed
uart:~$
uart:~$ net ping 192.168.1.131
PING 192.168.1.131
[00:01:39.650,000] <err> eth_liteeth: TX fifo failed
[00:01:42.700,000] <err> eth_liteeth: TX fifo failed
[00:01:45.750,000] <err> eth_liteeth: TX fifo failed
Ping timeout
[00:01:39.650,000] <err> eth_liteeth: TX fifo failed
[00:01:42.700,000] <err> eth_liteeth: TX fifo failed
[00:01:45.750,000] <err> eth_liteeth: TX fifo failed
uart:~$
uart:~$
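The spacing of the errors before the ping is worth a look. The sketch below pulls the timestamps out of the log lines above and prints the gaps between them. The gaps roughly double each time, which is what one would expect from a client retrying a request with exponential backoff, as DHCPv4 clients generally do. The helper names are made up for illustration.

```python
import re

LOG = """\
[00:00:10.400,000] <err> eth_liteeth: TX fifo failed
[00:00:15.450,000] <err> eth_liteeth: TX fifo failed
[00:00:24.500,000] <err> eth_liteeth: TX fifo failed
[00:00:41.550,000] <err> eth_liteeth: TX fifo failed
[00:01:14.600,000] <err> eth_liteeth: TX fifo failed
"""

# Matches Zephyr log timestamps of the form [hh:mm:ss.mmm,uuu].
STAMP = re.compile(r"\[(\d+):(\d+):(\d+)\.(\d{3}),(\d{3})\]")

def timestamps_ms(text):
    """Return unique log timestamps in milliseconds, in order of appearance."""
    seen = []
    for h, m, s, ms, _us in STAMP.findall(text):
        t = ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)
        if t not in seen:
            seen.append(t)
    return seen

def intervals_ms(text):
    """Return the gaps between consecutive unique timestamps."""
    stamps = timestamps_ms(text)
    return [b - a for a, b in zip(stamps, stamps[1:])]

print(intervals_ms(LOG))  # [5050, 9050, 17050, 33050]
```

So the stack keeps retrying on schedule, and every attempt fails in the driver.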

The DTS overlay file used for building the zephyr.bin is as follows.

&uart0 {
        reg = <0xf0004000 0x20>;
        interrupts = <0x0 0>;
};
&timer0 {
        reg = <0xf0003800 0x40>;
        interrupts = <0x01 0>;
};
&eth0 {
        reg = <0xf0001000 0x80 0xf80000000 0x2000>;
        interrupts = <0x01 0>;
};
&spi0 {
        status = "disabled";
};
&i2c0 {
        status = "disabled";
};
&ram0 {
        reg = <0x40000000 0x40000000>;
};
&dna0 {
        reg = <0xf0002000 0x100>;
};
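
One detail in the overlay is worth double-checking: values inside `< >` are 32-bit cells by default, and the second address in the `reg` property of `&eth0` has nine hex digits. Whether that is a typo in this post or is present in the generated overlay cannot be told from the thread. The sketch below is a rough check for such values; it only understands plain numeric cells (no phandles, macros or `/bits/`), and the function name is made up for illustration.

```python
import re

OVERLAY = """\
&eth0 {
        reg = <0xf0001000 0x80 0xf80000000 0x2000>;
        interrupts = <0x01 0>;
};
&ram0 {
        reg = <0x40000000 0x40000000>;
};
"""

# Captures "name = <cells...>" properties.
CELL_LIST = re.compile(r"(\w+)\s*=\s*<([^>]*)>")

def oversized_cells(dts_text):
    """Return (property, token) pairs whose value does not fit in 32 bits."""
    hits = []
    for prop, cells in CELL_LIST.findall(dts_text):
        for token in cells.split():
            value = int(token, 0)  # base 0 accepts both 0x... and decimal
            if value > 0xFFFFFFFF:
                hits.append((prop, token))
    return hits

print(oversized_cells(OVERLAY))  # [('reg', '0xf80000000')]
```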

rlubos commented 2 years ago

Looking at the second log, the problem seems to lie at the device driver level. The system tries to send DHCPv4 requests, but the driver is not able to send the packets (from what I saw, the only case where the TX path can fail there is when the LITEETH_TX_READY register is not set by the device).
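To illustrate that failure mode, here is a minimal Python model of a transmit routine that gives up when a ready flag never becomes set. It is not the actual eth_liteeth.c code: the function name, the `read_ready` callback and the retry limit are all hypothetical.

```python
def send_frame(read_ready, max_polls=100):
    """Model of a TX path that fails only when the ready flag stays clear.

    read_ready: callable returning the current value of the ready flag
                (stands in for reading LITEETH_TX_READY).
    """
    for _ in range(max_polls):
        if read_ready() != 0:
            return "sent"
    return "TX fifo failed"

# A device that reports ready, and one whose flag is never set.
print(send_frame(lambda: 1))  # sent
print(send_frame(lambda: 0))  # TX fifo failed
```

In a model like this, a driver that reads the flag from the wrong address behaves exactly like the second case, even if the hardware itself is fine.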

I'm not even sure who could investigate this driver, @jukkar @tbursztyka any ideas?

jukkar commented 2 years ago

-DCONFIG_ETH_LITEETH=n

How is this working if LITEETH is not set, but

uart:~$ [00:00:10.400,000] <err> eth_liteeth: TX fifo failed
[00:00:10.400,000] <err> eth_liteeth: TX fifo failed

we are still using it? Anyway, this really looks like a driver issue; perhaps it could be assigned to the original submitter (Antmicro).

Aravindh-Swaminathan commented 2 years ago

@jukkar Apologies, that was my mistake when raising the issue. -DCONFIG_ETH_LITEETH was actually set to 'y'. I had copied and pasted the command from a different place and forgot to edit it.

tgorochowik commented 2 years ago

We have been able to confirm the issue with the TX path in liteeth in the current Zephyr tree. We'll investigate this.

tgorochowik commented 2 years ago

Ok, so here is the situation. In general, LiteX is highly reconfigurable and actively developed. Therefore, to make sure we support everything properly, we have a dedicated reference platform that can be found here: https://github.com/litex-hub/zephyr-on-litex-vexriscv

If you use the reference platform, the Ethernet driver works flawlessly (with current Zephyr).

That being said, we've identified the issue. Some of the registers changed in liteeth: https://github.com/enjoy-digital/liteeth/commit/89dbf17cb62f34816be2b812972fe64bc790fa95

To accommodate this change in Zephyr, you can use the following patch:

From 9afb52485be74bf9c5fa8caf747fb85de1bc7303 Mon Sep 17 00:00:00 2001
From: Mateusz Sierszulski <msierszulski@antmicro.com>
Date: Wed, 23 Mar 2022 12:38:33 +0100
Subject: [PATCH] drivers: eth: fix liteeth registers addresses

---
 drivers/ethernet/eth_liteeth.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/ethernet/eth_liteeth.c b/drivers/ethernet/eth_liteeth.c
index 9d6d092446..c0b9729aad 100644
--- a/drivers/ethernet/eth_liteeth.c
+++ b/drivers/ethernet/eth_liteeth.c
@@ -39,16 +39,15 @@ LOG_MODULE_REGISTER(LOG_MODULE_NAME);
 #define LITEETH_RX_BASE        DT_INST_REG_ADDR_BY_NAME(0, control)
 #define LITEETH_RX_SLOT        ((LITEETH_RX_BASE) + 0x00)
 #define LITEETH_RX_LENGTH  ((LITEETH_RX_BASE) + 0x04)
-#define LITEETH_RX_EV_PENDING  ((LITEETH_RX_BASE) + 0x28)
-#define LITEETH_RX_EV_ENABLE   ((LITEETH_RX_BASE) + 0x2c)
-
+#define LITEETH_RX_EV_PENDING  ((LITEETH_RX_BASE) + 0x20)
+#define LITEETH_RX_EV_ENABLE   ((LITEETH_RX_BASE) + 0x24)
 /* sram - tx */
-#define LITEETH_TX_BASE        ((DT_INST_REG_ADDR_BY_NAME(0, control)) + 0x30)
-#define LITEETH_TX_START   ((LITEETH_TX_BASE) + 0x00)
-#define LITEETH_TX_READY   ((LITEETH_TX_BASE) + 0x04)
-#define LITEETH_TX_SLOT        ((LITEETH_TX_BASE) + 0x0c)
-#define LITEETH_TX_LENGTH  ((LITEETH_TX_BASE) + 0x10)
-#define LITEETH_TX_EV_PENDING  ((LITEETH_TX_BASE) + 0x1c)
+#define LITEETH_TX_BASE        DT_INST_REG_ADDR_BY_NAME(0, control)
+#define LITEETH_TX_START   ((LITEETH_TX_BASE) + 0x28)
+#define LITEETH_TX_READY   ((LITEETH_TX_BASE) + 0x2c)
+#define LITEETH_TX_SLOT        ((LITEETH_TX_BASE) + 0x34)
+#define LITEETH_TX_LENGTH  ((LITEETH_TX_BASE) + 0x38)
+#define LITEETH_TX_EV_PENDING  ((LITEETH_TX_BASE) + 0x44)

 /* irq */
 #define LITEETH_IRQ        DT_INST_IRQN(0)
@@ -132,7 +131,7 @@ static void eth_rx(const struct device *port)
    key = irq_lock();

    /* get frame's length */
-   for (int i = 0; i < 4; i++) {
+   for (int i = 0; i < 2; i++) {
        len <<= 8;
        len |= sys_read8(LITEETH_RX_LENGTH + i * 0x4);
    }
-- 
2.35.0
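
To make the address changes in the patch easier to see, the sketch below computes each register's offset from the control base before and after the patch. The offsets are copied from the `#define` lines above. Using 0xf0001000 as the control base assumes that the first `reg` entry of `&eth0` in the overlay posted earlier is the control region.

```python
CONTROL_BASE = 0xF0001000  # assumed: first reg entry of &eth0 in the overlay

# Offsets from the control base before the patch
# (TX registers were relative to a TX base at control + 0x30).
OLD = {
    "RX_EV_PENDING": 0x28,
    "RX_EV_ENABLE": 0x2C,
    "TX_START": 0x30 + 0x00,
    "TX_READY": 0x30 + 0x04,
    "TX_SLOT": 0x30 + 0x0C,
    "TX_LENGTH": 0x30 + 0x10,
    "TX_EV_PENDING": 0x30 + 0x1C,
}

# Offsets after the patch (the TX base is now the control base itself).
NEW = {
    "RX_EV_PENDING": 0x20,
    "RX_EV_ENABLE": 0x24,
    "TX_START": 0x28,
    "TX_READY": 0x2C,
    "TX_SLOT": 0x34,
    "TX_LENGTH": 0x38,
    "TX_EV_PENDING": 0x44,
}

def shifts(old, new):
    """Return how far each register moved (new offset minus old offset)."""
    return {name: new[name] - old[name] for name in old}

for name, delta in shifts(OLD, NEW).items():
    print(f"{name:14s} old=0x{CONTROL_BASE + OLD[name]:08x} "
          f"new=0x{CONTROL_BASE + NEW[name]:08x} shift={delta}")
```

Every listed register moves down by 8 bytes. Note also that the old LITEETH_TX_READY address (offset 0x34) is where LITEETH_TX_SLOT lives after the change, so an unpatched driver polls a different register when it waits for the ready flag.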

With this patch, you should be able to use current Zephyr with current LiteX.

However, this patch should not be mainlined in Zephyr until the reference platform is updated.
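
The change to the loop in eth_rx() can be illustrated in the same spirit. With 8-bit CSR data, a wider register is spread over several byte-wide words at a 4-byte stride, and the driver shifts them together. The sketch below mirrors that loop in Python. That the length register now spans two words instead of four is inferred from the patch rather than stated in it, and the register contents are invented for the example.

```python
def read_csr(read8, base, n_words):
    """Assemble a big-endian value from n_words 8-bit CSR words.

    Each word sits at a 4-byte stride, mirroring the loop in the patch:
        len <<= 8; len |= sys_read8(LITEETH_RX_LENGTH + i * 0x4)
    """
    value = 0
    for i in range(n_words):
        value = (value << 8) | (read8(base + i * 4) & 0xFF)
    return value

# Hypothetical register contents: a 2-word length register holding 0x0156
# (342 bytes), followed immediately by unrelated registers.
RX_LENGTH = 0x04
regs = {0x04: 0x01, 0x08: 0x56, 0x0C: 0xAA, 0x10: 0xBB}

def read8(addr):
    """Stand-in for sys_read8() backed by the dictionary above."""
    return regs.get(addr, 0)

print(hex(read_csr(read8, RX_LENGTH, 2)))  # 0x156
print(hex(read_csr(read8, RX_LENGTH, 4)))  # 0x156aabb
```

Reading four words from a two-word register pulls the bytes of the following registers into the length, which is why the loop bound has to match the register width.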

fkokosinski commented 2 years ago

The register change in LiteEth has been addressed by introducing the LiteX HAL in #45198, so I'm closing this one.