Open jeolwang opened 2 years ago
@jeolwang Do you see this error on Zephyr 3.2? I ask because I believe that https://github.com/zephyrproject-rtos/zephyr/pull/48752 might have solved this issue
I have the same issue on Zephyr 3.2 and Zephyr 3.3. Is there any progress? Can someone tell me what is or should be triggering the error?
@IbeVdV @dleach02 I am still experiencing this issue on Zephyr 3.4. Also my ethernet performance is really poor, no matter how I configure the driver.
I've sent a request to NXP for assistance. I'll post here if I find anything out.
Let me know if you have suggestions.
@jameswalmsley-cpi , We have not seen this issue, and would like to replicate it. Are you able to provide the details needed to replicate? Preferably on an NXP evaluation board. What sample shows the issue? Are there any modifications needed to the sample to expose the issue? Thanks
@DerekSnell I have the mimxrt1064_evk board. I shall try to reproduce on this board.
So far I can build the following (from zephyr/main):
west build -b mimxrt1064_evk zephyr/samples/net/zperf/
zephyr-shell
zperf udp download
linux
iperf -V -u -c fe80::4:9fff:fe39:3ca4%enp0s20f0u1u1
iperf -V -u -c fe80::4:9fff:fe39:3ca4%enp0s20f0u1u1
------------------------------------------------------------
Client connecting to fe80::4:9fff:fe39:3ca4%enp0s20f0u1u1, UDP port 5001
Sending 1450 byte datagrams, IPG target: 11062.62 us (kalman adjust)
UDP buffer size: 208 KByte (default)
------------------------------------------------------------
[ 1] local fe80::935:d1d6:7fda:bd83 port 49230 connected with fe80::4:9fff:fe39:3ca4 port 5001
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0119 sec 1.25 MBytes 1.05 Mbits/sec
[ 1] Sent 906 datagrams
read failed: Connection refused
read failed: Connection refused
read failed: Connection refused
The speed seems very slow. I have tried changing many settings, such as DTCM and hardware acceleration, and I always get a very similar result of 1.05 Mbits/sec. I get the same result on our board too.
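As a sanity check on the figures in the iperf report above, the arithmetic can be sketched in C. This is only an illustration: the function names are hypothetical, and the inputs are the values printed in the log (1450-byte datagrams, an IPG target of 11062.62 us, 906 datagrams in 10.0119 s).

```c
#include <assert.h>

/* Offered rate in bits/s implied by a payload size and an
 * inter-packet gap given in microseconds. */
double rate_from_ipg_bps(unsigned payload_bytes, double ipg_us)
{
    assert(ipg_us > 0.0);
    return (payload_bytes * 8.0) / (ipg_us * 1e-6);
}

/* Achieved rate in bits/s from the totals in the report:
 * datagram count, payload size, and elapsed seconds. */
double rate_from_totals_bps(unsigned long datagrams,
                            unsigned payload_bytes,
                            double seconds)
{
    assert(seconds > 0.0);
    return ((double)datagrams * payload_bytes * 8.0) / seconds;
}
```

With the values from the log, both functions come out close to 1.05 Mbit/s, matching the reported bandwidth. That would be consistent with the sender pacing itself at its target rate rather than the board limiting throughput: iperf 2 paces UDP traffic at a default target of about 1 Mbit/s unless `-b` is given. This is an inference from the log, not something confirmed in this thread.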
I will try to reproduce the GetRxFrameSize issue now.
Hi @jameswalmsley , Thanks for sharing this. We will test it out and see if we find similar results.
@DerekSnell I have created a PR to fix some issues with the driver. Perhaps you can help me get it into shape :) https://github.com/zephyrproject-rtos/zephyr/pull/60073
The ENET_GetRxFrameSize() errors came from the eth_mcux driver only supporting the REFCLK being generated by the RT1064 (as in the ref-board).
I have added some changes in #60073 to enable the REFCLK as input, and support configuration of both 25MHz / 50MHz crystals on the PHYs.
I've also added other changes to disable cache maintenance in the HAL driver when DTCM is used for all buffers.
Unfortunately I was not able to find the source of the performance issue.
Going to reopen this issue, as it is clearly still an issue on the platform.
Hi @jameswalmsley , It sounds like you resolved the ENET_GetRxFrameSize() errors with the REFCLK changes in your PR https://github.com/zephyrproject-rtos/zephyr/pull/60073. Since the ENET_GetRxFrameSize() errors were the original problem reported in this Issue, and you are still seeing poor performance, I created a separate Issue https://github.com/zephyrproject-rtos/zephyr/issues/60144 to continue tracking the performance problems, in case your current PR closes this Issue. Thanks for all your contributions.
Hi @jameswalmsley , For some reason, GitHub will not let me @mention you on https://github.com/zephyrproject-rtos/zephyr/issues/60144, but it will let me mention you in this issue.
Based on this comment that resolves the performance issue, does using the latest main resolve your performance issue?
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
Please also try with the new driver.
I am also experiencing this issue. I am using the new NXP "experimental" driver, with a few modifications.
We are running Zephyr 3.6.0 and have an NXP 1176 on a custom board. Also, instead of a PHY we are using the following 5-port managed Ethernet switch from Microchip (KSZ8775CLX, https://ww1.microchip.com/downloads/en/DeviceDoc/00002129C.pdf).
In our architecture, we have the 1176 MAC connected to the SW5 RMII MAC on the switch. So, I removed the PHY initialization piece of the driver (since it is mac-to-mac communication). The Ethernet switch is using an external 25 MHz crystal.
Still getting the following error, and a LOT of dropped packets (in the realm of 20-30% dropped over UDP).
[00:00:04.148,000] <err> eth_nxp_enet_mac: eth_nxp_enet_rx: ENET_GetRxFrameSize return: 4001
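For readers wondering what the numeric return value means, here is a minimal decoding sketch. It assumes the NXP HAL's usual status convention, in which a status value is built as `group * 100 + code`; the struct and function names below are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Assumption: status values follow the "group * 100 + code"
 * convention. The names in this block are illustrative only
 * and are not part of any SDK. */
struct decoded_status {
    int group; /* status group number */
    int code;  /* code within that group */
};

struct decoded_status decode_status(int status)
{
    assert(status >= 0);
    struct decoded_status d = { status / 100, status % 100 };
    return d;
}
```

Under that assumption, 4001 decodes to group 40, code 1. Which enumerator that corresponds to depends on the HAL version in use, so it is worth checking the status enum in the `fsl_enet.h` that ships with your Zephyr tree rather than relying on this sketch.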
One comment / clarification: I did try running my software on the NXP MIMXRT_1170_EVK development board (obviously I had to put the PHY initialization back in the code), and I have absolutely no issues. I don't drop packets, and I don't get any error messages on the console.
Is there something I am missing in a "MAC-to-MAC" configuration, or do you think I might have a hardware issue?
For anyone reading this, we resolved our issue.
Turns out, the 1176 was driving the 50 MHz ENET REF clock for RMII mode, and the Microchip switch was also driving the ENET REF clock, so we had a contention issue. Once the 1176 IOMUX GPR4 register was configured to accept the REF clock as an input, the errors went away and Ethernet is completely functional.
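The kind of change described above can be sketched as a read-modify-write on a general-purpose register. This is only a sketch under stated assumptions: the mask value and the polarity (bit set means the SoC drives the clock, bit clear means it accepts the clock as an input) are hypothetical placeholders, and the real field in GPR4 must be taken from the reference manual or the SoC header. The register is modeled as a plain `uint32_t` so the logic can be checked off-target.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL mask and polarity, for illustration only.
 * Take the real bit position from the reference manual. */
#define EXAMPLE_ENET_REF_CLK_DIR_MASK (UINT32_C(1) << 1)

/* Return the register value with the REF clock pin as an input. */
uint32_t refclk_dir_input(uint32_t gpr)
{
    return gpr & ~EXAMPLE_ENET_REF_CLK_DIR_MASK;
}

/* Return the register value with the REF clock pin as an output. */
uint32_t refclk_dir_output(uint32_t gpr)
{
    return gpr | EXAMPLE_ENET_REF_CLK_DIR_MASK;
}

/* Read-modify-write on a register; every other bit is preserved. */
void set_refclk_input(volatile uint32_t *gpr_reg)
{
    assert(gpr_reg != 0);
    *gpr_reg = refclk_dir_input(*gpr_reg);
}
```

On real hardware the pointer would be the address of the GPR register from the SoC header. The point of the sketch is that only one side of the RMII link should drive the reference clock, so the direction bit has to match the board's clocking scheme.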
Since multiple people who had these errors found that changing the refclk configuration fixes it, I'm going to convert this to an enhancement request to add configurability of the refclk. @jeolwang if you find you have a different issue, let us know.
System version: Zephyr 3.1.0; hardware: mimxrt1064_evk. When running the DHCP routine, the ping test is unstable. In addition, after adding socket communication logic, socket communication was unstable and slow. After sending and receiving a few KB of data, the network appeared to become unavailable and ping failed.
The following is the device running log: