Closed pfalcon closed 5 years ago
As additional info, I tried varying different things. At one point the e1000e device was actually hosed, so even after switching back to frdm_k64f it didn't work, and there was a weird error in dmesg. That caused me some confusion, but after rebooting the laptop, I have a clear, reproducible picture: a connected frdm_k64f works, while mimxrt1050_evk doesn't.
I tried another Eth cable too ;-).
Hello @pfalcon I'm using a VM to develop and test. I have a USB-Ethernet adapter that is assigned to the VM as eth1 (eth0 is NAT to the host). I'm using the adapter to connect the board to the VM. I'm testing dumb_http_server and it does not work. It looks like the Ethernet driver is not correctly initialized on i.MX RT in the current baseline. I'll fix the bug and test dumb_http_server (I need to set up DNS and NAT on eth1).
@agansari: Thanks for the info and confirmation, looking forward to a fix.
I'll fix the bug and test dumb_http_server (i need to setup DNS and NAT on eth1).
Note that dumb_http_server doesn't access the Internet in any way, so DNS or NAT should not be needed. dumb_http_server is, well, a simple web server. If everything runs OK, you can just access it from a desktop browser at http://192.0.2.1:8080/ . I use the Apache Bench (ab) tool on Linux, because I usually test that a large number of HTTP requests can be handled without anything going wrong (e.g. 1000 requests in a row).
Hello, today I rebased on the latest firmware and Ethernet works as expected: initialization goes okay, the LEDs blink as they should, and I also ran the benchmark:
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 192.0.2.1 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:
Server Hostname: 192.0.2.1
Server Port: 8080
Document Path: /
Document Length: 2122 bytes
Concurrency Level: 1
Time taken for tests: 5.996 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 2181000 bytes
HTML transferred: 2122000 bytes
Requests per second: 166.77 [#/sec] (mean)
Time per request: 5.996 [ms] (mean)
Time per request: 5.996 [ms] (mean, across all concurrent requests)
Transfer rate: 355.21 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 0.2 1 3
Processing: 4 5 0.4 5 8
Waiting: 3 5 0.4 5 8
Total: 4 6 0.3 6 9
Percentage of the requests served within a certain time (ms)
50% 6
66% 6
75% 6
80% 6
90% 6
95% 6
98% 7
99% 7
100% 9 (longest request)
Can you have a go @pfalcon ? Latest commit I tested on is: 2d6a226b2e59c52449b2d87add8fe3061e83bdb9
@agansari: Thanks for the update and detailed info. Not sure I'll get to it today, but will test it, thanks for the exact rev to use, to be on the same line.
@agansari: I'm afraid it also does not work for me. I based on commit 2d6a226, built the dumb_http_server and there is no Ethernet activity, both leds are permanently on, no blinking. It does not generate any response to ARP broadcasts for its IP address. If I build and run the echo_server sample, this shows the same behaviour, however it sends a broadcast ARP request when I ping out from the board using its uart net console (it ignores the response from the target though).
After some debugging it seems that ENET_ReceiveIRQHandler() is never called from eth_mcux_dispacher_isr() in drivers/ethernet/eth_mcux.c, although when I ping out, ENET_TransmitIRQHandler() is called. So it looks as if the common EMAC interrupt is set up OK but is not being activated for received frames.
Not sure where to go from here, finding it a bit difficult to fully understand the code.
I have confirmed that the hardware is OK by loading a MCUXPresso UDP echo server demo project which works fine.
@jeremy-e-mills have you cleaned the build folder workspace after rebase? Sounds like the old behavior.
cd ~/zephyr/samples/net/sockets/dumb_http_server/build
cd .. && rm -rf build && mkdir build && cd build
cmake -GNinja -DBOARD=mimxrt1050_evk ..
ninja debug
Works on my setup; if there are more issues, there may be something with my setup.
@agansari: Yes, I cloned a fresh repository and also am building out of the zephyr tree as am using Eclipse for debug builds. I have deleted my build directories a few times and started again during the investigation. The same behaviour is observed with normal and Eclipse project builds, running XIP from hyperflash and SRAM.
If it's my setup then I'm currently out of ideas. The samples run fine on the K64 board.
@jeremy-e-mills I only tested with code in TCM memory, don't think XIP works at all. Does your code reach: mimxrt1050_evk_init() ?
@agansari: Yes, it does, and I have stepped through the pinmux setup for the EMAC connections. Sorry, I perhaps should not have confused things by mentioning XIP. When I want this I cheat by building for XIP in hyperflash with a 0x2000 text section offset. I then replace the first 0x2000 bytes with the contents of an XIP header binary file that I created by stripping the first 0x2000 bytes from an XIP MCUXPresso project. I had to do this to run the echo server demo with additional debug enabled, because with those options enabled it won't fit into the default 128KB of code-optimised RAM. My actual debugging has been from RAM using Eclipse.
@jeremy-e-mills: Just wanted to mention that I find your comments insightful, thanks for mentioning XIP and otherwise giving detailed info on the steps you take. I'll try to join the debugging fun as soon as I can. I feel a bit tired today, but hope to get to it tomorrow.
@pfalcon: Hi, glad to have someone else looking into this. I should have contributed earlier but have been a bit reluctant as I'm a newbie to Zephyr. It's all very strange. From the debug output the PHY looks happy: it negotiates 100M full duplex, can see the Ethernet cable removed and replaced, and can send ARP packets, but it just does not see any incoming traffic, and there is no LED activity. It's so mad that I have been suspecting my build environment, but I have ruled everything out (I think!). Will get onto it again tomorrow.
Latest commit I tested on is: 2d6a226
Ok, tested this now, and get the same picture as I described above: my laptop's Ethernet link active LED doesn't light up, i.e. it doesn't think there's a connection carrier.
@jeremy-e-mills : What about you in this case? Can you confirm that the peer sees cable connection between itself and the board? And can you describe your test setup?
So, what I'm doing is testing using samples/net/echo_server with ipv6 disabled. I then enabled:
+CONFIG_ETHERNET_LOG_LEVEL_DBG=y
+CONFIG_ETH_MCUX_PHY_EXTRA_DEBUG=y
And I saw log output below. And then suddenly I noticed that link LED on laptop is up! Still no pings though.
uart:~$ ***** Booting Zephyr OS zephyr-v1.13.0-2149-g2d6a226b2e *****
[00:00:00.031,856] <dbg> eth_mcux.eth_0_init: MAC 00:04:9f:49:38:be
[00:00:00.031,856] <dbg> eth_mcux.eth_mcux_phy_start: phy_state=initial
[00:00:00.031,909] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=initial
[00:00:00.031,960] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=reset
[00:00:00.032,013] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=autoneg
[00:00:00.032,066] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=restart
[00:00:00.032,117] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:01.101,444] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:01.101,495] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:02.110,005] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:02.110,056] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:03.120,006] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:03.120,057] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:04.130,005] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:04.130,058] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:04.130,111] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-duplex
[00:00:04.130,111] <inf> eth_mcux.eth_mcux_phy_event: Enabled 100M full-duplex mode.
[00:00:05.140,006] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:05.140,059] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:06.150,007] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:06.150,060] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:00.001,723] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
Btw, this is the output I get:
[00:00:06.150,060] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:00.001,723] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
I would imagine there's a timing source bug somewhere, can you look into that, @agansari ?
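One possible explanation for the timestamp jump, offered purely as an assumption and not confirmed anywhere in this thread: if the log timestamp were derived from a 32-bit cycle counter running at 600 MHz, it would wrap after 2^32 / 600e6, roughly 7.158 s. The sketch below models that; the function name and the 600 cycles-per-microsecond figure are hypothetical.

```c
#include <stdint.h>

/*
 * Model of a microsecond timestamp derived from a free-running 32-bit
 * cycle counter. cycles_per_us = 600 (a 600 MHz clock) is an assumption,
 * not something stated in this thread.
 */
uint32_t wrapped_timestamp_us(uint64_t true_us, uint32_t cycles_per_us)
{
	uint64_t cycles = true_us * cycles_per_us;
	uint32_t counter = (uint32_t)cycles; /* truncation models the 32-bit wrap */

	return counter / cycles_per_us;
}
```

Under that assumption, an event at 7.160006 s (one poll period of about 1.01 s after the last good line) would be printed as 00:00:00.001,727, which is within a few microseconds of the 00:00:00.001,723 seen in the log.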
@agansari : Btw, this else-if ladder doesn't look too right to me, what if there're multiple interrupts to serve, why make it call ISR again instead of serving all in one go?
if (EIR & (kENET_RxBufferInterrupt | kENET_RxFrameInterrupt)) {
	ENET_ReceiveIRQHandler(ENET, &context->enet_handle);
} else if (EIR & (kENET_TxBufferInterrupt | kENET_TxFrameInterrupt)) {
	ENET_TransmitIRQHandler(ENET, &context->enet_handle);
} else if (EIR & ENET_EIR_MII_MASK) {
	k_work_submit(&context->phy_work);
	ENET_ClearInterruptStatus(ENET, kENET_MiiInterrupt);
} else if (EIR) {
	ENET_ClearInterruptStatus(ENET, 0xFFFFFFFF);
}
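To make the question about the else-if ladder concrete, here is a minimal, self-contained sketch that contrasts the two dispatch shapes. It does not use the MCUX HAL: the EVT_* bits, the isr_stats counters and both function names are hypothetical stand-ins, and the counters only record which branch would have run.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the ENET event bits (not the real HAL values). */
#define EVT_RX  (1u << 0)
#define EVT_TX  (1u << 1)
#define EVT_MII (1u << 2)

struct isr_stats {
	unsigned rx, tx, mii, other;
};

/* Ladder shape: at most one source is served per call; any other
 * pending flag has to wait for the ISR to be entered again. */
void dispatch_ladder(uint32_t eir, struct isr_stats *s)
{
	if (eir & EVT_RX) {
		s->rx++;
	} else if (eir & EVT_TX) {
		s->tx++;
	} else if (eir & EVT_MII) {
		s->mii++;
	} else if (eir) {
		s->other++;
	}
}

/* Independent checks: every pending source is served in one pass. */
void dispatch_all(uint32_t eir, struct isr_stats *s)
{
	if (eir & EVT_RX) {
		s->rx++;
	}
	if (eir & EVT_TX) {
		s->tx++;
	}
	if (eir & EVT_MII) {
		s->mii++;
	}
	if (eir & ~(uint32_t)(EVT_RX | EVT_TX | EVT_MII)) {
		s->other++;
	}
}
```

With both RX and TX pending, the ladder variant counts only the RX event on the first call, while the second variant counts both. Whether the extra ISR entry matters in the real driver is not established in this thread.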
eth_mcux_dispacher_isr() in drivers/ethernet/eth_mcux.c, although when I ping out, ENET_TransmitIRQHandler() is called, so it looks as if the common EMAC interrupt is set up OK but not being activated for received frames.
I confirm that I see this behavior too.
And I confirm that I see the ARP request from the imx in wireshark on the host side, but the imx doesn't see the reply:
84 1037.105739249 Freescal_49:38:be Broadcast ARP 60 Who has 192.0.2.2? Tell 192.0.2.1
85 1037.105757409 WistronI_c8:94:35 Freescal_49:38:be ARP 42 192.0.2.2 is at 3c:97:0e:c8:94:35
So, link status goes up and down erratically for me - during startup of new debugging session, and not too often. For example, just got it down and can't recover so far.
Was able to recover. So again, just works erratically w/o too clear pattern.
The problem is with the PHY setup. I disabled the PHY reset in mimxrt1050_evk_init() and re-built. I then flashed a working Ethernet MCUXPresso project into hyperflash, confirmed LED activity and that it responded to pings. I then debugged the echo server sample project (which now no longer resets the PHY at start). The Ethernet now functions correctly: LED activity, ARP responses, etc. are all OK.
So, two things:
- The configuration of the PHY at startup does not break it if it was previously working.
- The configuration of the PHY from its reset state is not working.
@agansari: Why do you set GPIO1.10 to be an output in mimxrt1050_evk_init()? It's the INTRP output from the PHY.
@jeremy-e-mills
GPIO1.10 - ENET_INT from PHY to MCU (GPIO_AD_B0_10)
GPIO1.9 - ENET_RST from MCU to PHY (GPIO_AD_B0_09)
Let me understand: you don't pull the reset now, and the PHY will work if it was previously correctly enabled by another firmware? If you run a demo that does not enable the PHY before running the sample project, does it still work? I'll try this out as well, using MCUXPresso demos that do enable/disable the PHY and then running the dumb http sample.
@jeremy-e-mills I've found out why we have different results.
Scenario 1: power up, the enet demo in hyperflash (txrx_transfer) turns eth on, the Zephyr sample in iTCM uses the already configured PHY and works.
Scenario 2: power up, the default demo in hyperflash (bubbles, I think) does not turn eth on, the Zephyr sample in iTCM uses the unconfigured PHY and does not work.
I think this is why my results differed from yours and @pfalcon's: I had a working PHY when I powered the board. It looks like the PHY reset does not clear the configuration if the board was already powered and configured; I ran the default build. Do your tests confirm this?
@jeremy-e-mills, @agansari: Idea that something which runs before a Zephyr app affects/interferes with PHY setup seems plausible.
FWIW, My SW7 DIP is set as: 0101. That's not the default which was set after I received board, I played with switches while I was trying to get mbed bootloader's USB mass-storage mode work.
I was a bit mistaken earlier. I still have the working ethernet demo code in hyperflash, so the ethernet is active on boot. I re-enabled the PHY reset in the zephyr sample and ran it from iTCM and the ethernet still works.
Now will remove the code from hyperflash, power cycle board and run the zephyr code again. I expect that the ethernet will not work.
As expected, with blinky running out of hyperflash on boot, PHY does not work with subsequent ethernet iTCM code.
@jeremy-e-mills: Good investigation work. So I guess, now it's time to think what's wrong with PHY initialization on mimx.
You guys have access to mcuxpresso samples and probably can compare stuff. I however would like to pose a question to @agansari: Why do we have #ifdef'ed differences in PHY initialization and handling on frdm_k64f vs mimxrt1050_evk, given that PHYs are the same?
@jeremy-e-mills thank you for testing and debugging the issue; it showed what's wrong with the current build.
@pfalcon partially yes, look for the KINETIS and IMX defines in eth_mcux.c. Even if the part is similar, they are wired differently: on Kinetis the PHY clock source is an xtal, while on i.MX the MCU is the clock source for the PHY. Kinetis runs mostly on auto, while i.MX requires more PHY configuration, and this is what's missing from my pull request. PHY configuration work in progress...
@agansari:
Ok, let's look at specifics: https://github.com/zephyrproject-rtos/zephyr/blob/master/drivers/ethernet/eth_mcux.c#L269
Why do you request read, but don't read value? Can MII controller get upset about that? Then the write is apparently clock source difference you're talking about.
But if they have different clksrc, they both still need to have it set up. Let's not rely on the default config, let's initialize stuff explicitly, let's not have conditionals on those paths, which lead to errors. The only conditional we'd have is the actual initialization value to use for one vs another board.
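The suggestion above (one unconditional init path, with only the value differing per board) could look roughly like the sketch below. Everything in it is hypothetical: the register index, the two example values, the mock MDIO bus and all names are illustrative and are not taken from eth_mcux.c or from a datasheet.

```c
#include <stdint.h>

#define PHY_REG_COUNT 32u
#define PHY_CTRL2_REG 0x1Fu /* illustrative register index */

/* The only per-board difference is the value to write. */
struct phy_board_cfg {
	uint16_t ctrl2_init;
};

/* Illustrative values only. */
const struct phy_board_cfg cfg_board_a = { 0x0000u };
const struct phy_board_cfg cfg_board_b = { 0x0080u };

/* Mock MDIO bus so the sketch can run without hardware. */
struct mock_mdio {
	uint16_t regs[PHY_REG_COUNT];
	unsigned writes;
};

void mdio_write(struct mock_mdio *bus, uint8_t reg, uint16_t val)
{
	bus->regs[reg % PHY_REG_COUNT] = val;
	bus->writes++;
}

/* Same code path for every board: the register is always written
 * explicitly instead of relying on its reset default. */
void phy_init(struct mock_mdio *bus, const struct phy_board_cfg *cfg)
{
	mdio_write(bus, PHY_CTRL2_REG, cfg->ctrl2_init);
}
```

A board would then pick its configuration once, for example phy_init(&bus, &cfg_board_b), and the init code itself would carry no #ifdef.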
@pfalcon @jeremy-e-mills hello, after investigating what is wrongly initialized in the PHY, finally found some shared pins that are wrongly initialized. Pushed a fix, more details in the commit message.
Also when testing in my VM+usb-eth adapter the result of the benchmark varies greatly depending on VM usb configuration. If you're not testing on a VM don't think it's an issue.
@agansari Hi, I have tested the fix and I confirm that the Ethernet now functions at startup. After a bit of debugging I notice that the root cause of the issue was that GPIO_AD_B0_10 was being held low because the pinmux had not been setup for that pin. As I said earlier in this thread, I do not know why you are driving this line high as it is pulled up by R309 which provides the high state required when the PHY starts up. After start up this functions as the ENET_INT output from the phy (which we don't use as the driver is polling the registers for the phy status). As a test, I modified the code so that GPIO_AD_B0_10 is configured as input and commented out the GPIO_WritePinOutput high on that pin and the Ethernet starts up OK. It's fine to leave things as they are but we should be aware that if the phy interrupt is ever to be used, then this port will have to be reconfigured as an input.
@agansari A further thought, I would think that if the phy ever drives this line (INTRP/) low and the processor is holding it high, then there is the possibility that its output FET will be damaged.
@jeremy-e-mills Thank you for looking into this issue. Yes, the PHY's INTRP is never used; we never set register 1Bh inside the KSZ8081RNB PHY, so it always starts with interrupts disabled. Both GPIO_AD_B0_9/10 are set as outputs. The PHY's INTRP pin is also NAND_TREE#, which is activated when pulled down. The MCU's and the PHY's pins also have internal pull-up resistors. When testing as input, did you power down the board with a clean program memory (I remember you used flash programming), so that the first pull-up on the PHY is never run with the GPIO-as-input code?
@agansari Yes, my code with the port as an input has been running in hyperflash and the Ethernet starts successfully after a power cycle. My point is that the imx processor port should be configured as an input and driving it high is unnecessary and conflicts with the mode of the PHY pin (INTRP/) after its reset is complete.
@jeremy-e-mills Yes, you are right, I've also tested with the pin set as input today, it works... but I wouldn't make this change for maintainability purposes (shared header files with MCUXpresso and their demos). We should make this change only if we ever implement PHY interrupt mechanisms (PHY register 1Bh is currently not implemented in phy's header).
@agansari OK, thanks for the work, I consider this issue fixed.
One other thing not related to this issue, but is relevant to your IMX ethernet work:
I am currently running the gptp sample but in order to build it I found that I had to make a couple of modifications so that the mcux PTP timestamping specific code builds for the IMX processor, as follows:-
zephyr_library_compile_definitions_ifdef(
  CONFIG_PTP_CLOCK_MCUX ENET_ENHANCEDBUFFERDESCRIPTOR_MODE
)
into ext/hal/nxp/mcux/drivers/imx/CMakeLists.txt (as is done in ext/hal/nxp/mcux/drivers/kinetis/CMakeLists.txt for K64 processor)
into soc/arm/nxp_imx/rt/dts_fixup.h (similar to same #define in soc/arm/nxp_kinetis/k6x/dts_fixup.h for k64 proc)
I don't know if you want to make these changes as part of this issue push or if another bug should be raised for these changes.
Thanks.
So, testing https://github.com/zephyrproject-rtos/zephyr/pull/11882. Adding comments here, because well, it doesn't work OOB still, so more fixes would be required. So far, I didn't have eth cable connected, started echo_server via gdb, and then plugged in the cable. The connection isn't detected for mimx or the host.
So, it looks like the erratic link up/link down situation described in https://github.com/zephyrproject-rtos/zephyr/issues/11586#issuecomment-442196399, etc. is still there.
Nothing helps so far. But I was able to reproduce the following 2 times in a row on app start:
***** MPU FAULT *****
Data Access Violation
MMFAR Address: 0x77f1
***** Hardware exception *****
Current thread ID = 0x2000363c
Faulting instruction address = 0x171d6
Fatal fault in ISR! Spinning...
But I was able to reproduce the following 2 times in a row on app start:
Well, that started when I tried different eth cable. Seemed stable, until I disconnected that cable and connected it again.
Overall, 10 tries - stop debugging, start again, power on board with cable connected, etc, - nothing helped so far to get activity LED on laptop side (seemed to help previously).
Well, that started when I tried different eth cable. Seemed stable, until I disconnected that cable and connected it again.
Yeah, seems to be well reproducible/repeatable, with the 1st cable too.
Now I really start to think that all these problems I experience are due to insufficient power from microUSB (I already tried a different USB cable). Will see if I'm able to find a suitable power supply with a barrel connector.
Ok, so what I did was take an external USB power supply (an Amazon Kindle one, known to be pretty good) and connect it to the J9 microUSB connector on the board, which seems to be the dedicated microUSB for just power input. Before that, I switched J1 DIP to the 3-4 position. All of that, of course, following IMXRT1050EVKBHUG.pdf .
Nothing. The link indicator is dead on my laptop.
Swapping in frdm_k64f with all the same cables just works at once.
Huh, I reverted to master (from https://github.com/zephyrproject-rtos/zephyr/pull/11882). LED is up at once!
Pings don't work.
Summary: https://github.com/zephyrproject-rtos/zephyr/pull/11882 deterministically held link down for me.
So, now I got the link up. Let's see if #11882 will be able to put it down. Well, yeah, the link goes down actually when GDB starts to upload the code to the board, and stays down.
@jeremy-e-mills I know about the gPTP problem on this board but forgot to create an issue, so please create one. dts_fixup needs a different macro, and I need to look a bit into what changes need to be made. I had a similar issue on K64F.
@pfalcon haven't tested link-up/down on this board, let me see what the issue is.
@pfalcon I've tested 3 samples: dumb_http_server, echo_client and echo_server.
I've rebased on the latest commits and code size is smaller, so I can run the echo_ samples from iTCM. dumb_http_server and echo_client work fine: I could pull the cable in and out, the link works okay, ping also works, and I power cycled the board several times with no problems.
echo_server does not initialize okay; the link LEDs remain lit, so I didn't get to testing link-up/down.
Can you test echo_client, if you have some spare time?
@agansari: Which code were these latest tests run against?
As for me, let me just summarize the situation:
So, it doesn't make sense to test sample like echo_client - physical level of Ethernet connection doesn't work for me.
I'm not sure how to get forward with this. The only way I see is to target 100% reproduction of the setup, down to each jumper configuration, down to type of each connection, cable, power supply, etc.
@pfalcon Well, it's something other than PHY initialization after the patch. PHY works on my side after patch in dumb_http_server sample, but looks like you have trouble with echo_server only, is this the case?
Going further with what I observed, looking at the diffs between echo_server and echo_client, I've narrowed it down to CONFIG_NET_CONTEXT_NET_PKT_POOL: something breaks the driver/stack when this option is set.
@agansari
Well, it's something other than PHY initialization after the patch. PHY works on my side after patch in dumb_http_server sample, but looks like you have trouble with echo_server only, is this the case?
Gotcha, you want me to see if something else may be involved. Started this week with digging my development backlog, before holidays suddenly strike, but testing the above in my queue for this week. Thanks.
Describe the bug
This is a continuation of the discussion at (merged/closed) https://github.com/zephyrproject-rtos/zephyr/pull/10875#issuecomment-440625614 .
I cannot get an Ethernet connection to BOARD=mimxrt1050_evk. When I connect a patch cord between the mimxrt1050_evk and my laptop, the "link active" LED on my laptop side does not light up, i.e. it behaves as if the cable wasn't connected. Of course, the network interface in Linux doesn't have "RUNNING" status in ifconfig. As discussed at the link above, on the mimxrt1050_evk both Etherjack LEDs are lit and never blink.
This same setup works as expected with frdm_k64f. I.e. if I just swap mimxrt1050_evk for frdm_k64f, leaving the same USB and Ethernet cables, it works; switching back, it doesn't; again frdm_k64f works, and back on mimxrt1050_evk it doesn't.
To Reproduce
I'm using dumb_http_server as a reference sample to run.
Environment (please complete the following information):
My laptop is a Thinkpad X230, with the e1000e driver for Ethernet, running Ubuntu 18.04:
@agansari, can you please describe your test setup in detail, i.e. what is connected where, etc.