adafruit / Adafruit_CircuitPython_Wiznet5k

Pure-Python interface for WIZNET 5k Ethernet modules

Wiznet Ethernet Hat - Will not set up DHCP #53

Open cascmptrski opened 2 years ago

cascmptrski commented 2 years ago

Using the WizNet Ethernet Hat for the Pi Pico, I am trying the example software with CircuitPython. The ping test works fine, as does the fixed IP. However, when set up to use DHCP, it times out without ever getting an address.

Turning on debug shows that it is sending, but it fails the assertion at line 202 in adafruit_wiznet5k.py: "Failed to configure DHCP Server"

The debug message prior to the failure is: * socket_available called on socket 0, protocol 2

First, a question: the failure message seems to imply that the system is trying to set up a DHCP server, yet the documentation seems to imply that the board is acting as a DHCP client. Is this just a confusing error message, or should this be trying to function as a DHCP server?

I tried extending the timeout in the wiznet5k library, but that made no difference.

The basic code is as per the example:

MY_MAC = (0x00, 0x01, 0x02, 0x03, 0x04, 0x05)

# Initialize ethernet interface with DHCP
eth = WIZNET5K(spi_bus, cs, is_dhcp=True, mac=MY_MAC, debug=False)

The only change I made to this was to set debug to True with the result shown above.

Suggestions would be appreciated as to either where to look for the problem or possible workarounds.

Thanks, -CAS

anecdata commented 2 years ago

DHCP works in other environments on the WizNet Ethernet Hat, so I suspect there is some difference in the network, could even be something in-spec that isn't handled in the library. Can you post more of the debug trace, and link to the example code?

anecdata commented 2 years ago

You can also initialize with debug on and get more detailed tracing of the code: eth = WIZNET5K(spi_bus, cs, is_dhcp=True, mac=MY_MAC, debug=True)

infamy commented 2 years ago

I seem to be able to reproduce, adding the debug output.

code.py output: Hard reset... Init... My Link is: 1

Code done running.

infamy commented 2 years ago

When I swapped it over to a hub to start doing some packet captures, it turned out that it works on a hub but not on a switch. The switch is 100/1000 auto MDI-X (it's an older 8-port Ubiquiti with PoE). So something odd seems to be happening.

If I hardcode the IP address, some communication seems to work (ping, etc.).

infamy commented 2 years ago

So this may be related to spanning tree: disabling it on my switch seems to resolve the issue. That suggests something may be causing unwanted packet collisions, and spanning tree was cutting the unit off because of them.

anecdata commented 2 years ago

Similar observation here. I ordinarily have a dumb switch closest to the W5100S, which seems to mask the issue. My main switch (next hop) has "loop prevention" enabled. But if I take the dumb switch out of the equation, I get the same DHCP failure.

anecdata commented 2 years ago

From a little light reading, Spanning Tree Protocol (STP) is used when there are redundant links (intentional or accidental) between switches, to avoid broadcast loops. But even with STP enabled, there should always be one active path, so it's not clear why the DHCP broadcast doesn't pass through.

cascmptrski commented 2 years ago

I am using a newer Ubiquiti 8-port switch in its stock configuration. One interesting thing is that when I change the command to:

MY_MAC = (0x00, 0x01, 0x02, 0x03, 0x04, 0x05)

# Initialize ethernet interface with DHCP
eth = WIZNET5K(spi_bus, cs, is_dhcp=True, mac=MY_MAC, debug=True)

I do not get the output shown above by infamy. Instead I just get * socket_available called on socket 0, protocol 2 repeated again and again streaming down the page until it times out.

anecdata commented 1 year ago

@cascmptrski @infamy Would you mind re-testing? There have been substantial changes to this library, including DHCP handling.

infamy commented 1 year ago

Sorry for the delay. I can confirm that this is resolved with the updates.

Chip Version: w5100s
MAC Address: ['0xde', '0xad', '0xbe', '0xef', '0xfe', '0xed']
My IP address is: 192.168.x.142
IP lookup adafruit.com: 104.20.38.240

At least on the W5100S. The Pico EVB board definitions in CircuitPython leave a lot to be desired (MISO, MOSI, SCK, and CS are not set in the board file even though they are hard-wired to the W5100S), but that is an issue for elsewhere.

anecdata commented 1 year ago

Thanks for checking. I'll close this for now. @cascmptrski if you are still having the issue with the latest release we can re-open.

Feel free to open an issue or PR in circuitpython to improve the board definition.

tyeth commented 1 year ago

Testing the Wiznet w5500-evb-pico: ping works, DHCP doesn't. Running with debug, I get the socket message until timeout. My device is hardwired directly to a VirginMedia Hub5 (1 Gbps box for UK cable internet). CircuitPython 8.2.0


Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
Wiznet5k Ping Test (no DHCP)
Ethernet link is down…
* Initializing DHCP
Initialising DHCP client instance.
Requesting DHCP lease.
DHCP FSM called with blocking = True
FSM initial state is 1
FSM state is INIT.
Resetting DHCP state machine.
Releasing socket.
  Socket released.
Setting up connection for DHCP.
*** Get socket.
Allocated socket # 0
+ Connection OK, port set.
Incrementing transaction ID
FSM state is SELECTING.
Processing SELECTING or REQUESTING state.
Generating DHCP message type 1
0000   01 01 06 00 42 a5 26 bd 00 00 00 00 00 00 00 00    ....B.&.........
0010   00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 03    ................
0020   04 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0090   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00a0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00b0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00c0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00d0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00e0   00 00 00 00 00 00 00 00 00 00 00 00 63 82 53 63    ............c.Sc
00f0   35 01 01 0c 12 57 49 5a 6e 65 74 30 30 30 31 30    5....WIZnet00010
0100   32 30 33 30 34 30 35 3d 07 01 00 01 02 03 04 05    2030405=........
0110   37 03 01 03 06 33 04 00 76 a7 00 ff                7....3..v...
Calculating next retry time and incrementing retries.
Receiving a DHCP response.
socket_available called on socket 0, protocol 2
socket_available called on socket 0, protocol 2
socket_available called on socket 0, protocol 2
socket_available called on socket 0, protocol 2
socket_available called on socket 0, protocol 2
socket_available called on socket 0, protocol 2
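
For anyone reading the hex dump above, the option block can be decoded with a short script. This is a sketch in plain Python, not library code: `parse_dhcp_options` and `TAIL_HEX` are names made up for this example. It only covers the bytes from offset 0x00ec onward, copied from the dump: the magic cookie followed by the options.

```python
# Illustrative decoder for the tail of the DHCP message dumped above.
# Not library code: parse_dhcp_options and TAIL_HEX are hypothetical names.

MAGIC_COOKIE = bytes.fromhex("63825363")

# Bytes from offset 0x00ec to the end of the dump, copied from the log.
TAIL_HEX = (
    "63 82 53 63"
    " 35 01 01 0c 12 57 49 5a 6e 65 74 30 30 30 31 30"
    " 32 30 33 30 34 30 35 3d 07 01 00 01 02 03 04 05"
    " 37 03 01 03 06 33 04 00 76 a7 00 ff"
)


def parse_dhcp_options(tail):
    """Return {option_code: value_bytes} for a cookie-prefixed option block."""
    if tail[:4] != MAGIC_COOKIE:
        raise ValueError("missing DHCP magic cookie")
    options = {}
    index = 4
    while index < len(tail):
        code = tail[index]
        if code == 255:  # End option terminates the block
            break
        if code == 0:  # Pad option is a single byte with no length field
            index += 1
            continue
        length = tail[index + 1]
        options[code] = tail[index + 2 : index + 2 + length]
        index += 2 + length
    return options


options = parse_dhcp_options(bytes.fromhex(TAIL_HEX))
message_type = options[53][0]  # option 53: DHCP message type
hostname = options[12].decode("ascii")  # option 12: host name
client_mac = options[61][1:]  # option 61: hardware type byte, then MAC
requested = list(options[55])  # option 55: parameter request list
lease_seconds = int.from_bytes(options[51], "big")  # option 51: lease time

print("type:", message_type)
print("hostname:", hostname)
print("client MAC:", ":".join("%02x" % octet for octet in client_mac))
print("requested options:", requested)
print("requested lease (s):", lease_seconds)
```

Decoded this way, the message is a DHCPDISCOVER (type 1) with hostname WIZnet000102030405, a client identifier carrying the 00:01:02:03:04:05 MAC, a request for subnet mask, router, and DNS (options 1, 3, 6), and a requested lease of 7776000 seconds (90 days). So at least the option block looks well-formed.
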

tyeth commented 1 year ago

I now believe it's just far from reliable. I seem to get a message sometimes immediately upon the link going up, and other times not at all. Longer sleeps seem to help but effectively it's just eventually retrying or bringing the link down and up again. Not much useful to say except once it works it's reliable at getting an IP across reloads as long as power is not dropped, so I'm curious whether the STP talk above meant that the users were never able to get DHCP to work, or mostly not.

If I unplug it, then upon replugging it sometimes takes until the third reload (two timeouts) before I get an IP; after that I get one reliably on every reload. After unplugging it for a minute, it got an IP immediately on the first connection attempt. There's more rhyme than reason. I think I will just have to program defensively when obtaining connectivity. Log of that unplug for 1 minute, then immediate success, and a simple example are provided: wiznet-w5550-evb-rp2040-pico.txt


Just to add that watching the lights, it seems there are no packets when it times out most of the time (no orange activity light), but sometimes there are flashes of the activity light and the debug text doesn't acknowledge anything and isn't sending at the time. It would be nice if it logged all data received in debug mode even if dropped multicast packets or whatever.

anecdata commented 1 year ago

Trying EVB 5500 with CP 8.2.0 and adafruit-circuitpython-wiznet5k-py-3.0.0, DHCP seems fine (sometimes get DNS errors though). Haven't been able to reproduce so far. Running this code: https://gist.github.com/anecdata/fbc57e96c8328127b791ebaf4a1a8e8c

output...
```
code.py output:
Hard reset...
Init...
Library Version: 3.0.0
Chip Version: w5500
MAC Address: DE:AD:BE:EF:FE:ED
IP Address: 192.168.6.250
ifconfig...
IP Address: 192.168.6.250
Subnet Mask: 255.255.252.0
Gateway: 192.168.4.1
DNS Server: b'\x01\x01\x01\x02'
Link Status: True
Fetching text from http://wifitest.adafruit.com/testwifi/index.html
200
bytearray(b'OK')
{'server': 'nginx/1.18.0 (Ubuntu)', 'content-type': 'text/html', 'date': 'Thu, 20 Jul 2023 20:14:55 GMT', 'last-modified': 'Thu, 09 Dec 2021 17:26:22 GMT', 'connection': 'keep-alive', 'content-length': '69', 'etag': '"61b23c3e-45"', 'accept-ranges': 'bytes'}
This is a test of Adafruit WiFi!
If you can read this, its working :)
0.960999s
```

P.S. network: EVB is wired to a Netgear (PoE managed) switch, then to a larger Netgear (managed) switch, then to a Peplink router that runs the DHCP.
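
The DNS Server value in the output above is printed as raw bytes rather than dotted-decimal. A small helper makes it readable. This is a plain-Python sketch, and `dotted_quad` is a name made up for this example.

```python
def dotted_quad(raw):
    """Format a 4-byte IPv4 address as dotted-decimal text."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return ".".join(str(octet) for octet in raw)


# The DNS server value exactly as it appears in the output above.
print(dotted_quad(b"\x01\x01\x01\x02"))
```

For the value shown above, this gives 1.1.1.2.
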

anecdata commented 1 year ago

Since the library has changed substantially since this issue was initially filed, and the changes seem to have corrected at least some causes, I'd suggest opening a new issue. Discussions on closed issues can get lost more easily.

anecdata commented 1 year ago

P.P.S. W5100S, W5500, and W6100 EVB-Picos all got the same result.

tyeth commented 1 year ago

It's curious; I'll take it to a new issue. The main thing I'm noticing is that doing a long reset pause and then passing the reset pin to init (which does a short pause) reliably restarts the link just as init is about to listen on it. Without the second reset / brief drop, DHCP is unreliable. I was also getting hung up trying to change the hostname with set_dhcp and crashing there too, which needlessly complicated matters (I was toggling DHCP on and off in init while also calling set_dhcp).


anecdata commented 1 year ago

On second thought, after more testing, let's re-open this issue for now: the behavior seems to be network-dependent as in some of the early comments.

In my case, though I can't readily explain it, the following scenarios work:

the following scenario doesn't work *:

*

code.py output:
Hard reset...
Init...
Traceback (most recent call last):
  File "code.py", line 58, in <module>
  File "adafruit_wiznet5k/adafruit_wiznet5k.py", line 260, in __init__
  File "adafruit_wiznet5k/adafruit_wiznet5k.py", line 274, in set_dhcp
  File "adafruit_wiznet5k/adafruit_wiznet5k_dhcp.py", line 184, in request_dhcp_lease
  File "adafruit_wiznet5k/adafruit_wiznet5k_dhcp.py", line 454, in _dhcp_state_machine
  File "adafruit_wiznet5k/adafruit_wiznet5k_dhcp.py", line 381, in _handle_dhcp_message
  File "adafruit_wiznet5k/adafruit_wiznet5k_dhcp.py", line 309, in _receive_dhcp_response
TimeoutError: No DHCP response received.

This could be related, or a different issue, I'm really not sure. I've verified that there are no wiring loops in my network.

edit: I have tons of other devices with working DHCP connected like the non-working scenario above: "device --> managed switch 2 --> router". So this seems CP-specific.

tyeth commented 1 year ago

The only way I've ever looked at such things was to run Wireshark on my laptop while sharing the internet over its Ethernet port, but that affects things. I can swing the Linux hammer at things to probe, etc., but I'm no managed-switch expert, and certainly not one with network tracing hardware; my budget ends at hobbyist equipment. If there are useful things to do detection-wise, let me know (the best I could think of was checking routes from the same device on each LAN/router and then getting some network promiscuity going). I'll come back to this at some point.