MEGA65 / mega65-core

MEGA65 FPGA core
ETH controller stops sending packets sometimes #708

Closed ki-bo closed 11 months ago

ki-bo commented 1 year ago

Describe the bug When using mega65_ftp over Ethernet, transfers work fine at first, but sometimes packets sent by the MEGA65 are no longer seen on the wire. The PC then re-transmits packets in an endless loop because it no longer receives its ACK packets.

With the latest version of mega65_ftp in development, there is a timeout implemented. If the PC no longer gets any response from the MEGA65, it sends a special "ETH reset request" packet and the MEGA65 resets the ETH controller. This seems to help a bit, but the issue occurs again after some time.
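The timeout workaround described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual mega65_ftp code: the names `eth_watchdog_t`, `watchdog_poll`, and `TX_ACTION_SEND_RESET_REQUEST`, as well as the 2000 ms timeout, are assumptions made for the example. The caller is expected to send the "ETH reset request" packet whenever `watchdog_poll` asks for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timeout; the value actually used by mega65_ftp is not stated here. */
#define ACK_TIMEOUT_MS 2000u

typedef enum { TX_ACTION_NONE, TX_ACTION_SEND_RESET_REQUEST } tx_action_t;

typedef struct {
    uint32_t last_reply_ms; /* time at which the last packet from the MEGA65 arrived */
    unsigned resets_sent;   /* number of reset requests issued so far */
} eth_watchdog_t;

/* Start the watchdog at the current time. */
void watchdog_init(eth_watchdog_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    w->last_reply_ms = now_ms;
    w->resets_sent = 0;
}

/* Call whenever any packet from the MEGA65 is received. */
void watchdog_on_reply(eth_watchdog_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    w->last_reply_ms = now_ms;
}

/* Call periodically from the transfer loop on the PC side. */
tx_action_t watchdog_poll(eth_watchdog_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    /* Unsigned subtraction stays correct if the millisecond counter wraps. */
    if ((uint32_t)(now_ms - w->last_reply_ms) < ACK_TIMEOUT_MS) {
        return TX_ACTION_NONE;
    }
    /* Restart the timer so only one reset request is sent per timeout period. */
    w->last_reply_ms = now_ms;
    w->resets_sent++;
    return TX_ACTION_SEND_RESET_REQUEST;
}
```

Note that a scheme like this always stalls the transfer for at least the full timeout before the reset request goes out.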

It seems the issue is in the core, as it is confirmed the data in the eth tx buffer is correct and the tx trigger is also correctly executed. The eth reset workaround is not ideal as it leads to a stall until the timeout is reached.

It is not clear whether there is no activity on the wire at all or just malformed signals that are filtered out by the connected Ethernet equipment (switch/PC). In Wireshark at least, no packet at all is seen from the MEGA65 once we run into this state. RX obviously continues to work, as the MEGA65 keeps reporting duplicate packets (it receives re-transmissions of packets it has already acknowledged).
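To illustrate why a stream of duplicate reports points at a working RX path and a dead TX path, here is a minimal sketch of receiver-side duplicate detection. It assumes each data packet carries a sequence number and that the receiver remembers the last one it acknowledged. The actual mega65_ftp packet format is not described in this issue, so `rx_state_t`, `rx_accept`, and the 16-bit `seq` field are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool     have_acked;     /* false until the first packet has been accepted */
    uint16_t last_acked_seq; /* hypothetical sequence number of the last ACKed packet */
    unsigned duplicates;     /* re-transmissions seen for already ACKed packets */
} rx_state_t;

/* Returns true for a new packet, false for a re-transmission of the
 * packet that was acknowledged last. */
bool rx_accept(rx_state_t *rx, uint16_t seq)
{
    assert(rx != NULL);
    if (rx->have_acked && seq == rx->last_acked_seq) {
        /* The sender never saw our ACK, so it sent the same packet again. */
        rx->duplicates++;
        return false;
    }
    rx->have_acked = true;
    rx->last_acked_seq = seq;
    return true;
}
```

In the hanging state, a counter like `duplicates` would keep growing while nothing from the MEGA65 shows up in Wireshark, which matches the behaviour described above.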

To Reproduce Steps to reproduce the behavior:

  1. Use mega65_ftp with ethernet from latest development branch
  2. Transfer files to the SD card until you run into the hanging state
  3. See the PC trying to reset the eth controller by sending reset requests after some seconds with no reply from the MEGA

Expected behavior All packets should be sent out as expected

ki-bo commented 1 year ago

Potentially fixed by this commit: https://github.com/MEGA65/mega65-core/commit/23b2368dfc34e63c13e66b3c00d58d435bb30616

ki-bo commented 11 months ago

Haven't seen this bug after heavy usage over the last weeks, and I also don't see reports of it happening at the moment. As the bug appeared very randomly and needs external Ethernet data transfers, I won't write a test case for it. The wrong behaviour can't be triggered by any known reproduction steps.

Closing this one.