acooks / tn40xx-driver

Linux driver for tn40xx from Tehuti Networks

Driver or card crashes with VLAN traffic #23

Open jp-bennett opened 4 years ago

jp-bennett commented 4 years ago

I'm working with the Trendnet TEG-10GECTX adapter, which uses this driver and the x3310fw firmware file. I've gotten everything working, but I'm seeing an odd problem. I have two servers running CentOS 8, both using this card/driver. I have IP addresses assigned to the interface, and untagged traffic flows as normal.

I set up a bridge on each of the two servers, pointing at a VLAN interface running over the card. Sending any VLAN traffic over the cards causes something to crash on the card or in the driver. I'm not seeing any error messages in dmesg or the logs, but all traffic across the interface suddenly stops.

Judging from watching pings, the connection dies the moment VLAN-tagged traffic is sent across the wire.
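For anyone trying to reproduce this, the setup described above can be sketched with iproute2 roughly as follows. This is only an illustration, not the reporter's actual configuration: the interface name `enp139s0` is borrowed from a later comment in this thread, and the VLAN ID `100` and bridge name `br100` are made-up placeholders. By default the script only prints the commands; run it as root with `RUN=` (empty) to execute them.

```shell
#!/bin/sh
# Sketch: a bridge on top of a VLAN interface on top of the tn40xx NIC.
# ASSUMPTIONS (not from the original report): NIC name, VLAN ID, bridge name.
NIC="${NIC:-enp139s0}"
VLAN_ID="${VLAN_ID:-100}"
BRIDGE="${BRIDGE:-br${VLAN_ID}}"
# Dry run by default: commands are echoed. Set RUN to empty to execute them.
RUN="${RUN-echo}"

setup_vlan_bridge() {
    # 802.1Q sub-interface carrying VLAN_ID on the physical NIC
    $RUN ip link add link "$NIC" name "$NIC.$VLAN_ID" type vlan id "$VLAN_ID"
    # Bridge that VMs (e.g. libvirt vnet devices) would be attached to
    $RUN ip link add name "$BRIDGE" type bridge
    # Enslave the VLAN interface to the bridge
    $RUN ip link set "$NIC.$VLAN_ID" master "$BRIDGE"
    # Bring everything up
    $RUN ip link set "$NIC" up
    $RUN ip link set "$NIC.$VLAN_ID" up
    $RUN ip link set "$BRIDGE" up
}

setup_vlan_bridge
```

With this in place on both servers, any frame leaving through the bridge goes out tagged on the physical NIC, which is the condition that appears to trigger the stall.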

jp-bennett commented 4 years ago

A bit more information: I'm seeing this when using libvirt virtual machines, connecting VMs to the bridges on the two different servers, and trying to forward traffic across. I managed to get everything working as expected with one of the machines running a driver with debug mode turned on. Recompiling without debug mode to get rid of the dmesg spam reintroduced the bug.

It might be related to the vnet adapters that libvirtd creates when I attach a VM to the bridge, as it seems to be VM traffic that kills it most reliably.

gmazzamuto commented 3 years ago

I am experiencing the same problem. The connection is lost as soon as a tagged packet is sent across the wire. When the card is connected to an untagged port on the switch, everything works. If I configure the switch port to accept another tagged VLAN (in addition to the default untagged traffic), the connection crashes after a few seconds (I imagine as soon as the first tagged packet arrives). If I undo this change, i.e. reconfigure the switch port to accept only untagged traffic, the connection stays broken, suggesting that something has indeed crashed in the driver. The only way to make it work again is by doing:

ip link set enp139s0 down
ip link set enp139s0 up
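Until the underlying bug is found, the down/up cycle above could be wrapped in a small check that bounces the link only when a peer stops answering. This is a rough sketch, not a fix: the peer address `192.0.2.1` is a documentation placeholder to replace with a real host, the function names are made up for illustration, and the `ping` options assume Linux iputils. As written it only prints the `ip` commands; run it as root with `RUN=` (empty) to actually apply them.

```shell
#!/bin/sh
# Sketch: bounce the interface when a known peer stops responding.
# ASSUMPTIONS: PEER is a placeholder address; function names are hypothetical.
IFACE="${IFACE:-enp139s0}"
PEER="${PEER:-192.0.2.1}"
# Dry run by default: commands are echoed. Set RUN to empty to execute them.
RUN="${RUN-echo}"

bounce_link() {
    # Same workaround as in the comment above: take the link down, then up
    $RUN ip link set "$IFACE" down
    $RUN ip link set "$IFACE" up
}

link_alive() {
    # Three pings, 2-second reply timeout, sent out through IFACE
    ping -c 3 -W 2 -I "$IFACE" "$PEER" >/dev/null 2>&1
}

check_once() {
    # Bounce only if the peer is unreachable
    link_alive || bounce_link
}

# Example usage (e.g. from cron or a loop):
#   check_once
```

Note that this only masks the symptom: traffic is still dropped between the stall and the next check.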