acooks / tn40xx-driver

Linux driver for tn40xx from Tehuti Networks

rxd_err = 0x4 #8

Closed adrianq closed 5 years ago

adrianq commented 5 years ago

I have a PEX10000SFP and, although the installation was successful, I keep getting a 'rxd_err = 0x4' in the kernel log. A speed test shows it is properly installed but this error keeps happening. Any ideas?

acooks commented 5 years ago

In the source it looks like 0x4 indicates an error in the Ethernet Frame Check Sequence. In other words, it's a CRC error and meant to indicate data corruption. These packets are dropped and you'll be able to measure the scale of the problem with something like iperf that reports packet loss. Please record and share your results.
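
As a rough complement to iperf, the errors can also be counted straight from the kernel log. The sketch below is only an illustration: it assumes each error is logged with the text `rxd_err = 0x<hex>` (the only part of the format confirmed in this thread) and that lines may start with a dmesg-style `[seconds.micros]` timestamp. The function name `summarize_rxd_errors` is made up for this example.

```python
import re
from collections import Counter

# Assumption: the driver logs "rxd_err = 0x<hex>" somewhere on the line.
ERR = re.compile(r"rxd_err\s*=\s*(0x[0-9a-fA-F]+)")
# Assumption: dmesg-style timestamp such as "[ 1234.567890]" at line start.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def summarize_rxd_errors(lines):
    """Return (Counter of error codes, errors per second or None)."""
    codes = Counter()
    stamps = []
    for line in lines:
        err = ERR.search(line)
        if err is None:
            continue  # not an rxd_err line
        codes[int(err.group(1), 16)] += 1
        stamp = STAMP.match(line)
        if stamp is not None:
            stamps.append(float(stamp.group(1)))
    rate = None
    # A rate needs at least two timestamped errors spanning a non-zero time.
    if len(stamps) >= 2 and stamps[-1] > stamps[0]:
        rate = (len(stamps) - 1) / (stamps[-1] - stamps[0])
    return codes, rate
```

Feeding it the lines of saved `dmesg` output (for example `summarize_rxd_errors(open("dmesg.txt"))`, where `dmesg.txt` is a hypothetical capture) gives a count per error code and an approximate error rate to compare against the iperf loss figures.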

It could be quite difficult to track down the cause. It could be a cabling issue, or a firmware bug, or a dodgy board layout or manufacturing, or marginal memory, or a number of other things, including a driver bug.

Infinality commented 5 years ago

I get this error occasionally in the log too (a handful of times over a few days). Overall it still works, though. I'm using Cat 5 cable from ~2003, so it's possible that is contributing to it.

adrianq commented 5 years ago

I will record and share the results again (I am not physically there, and for now we are using the regular 1Mb network card). When we recorded, the network speed was at or very close to the expected value most of the time. However, for us, the driver logs this error more than 30 times per second. Our first thought was the cabling, so we double-checked that the cable was properly connected. In the end we disconnected the cables, as we were unsure about the error. These are the cables and transceiver that we are currently using. They should work with this driver, right? Anyway, we will investigate a bit more and come back to you. Thanks for your comments!

adrianq commented 5 years ago

@acooks @Infinality We have run an iperf test and this is what we are currently getting: a very asymmetric connection. We are still investigating, but perhaps you can shed some light on this.

root@torvalds:~# iperf -c torvalds
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 150.244.87.57 port 57778 connected with 150.244.87.42 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-14.1 sec   356 KBytes   207 Kbits/sec
root@torvalds:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 150.244.87.57 port 5001 connected with 150.244.87.42 port 45126
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  7.59 GBytes  6.51 Gbits/sec

Now with UDP and tracking packet loss:

 iperf -su -i1
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 150.244.87.42 port 5001 connected with 150.244.87.57 port 34699
[ ID] Interval       Transfer     Bandwidth        Jitter   Lost/Total Datagrams
[  3]  0.0- 1.0 sec  89.0 KBytes   729 Kbits/sec   0.026 ms   38/  100 (38%)
[  3]  1.0- 2.0 sec  84.7 KBytes   694 Kbits/sec   0.020 ms   25/   84 (30%)
[  3]  2.0- 3.0 sec  87.6 KBytes   717 Kbits/sec   0.024 ms   26/   87 (30%)
[  3]  3.0- 4.0 sec  81.8 KBytes   670 Kbits/sec   0.021 ms   32/   89 (36%)
[  3]  4.0- 5.0 sec  74.6 KBytes   612 Kbits/sec   0.023 ms   37/   89 (42%)
[  3]  5.0- 6.0 sec  91.9 KBytes   753 Kbits/sec   0.022 ms   27/   91 (30%)
[  3]  6.0- 7.0 sec  84.7 KBytes   694 Kbits/sec   0.029 ms   29/   88 (33%)
[  3]  7.0- 8.0 sec  76.1 KBytes   623 Kbits/sec   0.019 ms   37/   90 (41%)
[  3]  8.0- 9.0 sec  90.4 KBytes   741 Kbits/sec   0.019 ms   25/   88 (28%)
iperf -su -i 1
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 150.244.87.57 port 5001 connected with 150.244.87.42 port 55830
[ ID] Interval       Transfer     Bandwidth        Jitter   Lost/Total Datagrams
[  3]  0.0- 1.0 sec   129 KBytes  1.06 Mbits/sec   0.003 ms    0/   90 (0%)
[  3]  1.0- 2.0 sec   128 KBytes  1.05 Mbits/sec   0.005 ms    0/   89 (0%)
[  3]  2.0- 3.0 sec   128 KBytes  1.05 Mbits/sec   0.003 ms    0/   89 (0%)
[  3]  3.0- 4.0 sec   128 KBytes  1.05 Mbits/sec   0.007 ms    0/   89 (0%)
[  3]  4.0- 5.0 sec   128 KBytes  1.05 Mbits/sec   0.004 ms    0/   89 (0%)
[  3]  5.0- 6.0 sec   129 KBytes  1.06 Mbits/sec   0.004 ms    0/   90 (0%)
[  3]  6.0- 7.0 sec   128 KBytes  1.05 Mbits/sec   0.003 ms    0/   89 (0%)
[  3]  7.0- 8.0 sec   128 KBytes  1.05 Mbits/sec   0.003 ms    0/   89 (0%)
[  3]  8.0- 9.0 sec   128 KBytes  1.05 Mbits/sec   0.006 ms    0/   89 (0%)
[  3]  9.0-10.0 sec   128 KBytes  1.05 Mbits/sec   0.005 ms    0/   89 (0%)
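
To compare the two directions with a single number each, the per-interval `Lost/Total Datagrams` columns above can be summed. This is a minimal sketch that assumes the report lines look exactly like the ones pasted here; `udp_loss` is a name invented for this example, not part of iperf.

```python
import re

# Matches the "Lost/Total Datagrams" column, e.g. "38/  100 (38%)".
LOSS = re.compile(r"(\d+)\s*/\s*(\d+)\s*\(\s*[\d.]+%\)")

def udp_loss(report_lines):
    """Sum lost and total datagrams over all interval lines."""
    lost = total = 0
    for line in report_lines:
        m = LOSS.search(line)
        if m is None:
            continue  # header, separator, or command line
        lost += int(m.group(1))
        total += int(m.group(2))
    return lost, total
```

Applied to the two reports above, the server at 150.244.87.42 lost 276 of 806 datagrams (about 34%), while the server at 150.244.87.57 lost 0 of 892.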

adrianq commented 5 years ago

I am closing this issue because the problem was caused by us not reading the specifications properly :/ We were trying to use the card with a 4.x kernel, which is not actually supported. It works up to 3.x. I will write to Tehuti Networks and ask whether they have any plans to support 4.x kernels (I guess the answer is no, but...). If anybody manages to make it work and can drop a comment here, that would be great...
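
For anyone who wants to catch this mismatch before installing, a quick check of the running kernel's major version might look like the sketch below. The 3.x limit is taken only from the comment above, not from vendor documentation, and the function names are hypothetical.

```python
import platform
import re

def kernel_major(release):
    """Extract the major version from a release string like '4.15.0-20-generic'."""
    m = re.match(r"(\d+)\.", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(m.group(1))

def vendor_driver_supported(release, max_major=3):
    # Assumption from this thread: kernels up to 3.x are supported.
    return kernel_major(release) <= max_major

def running_kernel_supported():
    """Check the kernel this script is running on."""
    return vendor_driver_supported(platform.release())
```

On a 4.x kernel `running_kernel_supported()` returns `False`, which would have flagged the setup described in this issue.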