snabbco / snabb

Snabb: Simple and fast packet networking

Intel 82571GB Gigabit Controller Support #34

Closed pkazmier closed 11 years ago

pkazmier commented 11 years ago

Support the Intel 82571GB Gigabit Controller.

My new NIC arrives today, I believe. It may take me a little time to figure this out as I'm really starting behind the eight-ball, but I'm excited to hack around!

Here is my development environment (KVM box and home networking setup):

https://www.evernote.com/shard/s167/sh/bf598a18-dd86-46cb-9469-24b893763749/234c1f88e7527253c47f164e00cae259

I guess I'll have to figure out how to expose the new NIC directly to the guest as well.

pkazmier commented 11 years ago

I added more detail to my development environment and home network setup in case anyone cares :-)

lukego commented 11 years ago

Cool :-)

The first step is probably to recognize your card by its PCI vendor/device id numbers (pci.lua). Then if you are really lucky it might just work. Otherwise you will need to find out what significant differences there are between your controller and the supported one. The Intel 'igb' driver in Linux is one place to look for this information and the data sheets should be good for reference. You can also use 'ethtool -d' to get a register dump from the card when the OS driver is controlling it to compare values. And of course feedback from github issues is always at the ready :-)

Driver hacking is interesting. You could be lucky and the card works really quickly. Or you could spend a really long time banging your head against what turns out to be a maddeningly obvious problem. I spent about a month banging my head against making DMA work on the ethernet chip before finally realizing it was turned off at the PCI level :).

lukego commented 11 years ago

btw one thing that strikes me about the output you get from snabbswitch is that there's so much "-" in the table. I had expected it would recognise that you have a NIC called e.g. "eth0" even if this isn't usable by the switch. Could be that the code in pci.lua for scanning /sys/bus/pci/devices/* isn't working as expected on your machine. (Have I overestimated how widely supported that sysfs directory is, for example?)
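
As an illustration of the scan described above, here is a minimal Python sketch (not the actual pci.lua code). It assumes the usual sysfs layout, where each device directory holds `vendor` and `device` files and, when a kernel network driver has registered an interface, a `net` subdirectory. The names `scan_pci`, `is_supported` and `SUPPORTED` are hypothetical; the only ID pair listed is the one mentioned in this thread.

```python
import os

# Assumption: the single (vendor, device) pair named in this thread.
# 0x8086 is Intel's PCI vendor ID; 0x105e is what the 82571 ports report here.
SUPPORTED = {(0x8086, 0x105E)}

def read_hex(path):
    """Read a sysfs attribute such as 'vendor', which holds text like '0x8086'."""
    with open(path) as f:
        return int(f.read().strip(), 16)

def scan_pci(root="/sys/bus/pci/devices"):
    """Return one dict per PCI device directory found under `root`."""
    devices = []
    for pciaddr in sorted(os.listdir(root)):
        dev = os.path.join(root, pciaddr)
        net = os.path.join(dev, "net")
        # No 'net' directory means no interface name, so iface stays None
        # and a table printer would show "-" for that row.
        names = sorted(os.listdir(net)) if os.path.isdir(net) else []
        devices.append({
            "pciaddr": pciaddr,
            "vendor": read_hex(os.path.join(dev, "vendor")),
            "device": read_hex(os.path.join(dev, "device")),
            "iface": names[0] if names else None,
        })
    return devices

def is_supported(info):
    """True if the device's ID pair is in the (hypothetical) supported set."""
    return (info["vendor"], info["device"]) in SUPPORTED
```

Passing a different `root` makes it easy to try the function against a copied or fake directory tree instead of the live system.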

pkazmier commented 11 years ago

I thought it was odd as well. It seems the virtio network interface in my guest does not include a net directory in /sys/bus/pci/devices/.../, which means the interface is nil in the table returned by device_info(). Here is some more output from my system:

kaz@monad:/sys/bus/pci/devices/0000:00:03.0$ lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 SCSI storage controller: Red Hat, Inc Virtio block device
00:05.0 RAM memory: Red Hat, Inc Virtio memory balloon

kaz@monad:/sys/bus/pci/devices/0000:00:03.0$ lspci -v -s 00:03.0
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
        Subsystem: Red Hat, Inc Device 0001
        Physical Slot: 3
        Flags: bus master, fast devsel, latency 0, IRQ 10
        I/O ports at c060 [size=32]
        Memory at febf1000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at febe0000 [disabled] [size=64K]
        Capabilities: <access denied>
        Kernel driver in use: virtio-pci

kaz@monad:/sys/bus/pci/devices/0000:00:03.0$ ls /sys/bus/pci/devices/0000\:00\:03.0
broken_parity_status      device         firmware_node  modalias   remove     resource1         subsystem_vendor
class                     dma_mask_bits  irq            msi_bus    rescan     rom               uevent
config                    driver         local_cpulist  numa_node  resource   subsystem         vendor
consistent_dma_mask_bits  enable         local_cpus     power      resource0  subsystem_device  virtio0
kaz@monad:/sys/bus/pci/devices/0000:00:03.0$

pkazmier commented 11 years ago

Moving onwards ... my new NIC arrived and has been installed!

Unfortunately, I'm unable to pass the NIC directly through to my guests because I have a Gigabyte motherboard that does not support VT-d (at least according to what I've gathered from the KVM and Xen pages). Instead, I'll simply run snabbswitch on my KVM host directly. On a side note, I was under the impression from your firmware blog post that the snabbswitch binary was meant to be self-contained, but it would not run by itself because it was missing dependencies, so I just ended up rsync'ing my whole snabbswitch directory to the host (not a big deal).

The Intel ports are eth2 and eth3 on my system. Before running the test, I prepared and validated my capture environment for troubleshooting. With eth2 bound, configured, and up on my KVM host, I connected the port to my Apple Airport (wifi hub), fired up Wireshark on my iMac's wireless NIC, and configured Wireshark with a capture filter for frames from eth2. I confirmed that I was able to see traffic from the NIC.

With the test environment ready, I modified the device ID in pci.lua to 0x105e and let her rip. Here is what I saw (spoiler, not much, but that just means I'll have something to hack around with so that's actually good news!)

root@world:/home/kaz/src/snabbswitch# src/snabbswitch 
selftest: memory
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 0
  Allocating a 2MB HugeTLB: Got 2MB at 0x02000000
  Allocating a 2MB HugeTLB: Got 2MB at 0x02200000
  Allocating a 2MB HugeTLB: Got 2MB at 0x01c00000
  Allocating a 2MB HugeTLB: Got 2MB at 0x36800000
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 4
HugeTLB page allocation OK.
selftest: pci
Scanning PCI devices:
pciaddr         vendor  device  iface   status
0000:00:00.0    0x8086  0x2e30  -       -
0000:00:01.0    0x8086  0x2e31  -       -
0000:00:02.0    0x8086  0x2e32  -       -
0000:00:1b.0    0x8086  0x27d8  -       -
0000:00:1c.0    0x8086  0x27d0  -       -
0000:00:1c.1    0x8086  0x27d2  -       -
0000:00:1d.0    0x8086  0x27c8  -       -
0000:00:1d.1    0x8086  0x27c9  -       -
0000:00:1d.2    0x8086  0x27ca  -       -
0000:00:1d.3    0x8086  0x27cb  -       -
0000:00:1d.7    0x8086  0x27cc  -       -
0000:00:1e.0    0x8086  0x244e  -       -
0000:00:1f.0    0x8086  0x27b8  -       -
0000:00:1f.2    0x8086  0x27c0  -       -
0000:00:1f.3    0x8086  0x27da  -       -
0000:01:00.0    0x8086  0x105e  -       -
0000:01:00.1    0x8086  0x105e  eth3    down
0000:02:00.0    0x10ec  0x8168  eth0    up
0000:03:00.0    0x10ec  0x8168  eth1    up
Suitable devices: 
  0000:01:00.0
  0000:01:00.1
selftest: intel device 0000:01:00.0
NIC transmit test
intel selftest: pciaddr=0000:01:00.0 secs=1
Waiting for linkup.............. ok
Generating traffic for 1 second(s)...
^C
root@world:/home/kaz/src/snabbswitch# 

Nothing was emitted from the card according to my Wireshark captures, which means my next steps are to review intel.lua and the Intel documentation for the 82571 card. Here is a link if anyone is looking for it: http://developer.intel.com/content/dam/www/public/us/en/documents/manuals/pcie-gbe-controllers-open-source-manual.pdf. That documentation is all foreign to me, so this should be interesting!

The wife is calling ... time to go for now ...

pkazmier commented 11 years ago

For future reference, here is the output of ethtool for the NIC in question:

# ethtool -d eth2
MAC Registers
-------------
0x00000: CTRL (Device control register)  0x000C0241
      Endian mode (buffers):             little
      Link reset:                        normal
      Set link up:                       1
      Invert Loss-Of-Signal:             no
      Receive flow control:              disabled
      Transmit flow control:             disabled
      VLAN mode:                         disabled
      Auto speed detect:                 disabled
      Speed select:                      1000Mb/s
      Force speed:                       no
      Force duplex:                      no
0x00008: STATUS (Device status register) 0x00080380
      Duplex:                            half
      Link up:                           no link config
      TBI mode:                          disabled
      Link speed:                        1000Mb/s
      Bus type:                          PCI Express
      Port number:                       0
0x00100: RCTL (Receive control register) 0x00000000
      Receiver:                          disabled
      Store bad packets:                 disabled
      Unicast promiscuous:               disabled
      Multicast promiscuous:             disabled
      Long packet:                       disabled
      Descriptor minimum threshold size: 1/2
      Broadcast accept mode:             ignore
      VLAN filter:                       disabled
      Canonical form indicator:          disabled
      Discard pause frames:              filtered
      Pass MAC control frames:           don't pass
      Receive buffer size:               2048
0x02808: RDLEN (Receive desc length)     0x00000000
0x02810: RDH   (Receive desc head)       0x00000000
0x02818: RDT   (Receive desc tail)       0x00000000
0x02820: RDTR  (Receive delay timer)     0x00000000
0x00400: TCTL (Transmit ctrl register)   0x30000008
      Transmitter:                       disabled
      Pad short packets:                 enabled
      Software XOFF Transmission:        disabled
      Re-transmit on late collision:     disabled
0x03808: TDLEN (Transmit desc length)    0x00000000
0x03810: TDH   (Transmit desc head)      0x00000000
0x03818: TDT   (Transmit desc tail)      0x00000000
0x03820: TIDV  (Transmit delay timer)    0x00000000
PHY type:                                unknown
root@world:/home/kaz/src/snabbswitch#

pkazmier commented 11 years ago

Quick question: I noticed phy_lock() and phy_unlock() functions, but they are never used. Are they supposed to be?

lukego commented 11 years ago

Yes, phy_read() and phy_write() should really be using them.
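
A small Python sketch of what "using them" could look like, assuming the lock only needs to be held around each individual PHY access. The `Nic` class is a hypothetical stand-in that records calls instead of touching hardware, and `phy_locked` is an invented helper name; the real driver is Lua and may be structured differently.

```python
from contextlib import contextmanager

class Nic:
    """Hypothetical stand-in for the driver object; it only records calls."""

    def __init__(self):
        self.locked = False
        self.log = []

    def phy_lock(self):
        assert not self.locked, "PHY lock is not re-entrant in this sketch"
        self.locked = True
        self.log.append("lock")

    def phy_unlock(self):
        self.locked = False
        self.log.append("unlock")

    @contextmanager
    def phy_locked(self):
        """Hold the PHY lock for the duration of a with-block."""
        self.phy_lock()
        try:
            yield
        finally:
            # Runs even if the register access raises, so the lock is not leaked.
            self.phy_unlock()

    def phy_read(self, reg):
        with self.phy_locked():
            self.log.append(("read", reg))
            return 0  # placeholder; real hardware would return the register value

    def phy_write(self, reg, value):
        with self.phy_locked():
            self.log.append(("write", reg, value))
```

The try/finally is the point of the sketch: whatever happens during the access, the unlock still runs.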

lukego commented 11 years ago

Pete you have these user accounts now to access the existing test lab for reference:

ssh -p 54322 pkazmier@arbon.snabb.co   # development KVM instance
ssh pkazmier@bern.snabb.co             # has eth1 cabled to the development instance, can be useful for tcpdump etc

password "snabbswitch" but please change that (not that these machines are sensitive in any way). also when you run snabbswitch please do it like this:

flock -x /tmp/snabb.lock ./snabbswitch ...

to avoid colliding with me and Rahul on the same machine :)
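
For anyone scripting around this, here is a rough Python sketch of the same idea using the `flock` system call via the standard `fcntl` module. It is an assumption-laden illustration, not part of snabbswitch: the function name is made up, and unlike plain `flock -x`, which waits for the lock, this variant gives up immediately if someone else holds it.

```python
import fcntl
import os

def try_exclusive_lock(path="/tmp/snabb.lock"):
    """Return an open fd holding an exclusive flock on `path`, or None if busy."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # LOCK_NB makes the call fail at once instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # the lock is held until this fd is closed
```

A caller would check for `None`, run the test while holding the fd, and close the fd afterwards to release the lock.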

pkazmier commented 11 years ago

Thanks and will do. Password changed. Time for me to go to bed (2:30am here in Dallas, TX).

lukego commented 11 years ago

Good stuff!

I created issue #37 for the static linking problem, good catch.

Time to go snowboarding here.. 9:30am in Grindelwald, Switzerland.

lukego commented 11 years ago

Pete, regarding PCIe setup for DMA, the key thing to check is that "lspci -v" shows 'bus master' in 'Flags'. That's what the PCIe config setup code is there to ensure.
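
The same check can be made without lspci by reading the device's config space from sysfs. The sketch below is illustrative Python, not snabbswitch code. It relies on the standard PCI layout, in which the 16-bit Command register sits at offset 0x04 and bit 2 is Bus Master Enable; the function names are hypothetical.

```python
import struct

PCI_COMMAND_OFFSET = 0x04    # 16-bit Command register in PCI config space
PCI_COMMAND_MASTER = 0x0004  # bit 2: Bus Master Enable

def bus_master_enabled(config_bytes):
    """Report whether the Bus Master bit is set in raw config-space bytes."""
    (command,) = struct.unpack_from("<H", config_bytes, PCI_COMMAND_OFFSET)
    return bool(command & PCI_COMMAND_MASTER)

def read_config(pciaddr, root="/sys/bus/pci/devices"):
    """Read the first 64 bytes of config space from the sysfs 'config' file."""
    with open(f"{root}/{pciaddr}/config", "rb") as f:
        return f.read(64)
```

For example, `bus_master_enabled(read_config("0000:01:00.0"))` should agree with whether lspci lists 'bus master' for that device.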

It would sure be handy if ^C would generate a Lua backtrace eh!

pkazmier commented 11 years ago

Sorry ... pressed the wrong button! I'm with Rahul ... where was my "are you sure you want to close this issue?" prompt?

pkazmier commented 11 years ago

Re: PCIe setup for DMA, I'm all set now. I was confused before. I do see the flag being set correctly for mastering and it is confirmed with lspci -v.

I'm slowly making my way through the code and documentation. I discovered that I'll need to use SWSM.SWESMBI to lock the PHY instead of using EXTCNF_CTRL.MDIO like you do for the 82574. I also believe the PHY RESET process is different for my board than for yours according to the documentation I have, so I was going to fix that next.
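
To make the locking idea concrete, here is a simulated sketch in Python. Everything hardware-specific in it is an assumption to be checked against the datasheet: the bit position used for SWESMBI, and the protocol itself (set the bit, then read it back, and treat the lock as held only if the bit stuck). `FakeSwsm` is an invented in-memory model of the register, and the function names are hypothetical.

```python
SWSM_SWESMBI = 0x2  # assumed bit position; verify against the 82571 documentation

class FakeSwsm:
    """Simulated SWSM register. While 'firmware' owns the semaphore, a write
    that tries to set SWESMBI is ignored, so the bit reads back as 0."""

    def __init__(self, firmware_owns=False):
        self.firmware_owns = firmware_owns
        self.value = 0

    def read(self):
        return self.value

    def write(self, value):
        if self.firmware_owns:
            value &= ~SWSM_SWESMBI
        self.value = value

def acquire_swesmbi(swsm, attempts=10):
    """Try to take the semaphore; return True once the bit reads back as set."""
    for _ in range(attempts):
        swsm.write(swsm.read() | SWSM_SWESMBI)
        if swsm.read() & SWSM_SWESMBI:
            return True
    return False

def release_swesmbi(swsm):
    """Give the semaphore back by clearing the bit."""
    swsm.write(swsm.read() & ~SWSM_SWESMBI)
```

A real driver would presumably pause between attempts and report a failure to acquire, rather than retrying in a tight loop.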

Fun stuff when you're driving blind!

lukego commented 11 years ago

Howdy Pete!

I need a distraction from fighting the kernel on memory access and HugeTLBs. Is there something useful I can do to assist on the 82571GB? (do you have a dump_status() that I could stare at with you for example?)

pkazmier commented 11 years ago

Hi Luke,

Sorry, I haven't had much time to do any coding (full-time job, wife, exercise), but tomorrow morning before work I hope to see if I can force the link down on my card to validate that snabb is talking to the hardware (plus I can visually check the port). As you can see below, I added a print_status after the "Waiting for linkup....." message: the wait reports "ok", yet print_status shows the link is down. Clearly, one of them is incorrect, so that's where I was going to start. I'll also hack around during lunch tomorrow (I'm working from home, so I'll have Internet access; at work it's blocked). Here is the dump you wanted to see. I added the print_status before the test loop and after it. Plus, I still get that weird stack trace at the end of every run. If you want an account on the box, I can set one up for you, but we have to keep in mind that it's unfortunately the KVM host that feeds my apartment's network connectivity.
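
The MAC status lines in these dumps can be cross-checked by decoding the raw STATUS value by hand. The Python sketch below is illustrative and is not the driver's print_status. The bit positions are assumptions, chosen because they match every STATUS value and decoded field shown in this thread: full duplex in bit 0, link up in bit 1, a two-bit speed field in bits 7:6, and PHYRA in bit 10.

```python
STATUS_FD = 1 << 0      # full duplex
STATUS_LU = 1 << 1      # link up
STATUS_SPEED_SHIFT = 6  # two-bit speed field in bits 7:6
STATUS_PHYRA = 1 << 10  # PHY reset asserted

SPEEDS = {0b00: "10 Mb/s", 0b01: "100 Mb/s", 0b10: "1000 Mb/s", 0b11: "1000 Mb/s"}

def decode_status(status):
    """Split a raw STATUS register value into the fields shown in the dumps."""
    return {
        "full_duplex": bool(status & STATUS_FD),
        "link_up": bool(status & STATUS_LU),
        "phyra": bool(status & STATUS_PHYRA),
        "speed": SPEEDS[(status >> STATUS_SPEED_SHIFT) & 0b11],
    }
```

With these assumptions, `decode_status(0x00080381)` gives full duplex, link down, 1000 Mb/s, which is what the dump reports; only bit 1 would need to change for the link to read as up.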

root@world:/home/kaz# ./snabbswitch 
selftest: memory
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 8
  Allocating a 2MB HugeTLB: Got 2MB at 0x10000000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10200000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10400000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10600000
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 8
HugeTLB page allocation OK.
selftest: pci
Scanning PCI devices:
pciaddr         vendor  device  iface   status
0000:00:00.0    0x8086  0x2e30  -       -
0000:00:01.0    0x8086  0x2e31  -       -
0000:00:02.0    0x8086  0x2e32  -       -
0000:00:1b.0    0x8086  0x27d8  -       -
0000:00:1c.0    0x8086  0x27d0  -       -
0000:00:1c.1    0x8086  0x27d2  -       -
0000:00:1d.0    0x8086  0x27c8  -       -
0000:00:1d.1    0x8086  0x27c9  -       -
0000:00:1d.2    0x8086  0x27ca  -       -
0000:00:1d.3    0x8086  0x27cb  -       -
0000:00:1d.7    0x8086  0x27cc  -       -
0000:00:1e.0    0x8086  0x244e  -       -
0000:00:1f.0    0x8086  0x27b8  -       -
0000:00:1f.2    0x8086  0x27c0  -       -
0000:00:1f.3    0x8086  0x27da  -       -
0000:01:00.0    0x8086  0x105e  -       -
0000:01:00.1    0x8086  0x105e  -       -
0000:02:00.0    0x10ec  0x8168  eth0    up
0000:03:00.0    0x10ec  0x8168  eth1    up
Suitable devices: 
  0000:01:00.0
  0000:01:00.1
selftest: intel device 0000:01:00.0
NIC transmit test
intel selftest: pciaddr=0000:01:00.0 secs=1
Waiting for linkup.............. ok
>>> DEBUG: After linkup wait, before execution of traffic
MAC status
  STATUS      = 00080381
  Full Duplex = yes
  Link Up     = no
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 0
  TDT         = 0
  TDBAH       = 00000000
  TDBAL       = 10880000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603803a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = no
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10800000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = no
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
>>> DEBUG: After test, leaving selftest
MAC status
  STATUS      = 00080381
  Full Duplex = yes
  Link Up     = no
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 214
  TDT         = 213
  TDBAH       = 00000000
  TDBAL       = 10880000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603803a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = no
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10800000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = no
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
NIC transmit+receive loopback test
intel selftest: pciaddr=0000:01:00.0 secs=1 receive=true loopback=true
Waiting for linkup............. ok
>>> DEBUG: After linkup wait, before execution of traffic
MAC status
  STATUS      = 00080381
  Full Duplex = yes
  Link Up     = no
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 0
  TDT         = 0
  TDBAH       = 00000000
  TDBAL       = 10b80000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603807a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = yes
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10b00000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = yes
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
>>> DEBUG: After test, leaving selftest
MAC status
  STATUS      = 00080381
  Full Duplex = yes
  Link Up     = no
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 214
  TDT         = 213
  TDBAH       = 00000000
  TDBAL       = 10b80000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603807a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = yes
  RDH         = 0
  RDT         = 32761
  RDBAH       = 00000000
  RDBAL       = 10b00000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = yes
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
selftest: intel device 0000:01:00.1
intel.lua:261: attempt to redefine 'rx_desc' at line 2
stack traceback:
        main.lua:17: in function <main.lua:15>
        [C]: in function 'cdef'
        intel.lua:261: in function 'new'
        selftest.lua:20: in main chunk
        [C]: in function 'require'
        main.lua:11: in function <main.lua:3>
        [C]: in function 'xpcall'
        main.lua:22: in main chunk
        [C]: in function 'require'
        [string "require "main""]:1: in main chunk
root@world:/home/kaz#

pkazmier commented 11 years ago

Progress!! Now link up is consistent and I'm now seeing stats for the first time! Running late for work so didn't get a chance to actually look at any of this output, but thought I'd share progress.

root@world:/home/kaz# ./snabbswitch 
selftest: memory
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 8
  Allocating a 2MB HugeTLB: Got 2MB at 0x10000000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10200000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10400000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10600000
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 8
HugeTLB page allocation OK.
selftest: pci
Scanning PCI devices:
pciaddr     vendor  device  iface   status
0000:00:00.0    0x8086  0x2e30  -   -
0000:00:01.0    0x8086  0x2e31  -   -
0000:00:02.0    0x8086  0x2e32  -   -
0000:00:1b.0    0x8086  0x27d8  -   -
0000:00:1c.0    0x8086  0x27d0  -   -
0000:00:1c.1    0x8086  0x27d2  -   -
0000:00:1d.0    0x8086  0x27c8  -   -
0000:00:1d.1    0x8086  0x27c9  -   -
0000:00:1d.2    0x8086  0x27ca  -   -
0000:00:1d.3    0x8086  0x27cb  -   -
0000:00:1d.7    0x8086  0x27cc  -   -
0000:00:1e.0    0x8086  0x244e  -   -
0000:00:1f.0    0x8086  0x27b8  -   -
0000:00:1f.2    0x8086  0x27c0  -   -
0000:00:1f.3    0x8086  0x27da  -   -
0000:01:00.0    0x8086  0x105e  -   -
0000:01:00.1    0x8086  0x105e  -   -
0000:02:00.0    0x10ec  0x8168  eth0    up
0000:03:00.0    0x10ec  0x8168  eth1    up
Suitable devices: 
  0000:01:00.0
  0000:01:00.1
selftest: intel device 0000:01:00.0
NIC transmit test
intel selftest: pciaddr=0000:01:00.0 secs=1
Waiting for linkup.............. ok
>>> DEBUG: After linkup wait, before execution of traffic
MAC status
  STATUS      = 00080383
  Full Duplex = yes
  Link Up     = yes
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 0
  TDT         = 0
  TDBAH       = 00000000
  TDBAL       = 10880000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603803a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = no
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10800000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = no
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
                  61 PRC64      Packets Received [64 Bytes] Count
                   1 PRC511     Packets Received [256-511 Bytes] Count
                   1 PRC1023    Packets Received [512-1023 Bytes] Count
                  63 GPRC       Good Packets Received Count
                   2 BPRC       Broadcast Packets Received Count
                  61 MPRC       Multicast Packets Received Count
           1,536,393 GPTC       Good Packets Transmitted Count
               4,881 GORCL      Good Octets Received Count
          98,329,536 GOTCL      Good Octets Transmitted Count
                  63 RNBC       Receive No Buffers Count
               4,881 TORL       Total Octets Received (Low)
          98,330,816 TOTL       Total Octets Transmitted (Low)
                  63 TPR        Total Packets Received
           1,536,423 TPT        Total Packets Transmitted
           1,536,425 PTC64      Packets Transmitted [64 Bytes] Count
>>> DEBUG: After test, leaving selftest
MAC status
  STATUS      = 00080383
  Full Duplex = yes
  Link Up     = yes
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 29760
  TDT         = 28797
  TDBAH       = 00000000
  TDBAL       = 10880000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603803a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = no
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10800000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = no
  Speed                  = 1000Mb/s
  Duplex                 = full-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
NIC transmit+receive loopback test
intel selftest: pciaddr=0000:01:00.0 secs=1 receive=true loopback=true
Waiting for linkup.............. ok
>>> DEBUG: After linkup wait, before execution of traffic
MAC status
  STATUS      = 00080383
  Full Duplex = yes
  Link Up     = yes
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 0
  TDT         = 0
  TDBAH       = 00000000
  TDBAL       = 10b80000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603807a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = yes
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10b00000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = no
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
             727,122 MPC        Missed Packets Count
             799,605 PRC64      Packets Received [64 Bytes] Count
             799,613 GPRC       Good Packets Received Count
           1,526,740 GPTC       Good Packets Transmitted Count
          51,175,616 GORCL      Good Octets Received Count
          97,711,616 GOTCL      Good Octets Transmitted Count
              22,967 RNBC       Receive No Buffers Count
          97,712,640 TORL       Total Octets Received (Low)
          97,712,832 TOTL       Total Octets Transmitted (Low)
           1,526,765 TPR        Total Packets Received
           1,526,766 TPT        Total Packets Transmitted
           1,526,767 PTC64      Packets Transmitted [64 Bytes] Count
>>> DEBUG: After test, leaving selftest
MAC status
  STATUS      = 00080383
  Full Duplex = yes
  Link Up     = yes
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 20430
  TDT         = 19283
  TDBAH       = 00000000
  TDBAL       = 10b80000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603807a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = yes
  RDH         = 13060
  RDT         = 13060
  RDBAH       = 00000000
  RDBAL       = 10b00000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is down
  Speed and duplex resolved = no
  Speed                  = 1000Mb/s
  Duplex                 = half-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
selftest: intel device 0000:01:00.1
intel.lua:265: attempt to redefine 'rx_desc' at line 2
stack traceback:
    main.lua:17: in function <main.lua:15>
    [C]: in function 'cdef'
    intel.lua:265: in function 'new'
    selftest.lua:20: in main chunk
    [C]: in function 'require'
    main.lua:11: in function <main.lua:3>
    [C]: in function 'xpcall'
    main.lua:22: in main chunk
    [C]: in function 'require'
    [string "require "main""]:1: in main chunk

pkazmier commented 11 years ago

Here are the changes I made thus far in case you are interested: https://github.com/pkazmier/snabbswitch/commit/19091e679a34799f1184e7b5023d0f8a7c721239

pkazmier commented 11 years ago

Fixed the printing of duplex settings for the 82571 card. The PHY port status (17) is laid out differently on the 82571, which made the duplex setting in the PHY section of print_status look random: https://github.com/pkazmier/snabbswitch/commit/c1863ffbef4d28f8184857a153ff3f92744d48e8

Here is the output now (and I confirmed that I see the frames being sent on the wire via tshark)

kaz@world:~$ sudo ./snabbswitch 
selftest: memory
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 8
  Allocating a 2MB HugeTLB: Got 2MB at 0x10000000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10200000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10400000
  Allocating a 2MB HugeTLB: Got 2MB at 0x10600000
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 8
HugeTLB page allocation OK.
selftest: pci
Scanning PCI devices:
pciaddr         vendor  device  iface   status
0000:00:00.0    0x8086  0x2e30  -       -
0000:00:01.0    0x8086  0x2e31  -       -
0000:00:02.0    0x8086  0x2e32  -       -
0000:00:1b.0    0x8086  0x27d8  -       -
0000:00:1c.0    0x8086  0x27d0  -       -
0000:00:1c.1    0x8086  0x27d2  -       -
0000:00:1d.0    0x8086  0x27c8  -       -
0000:00:1d.1    0x8086  0x27c9  -       -
0000:00:1d.2    0x8086  0x27ca  -       -
0000:00:1d.3    0x8086  0x27cb  -       -
0000:00:1d.7    0x8086  0x27cc  -       -
0000:00:1e.0    0x8086  0x244e  -       -
0000:00:1f.0    0x8086  0x27b8  -       -
0000:00:1f.2    0x8086  0x27c0  -       -
0000:00:1f.3    0x8086  0x27da  -       -
0000:01:00.0    0x8086  0x105e  -       -
0000:01:00.1    0x8086  0x105e  eth3    up
0000:02:00.0    0x10ec  0x8168  eth0    up
0000:03:00.0    0x10ec  0x8168  eth1    up
Suitable devices: 
  0000:01:00.0
selftest: intel device 0000:01:00.0
NIC transmit test
intel selftest: pciaddr=0000:01:00.0 secs=1
Waiting for linkup............ ok
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
           1,531,838 GPTC       Good Packets Transmitted Count
          98,037,952 GOTCL      Good Octets Transmitted Count
          98,039,872 TOTL       Total Octets Transmitted (Low)
           1,531,877 TPT        Total Packets Transmitted
           1,531,879 PTC64      Packets Transmitted [64 Bytes] Count
>>> DEBUG: After test, leaving selftest
MAC status
  STATUS      = 00080383
  Full Duplex = yes
  Link Up     = yes
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 25577
  TDT         = 24374
  TDBAH       = 00000000
  TDBAL       = 10880000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603803a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = no
  RDH         = 0
  RDT         = 0
  RDBAH       = 00000000
  RDBAL       = 10800000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is up
  Speed                  = 1000Mb/s
  Duplex                 = full-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
NIC transmit+receive loopback test
intel selftest: pciaddr=0000:01:00.0 secs=1 receive=true loopback=true
Waiting for linkup............ ok
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
             727,161 MPC        Missed Packets Count
             799,960 PRC64      Packets Received [64 Bytes] Count
             799,969 GPRC       Good Packets Received Count
           1,527,136 GPTC       Good Packets Transmitted Count
          51,198,400 GORCL      Good Octets Received Count
          97,736,896 GOTCL      Good Octets Transmitted Count
              22,838 RNBC       Receive No Buffers Count
          97,737,920 TORL       Total Octets Received (Low)
          97,738,176 TOTL       Total Octets Transmitted (Low)
           1,527,160 TPR        Total Packets Received
           1,527,162 TPT        Total Packets Transmitted
           1,527,164 PTC64      Packets Transmitted [64 Bytes] Count
>>> DEBUG: After test, leaving selftest
MAC status
  STATUS      = 00080383
  Full Duplex = yes
  Link Up     = yes
  PHYRA       = no
  Speed       = 1000 Mb/s
Transmit status
  TCTL        = 3103f0fa
  TXDCTL      = 01410000
  TX Enable   = yes
  TDH         = 20741
  TDT         = 19671
  TDBAH       = 00000000
  TDBAL       = 10b80000
  TDLEN       = 524288
  TARC        = 00000403
  TIPG        = 00602006
Receive status
  RCTL        = 0603807a
  RXDCTL      = 01010000
  RX Enable   = yes
  RX Loopback = yes
  RDH         = 14471
  RDT         = 12709
  RDBAH       = 00000000
  RDBAL       = 10b00000
  RDLEN       = 524288
  RADV        = 10
PHY status
  Autonegotiate state    = complete
  Remote fault detection = no remote fault detected
  Copper Link Status     = copper link is up
  Speed                  = 1000Mb/s
  Duplex                 = full-duplex
  Advertise 1000 Mb/s FD = yes
  Advertise 1000 Mb/s HD = no
  Advertise  100 Mb/s FD = yes
  Advertise  100 Mb/s HD = yes
  Advertise   10 Mb/s FD = yes
  Advertise   10 Mb/s HD = yes
  Partner   1000 Mb/s FD = yes
  Partner   1000 Mb/s HD = no
  Partner    100 Mb/s FD = yes
  Partner    100 Mb/s HD = yes
  Partner     10 Mb/s FD = yes
  Partner     10 Mb/s HD = yes
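
A quick sanity check on the loopback counters above, as a Python sketch. It assumes that TPR counts every frame that reached the port, including the ones later counted in MPC; that reading is inferred from the counter names, not checked against the data sheet, so treat it as an assumption. The helper name is made up.

```python
# Counter values copied from the transmit+receive loopback run above.
stats = {
    "MPC": 727_161,     # Missed Packets Count
    "GPRC": 799_969,    # Good Packets Received Count
    "TPR": 1_527_160,   # Total Packets Received
    "TPT": 1_527_162,   # Total Packets Transmitted
}

def receive_summary(s):
    """Compare missed + good packets against the total received counter."""
    accounted = s["MPC"] + s["GPRC"]
    return {
        "accounted": accounted,
        # Small differences are expected if the counters are read one after
        # another while traffic is still flowing (an assumption).
        "unaccounted": s["TPR"] - accounted,
        "delivered_fraction": s["GPRC"] / s["TPT"],
    }

summary = receive_summary(stats)
print(summary)
```

With these numbers, missed plus good packets come within a few dozen frames of TPR, and a little over half of the transmitted frames are delivered on the receive side; the rest appear in MPC.
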
pkazmier commented 11 years ago

Figured out why I was getting that stack trace before ... it occurred when I had the other side of my NIC connected to my wifi hub. Now that I've connected the other side of the NIC to the other port on the 82571, I no longer get the crash.

Update: the stack trace was resolved with https://github.com/SnabbCo/snabbswitch/commit/de334f35d36d17365460b8d8cd9c667f04e7638d

pkazmier commented 11 years ago

Hi Luke,

I'm looking for your guidance in terms of how you'd like me to integrate the 82571 code. If you look at the previous 6 commits on the http://github.com/pkazmier/snabbswitch/commits/iss34-intel-82571 branch, you'll find the changes I made to get the 82571 working. In summary, the changes consisted of the following:

Based on the above, do you have any thoughts on how you'd like me to move forward? Did you want to create a subclass that overrides the specific methods based on PCI device ID? Did you want low tech and a bunch of if statements? Did you want a different module for the specific cards (probably not based on your prior comments)?

Any suggestions would be appreciated (no rush as work will be grueling this week).

Thanks, Pete

ps. Here is the output from the latest run:

root@world:/home/kaz# ./snabbswitch 
selftest: memory
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 12
  Allocating a 2MB HugeTLB: Got 2MB at 0x66800000
  Allocating a 2MB HugeTLB: Got 2MB at 0x63200000
  Allocating a 2MB HugeTLB: Got 2MB at 0x69e00000
  Allocating a 2MB HugeTLB: Got 2MB at 0x6d800000
Kernel HugeTLB pages (/proc/sys/vm/nr_hugepages): 12
HugeTLB page allocation OK.
selftest: pci
Scanning PCI devices:
pciaddr         vendor  device  iface   status
0000:00:00.0    0x8086  0x2e30  -       -
0000:00:01.0    0x8086  0x2e31  -       -
0000:00:02.0    0x8086  0x2e32  -       -
0000:00:1b.0    0x8086  0x27d8  -       -
0000:00:1c.0    0x8086  0x27d0  -       -
0000:00:1c.1    0x8086  0x27d2  -       -
0000:00:1d.0    0x8086  0x27c8  -       -
0000:00:1d.1    0x8086  0x27c9  -       -
0000:00:1d.2    0x8086  0x27ca  -       -
0000:00:1d.3    0x8086  0x27cb  -       -
0000:00:1d.7    0x8086  0x27cc  -       -
0000:00:1e.0    0x8086  0x244e  -       -
0000:00:1f.0    0x8086  0x27b8  -       -
0000:00:1f.2    0x8086  0x27c0  -       -
0000:00:1f.3    0x8086  0x27da  -       -
0000:01:00.0    0x8086  0x105e  -       -
0000:01:00.1    0x8086  0x105e  -       -
0000:02:00.0    0x10ec  0x8168  eth0    up
0000:03:00.0    0x10ec  0x8168  eth1    up
Suitable devices: 
  0000:01:00.0
  0000:01:00.1
selftest: intel device 0000:01:00.0
NIC transmit test
intel selftest: pciaddr=0000:01:00.0 secs=1
Waiting for linkup............ ok
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
           1,530,090 GPTC       Good Packets Transmitted Count
          97,926,080 GOTCL      Good Octets Transmitted Count
          97,927,168 TOTL       Total Octets Transmitted (Low)
           1,530,115 TPT        Total Packets Transmitted
           1,530,117 PTC64      Packets Transmitted [64 Bytes] Count
NIC transmit+receive loopback test
intel selftest: pciaddr=0000:01:00.0 secs=1 receive=true loopback=true
Waiting for linkup............ ok
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.0:
             726,052 MPC        Missed Packets Count
             800,553 PRC64      Packets Received [64 Bytes] Count
             800,562 GPRC       Good Packets Received Count
           1,526,619 GPTC       Good Packets Transmitted Count
          51,236,416 GORCL      Good Octets Received Count
          97,703,936 GOTCL      Good Octets Transmitted Count
              22,840 RNBC       Receive No Buffers Count
          97,704,960 TORL       Total Octets Received (Low)
          97,705,216 TOTL       Total Octets Transmitted (Low)
           1,526,646 TPR        Total Packets Received
           1,526,648 TPT        Total Packets Transmitted
           1,526,649 PTC64      Packets Transmitted [64 Bytes] Count
selftest: intel device 0000:01:00.1
NIC transmit test
intel selftest: pciaddr=0000:01:00.1 secs=1
Waiting for linkup............ ok
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.1:
           1,530,169 GPTC       Good Packets Transmitted Count
          97,931,136 GOTCL      Good Octets Transmitted Count
          97,932,352 TOTL       Total Octets Transmitted (Low)
           1,530,196 TPT        Total Packets Transmitted
           1,530,198 PTC64      Packets Transmitted [64 Bytes] Count
NIC transmit+receive loopback test
intel selftest: pciaddr=0000:01:00.1 secs=1 receive=true loopback=true
Waiting for linkup.......... ok
Generating traffic for 1 second(s)...
Statistics for PCI device 0000:01:00.1:
             725,515 MPC        Missed Packets Count
             800,522 PRC64      Packets Received [64 Bytes] Count
             800,531 GPRC       Good Packets Received Count
           1,526,051 GPTC       Good Packets Transmitted Count
          51,234,368 GORCL      Good Octets Received Count
          97,667,584 GOTCL      Good Octets Transmitted Count
              22,840 RNBC       Receive No Buffers Count
          97,668,608 TORL       Total Octets Received (Low)
          97,668,800 TOTL       Total Octets Transmitted (Low)
           1,526,076 TPR        Total Packets Received
           1,526,078 TPT        Total Packets Transmitted
           1,526,080 PTC64      Packets Transmitted [64 Bytes] Count
lukego commented 11 years ago

Hey awesome work! That code looks great. Cool also that Github makes it so easy to browse branches and their changes!

What would suit me best is, initially, a single commit that adds support using low-tech if statements. Then we can have a separate, ongoing process of trying to find the best way to factor the code. I think there is a lot of room to experiment there, e.g. the PHY register array could be an object, each register could have its own C type laying out its bits as an anonymous struct, etc. I have not been fully satisfied with any of my experiments in this direction yet, but I'm sure we'll find some really nice solutions over time :). (Think what a luxury of high-level tools we have compared with e.g. Thompson and Ritchie!)
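
To make the "each register has its own layout" idea concrete, here is a rough sketch in Python (for illustration only; the real code would be Lua with FFI C types). The `Field` and `Register` names are hypothetical, and the STATUS bit positions are an assumed layout that should be verified against the data sheet. The raw value is the one printed under "MAC status" earlier in this thread.

```python
from collections import namedtuple

# A field is a name plus the bit range it occupies in the register.
Field = namedtuple("Field", "name shift width")

class Register:
    """Describe a register's bit layout once, then decode raw values by name."""
    def __init__(self, name, fields):
        self.name = name
        self.fields = fields

    def decode(self, raw):
        # Shift each field down to bit 0 and mask it to its width.
        return {f.name: (raw >> f.shift) & ((1 << f.width) - 1)
                for f in self.fields}

# Assumed layout of the MAC STATUS register; verify against the data sheet.
STATUS = Register("STATUS", [
    Field("full_duplex", 0, 1),
    Field("link_up", 1, 1),
    Field("speed", 6, 2),      # assumed encoding: 2 means 1000 Mb/s
    Field("phyra", 10, 1),
])

decoded = STATUS.decode(0x00080383)   # STATUS value from the selftest output
print(decoded)
```

Under that assumed layout the decoded fields agree with what the selftest printed (full duplex, link up, PHYRA clear). A per-card difference then becomes a different field table rather than another if statement.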