luigirizzo / netmap

Automatically exported from code.google.com/p/netmap
BSD 2-Clause "Simplified" License

How to implement a read-only aggregated netmap port #474

Open leleobhz opened 6 years ago

leleobhz commented 6 years ago

Hello!

I'm trying to create a VALE switch with two ports and read from those ports in read-only mode (they must not forward traffic).

My idea is to do the following:

vale-ctl -a vale0:p2p1
vale-ctl -a vale0:p2p2
tshark -i netmap:vale0 -c1 # or tshark -i netmap:vale0/rt -c1 when already open

But with or without jumbo frames, I get the following error:

 root ▶ sa-eqnx2-paris ▶ SSH ▶ ~ ▶ # ▶ vale-ctl -a vale0:p2p1
vale0:p2p1: Invalid argument
 root ▶ sa-eqnx2-paris ▶ SSH ▶ ~ ▶ # ▶ dmesg | tail -n 8
[ 2110.003372] 303.467798 [ 769] netmap_update_config      configuration changed for vale0:p2p1: txring 8 x 512, rxring 8 x 512, rxbufsz 4096
[ 2110.034019] error: large MTU (1500) needed but p2p1 does not support NS_MOREFRAG
[ 2121.736382] 315.200820 [ 884] netmap_get_bdg_na         no bridges available for 'vale0:p2p1'
[ 2121.757298] 315.221736 [ 769] netmap_update_config      configuration changed for vale0:p2p1: txring 8 x 512, rxring 8 x 512, rxbufsz 4096
[ 2121.788724] error: large MTU (1500) needed but p2p1 does not support NS_MOREFRAG
[ 2136.898881] 330.363333 [ 884] netmap_get_bdg_na         no bridges available for 'vale0:p2p1'
[ 2136.920349] 330.384801 [ 769] netmap_update_config      configuration changed for vale0:p2p1: txring 8 x 512, rxring 8 x 512, rxbufsz 4096
[ 2136.952692] error: large MTU (1500) needed but p2p1 does not support NS_MOREFRAG

Or with jumbo frames:

...
[ 1859.366182] 052.830366 [ 884] netmap_get_bdg_na         no bridges available for 'vale0:p2p1'
[ 1859.387060] 052.851245 [ 769] netmap_update_config      configuration changed for vale0:p2p1: txring 8 x 512, rxring 8 x 512, rxbufsz 4096
[ 1859.421323] error: large MTU (9710) needed but p2p1 does not support NS_MOREFRAG
...

What am I doing wrong? Could this be a bug? How can I listen (and listen only, with no forwarding) on more than one interface in a VALE switch?

Thanks!

vmaffione commented 6 years ago

Hi, the "no bridges available" message is misleading, so please ignore it (I just removed that log). The error you see is related to the use of jumbo frames. Which driver is p2p1 using? Is it running in native mode or emulated mode?

vmaffione commented 6 years ago

Moreover, your output does not really make sense... Can you apply the following patch for debugging purposes, and run the same commands again?

diff --git a/sys/dev/netmap/netmap.c b/sys/dev/netmap/netmap.c
index 130f9014..dc047aeb 100644
--- a/sys/dev/netmap/netmap.c
+++ b/sys/dev/netmap/netmap.c
@@ -2118,7 +2118,7 @@ netmap_do_regif(struct netmap_priv_d *priv, struct netmap_adapter *na,
                        unsigned nbs = netmap_mem_bufsize(na->nm_mem);
                        unsigned mtu = nm_os_ifnet_mtu(na->ifp);

-                       ND("mtu %d rx_buf_maxsize %d netmap_buf_size %d",
+                       D("mtu %d rx_buf_maxsize %d netmap_buf_size %d",
                                        mtu, na->rx_buf_maxsize, nbs);

                        if (mtu <= na->rx_buf_maxsize) {

leleobhz commented 6 years ago

Hello!

I'm using ixgbe from master, using the following build script:

#!/bin/bash

# yum install -y kernel-ml kernel-ml-headers kernel-ml-devel kernel-ml-tools-libs-devel kernel-ml-tools

# Last Kernel parsed from kernel-headers/devel
#lastKernel=$(find /usr/src/kernels -mindepth 1 -maxdepth 1 -type d -printf '%p\n' | sort -t\/ -Vk3 | tail -n1 | cut -f2- -d" " | xargs basename)

# Last Kernel parsed from /boot binary images
lastKernel=$(ls /boot/vmlinuz-*[0-9].* | sort -V | tail -n1 | sed -e 's,/boot/vmlinuz-,,g')

# Source/GIT related code
if [[ ! -d ~/netmap_git/netmap ]]; then
        echo "Creating new folder"
        mkdir -p ~/netmap_git
        git clone https://github.com/luigirizzo/netmap.git ~/netmap_git/netmap
else
        echo "Updating existent netmap"
        pushd ~/netmap_git/netmap
        git reset --hard
        git clean -f -x
        git pull
        popd
fi

# Compilation code
pushd ~/netmap_git/netmap
        LANG=C ./configure --drivers=ixgbe --select-version=ixgbe:5.3.6 --prefix=/usr/local/ --enable-vale --enable-pipe --enable-monitor --enable-ptnetmap --enable-sink --no-force-debug --kernel-version="${lastKernel}"
        if [[ $? != 0 ]]; then
                less config.log
                popd
                exit 1
        fi
        V=1 make -j1
        sudo make install
        depmod -a ${lastKernel}
popd

pushd ~/netmap_git/netmap/apps/vale-ctl/
        make
        sudo make install
popd

I'll try to apply the patch and return with the result.

Thanks!

leleobhz commented 6 years ago

Also, the Jumbo Frame configuration is:

 root ▶ sa-eqnx2-paris ▶ SSH ▶ ~ ▶ # ▶ grep mtu /etc/rc.local 
ip link set mtu 9710 dev p2p1 | tee -a /var/log/ixgbe_last.log
ip link set mtu 9710 dev p2p2 | tee -a /var/log/ixgbe_last.log

Does the VALE switch have an MTU setting, or does that not make sense in the netmap context? (I think the second answer is the right one, given the way I think netmap handles its rings.)

vmaffione commented 6 years ago

The MTU for regular VALE ports and ixgbe native ports is simply the size of the netmap buffers. You should also try setting the MTU of the ixgbe ports back to the default (1500).
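For illustration, the check behind the NS_MOREFRAG error can be modeled in user space roughly as follows. This is a sketch, not the netmap source: only the `mtu <= na->rx_buf_maxsize` comparison and the error text appear in this thread. The names `struct adapter_model`, `validate_buf_size`, and `supports_morefrag`, as well as the buffer-size branch, are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the adapter fields the check looks at. */
struct adapter_model {
    const char *name;         /* port name, e.g. "vale0:p2p2" */
    unsigned rx_buf_maxsize;  /* largest frame a single RX slot can hold */
    bool supports_morefrag;   /* assumed flag: packets may span several slots */
};

/*
 * Returns 0 when the MTU / buffer size combination is acceptable,
 * EINVAL otherwise. mtu is the interface MTU, nbs the netmap buffer size.
 */
static int
validate_buf_size(const struct adapter_model *na, unsigned mtu, unsigned nbs)
{
    assert(na != NULL);

    if (mtu <= na->rx_buf_maxsize) {
        /* One slot per packet: the netmap buffer must hold a full MTU. */
        if (nbs < mtu) {
            fprintf(stderr, "error: netmap buf size (%u) < device MTU (%u)\n",
                    nbs, mtu);
            return EINVAL;
        }
        return 0;
    }

    /* The MTU does not fit in one slot, so multi-slot packets are needed. */
    if (!na->supports_morefrag) {
        fprintf(stderr,
                "error: large MTU (%u) needed but %s does not support NS_MOREFRAG\n",
                mtu, na->name);
        return EINVAL;
    }
    return 0;
}
```

Under this model, an MTU of 9710 with `rx_buf_maxsize` 9728 and 9728-byte buffers passes, while any nonzero MTU against an `rx_buf_maxsize` of 0 falls into the multi-slot branch and is rejected.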

leleobhz commented 6 years ago

Hello @vmaffione !

In fact, I'm using the following netmap buffer size:

#Jumbo!!!! https://github.com/luigirizzo/netmap/issues/466
netmap_buf_size=9710
...
# Waiting https://github.com/luigirizzo/netmap/issues/466
ip link set mtu 9710 dev p2p1 | tee -a /var/log/ixgbe_last.log
ip link set mtu 9710 dev p2p2 | tee -a /var/log/ixgbe_last.log
...
if [[ ! -z ${netmap_buf_num} ]]; then
        echo -n "Netmap buf_num: " | tee -a /var/log/ixgbe_last.log
        echo ${netmap_buf_num} | tee -a /sys/module/netmap/parameters/buf_num /var/log/ixgbe_last.log
fi

With your patch and the latest git master, I got the following:

 root ▶ sa-eqnx2-paris ▶ SSH ▶ ~ ▶ netmap_git ❯ netmap ▶ master ▶ ✎ ▶ 5❓ ▶ # ▶ vale-ctl -a vale0:p2p1              
 root ▶ sa-eqnx2-paris ▶ SSH ▶ ~ ▶ netmap_git ❯ netmap ▶ master ▶ ✎ ▶ 5❓ ▶ # ▶ vale-ctl -a vale0:p2p2              
vale0:p2p2: Invalid argument 

And in dmesg:

Load modules:

[Abr11 14:24] ixgbe 0000:04:00.0: removed PHC on p2p1     
[  +0,044005] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,070961] ixgbe 0000:04:00.1: removed PHC on p2p2     
[  +0,553728] netmap: unloaded module.                    
[  +0,021272] 473.783421 [3883] netmap_init               run mknod /dev/netmap c 10 57 # returned 0                
[  +0,013726] netmap: loaded module                       
[  +0,007665] net nmsink0: netmap queues/slots: TX 1/1024, RX 1/1024                                                
[  +0,005352] IPv6: ADDRCONF(NETDEV_UP): nmsink0: link is not ready                                                 
[  +0,023834] Intel(R) 10GbE PCI Express Linux Network Driver - version 5.3.6                                       
[  +0,011908] Copyright(c) 1999 - 2018 Intel Corporation. 
[  +0,030899] ixgbe: Interrupt Mode set to 2              
[  +0,008800] ixgbe: Direct Cache Access (DCA) set to 1   
[  +0,009950] ixgbe: Receive-Side Scaling (RSS) set to 8  
[  +0,010092] ixgbe: Virtual Machine Device Queues (VMDQ) set to 0                                                  
[  +0,011176] ixgbe: I/O Virtualization (IOV) set to 0    
[  +0,010065] ixgbe: Interrupt Throttling Rate (ints/sec) set to 64000                                              
[  +0,011744] ixgbe: Enabled/Disable FCoE offload Disabled                                                          
[  +0,010627] ixgbe: 0000:04:00.0: ixgbe_check_options: FCoE Offload feature disabled                               
[  +0,013453] ixgbe: LRO - Large Receive Offload Enabled  
[  +0,010725] ixgbe: allow_unsupported_sfp Disabled       
[  +0,153900] ixgbe 0000:04:00.0: PCI Express bandwidth of 32GT/s available                                         
[  +0,012771] ixgbe 0000:04:00.0: (Speed:5.0GT/s, Width: x8, Encoding Loss:20%)                                     
[  +0,013602] ixgbe 0000:04:00.0 eth0: MAC: 2, PHY: 14, SFP+: 3, PBA No: G73131-003                                 
[  +0,013775] ixgbe 0000:04:00.0: a0:36:9f:61:d0:ac       
[  +0,010683] ixgbe 0000:04:00.0 eth0: Enabled Features: RxQ: 8 TxQ: 8 FdirHash DCA                                 
[  +0,016215] ixgbe 0000:04:00.0 eth0: Intel(R) 10 Gigabit Network Connection                                       
[  +0,013541] net eth0: netmap queues/slots: TX 8/512, RX 8/512                                                     
[  +0,032743] ixgbe: Interrupt Mode set to 2              
[  +0,010386] ixgbe: Direct Cache Access (DCA) set to 1   
[  +0,011525] ixgbe: Receive-Side Scaling (RSS) set to 8  
[  +0,011682] ixgbe: Virtual Machine Device Queues (VMDQ) set to 0                                                  
[  +0,012774] ixgbe: I/O Virtualization (IOV) set to 0    
[  +0,011642] ixgbe: Interrupt Throttling Rate (ints/sec) set to 64000                                              
[  +0,013323] ixgbe: Enabled/Disable FCoE offload Disabled                                                          
[  +0,012181] ixgbe: 0000:04:00.1: ixgbe_check_options: FCoE Offload feature disabled                               
[  +0,015010] ixgbe: LRO - Large Receive Offload Enabled  
[  +0,012238] ixgbe: allow_unsupported_sfp Disabled       
[  +0,029756] ixgbe 0000:04:00.0 p2p1: renamed from eth0  
[  +0,020662] IPv6: ADDRCONF(NETDEV_UP): p2p1: link is not ready                                                    
[  +0,049801] ixgbe 0000:04:00.0: registered PHC device on p2p1                                                     
[  +0,121098] IPv6: ADDRCONF(NETDEV_UP): p2p1: link is not ready                                                    
[  +0,013458] 8021q: adding VLAN 0 to HW filter on device p2p1                                                      
[  +0,047407] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,249743] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX                                  
[  +0,015673] IPv6: ADDRCONF(NETDEV_CHANGE): p2p1: link becomes ready                                               
[  +0,614353] ixgbe 0000:04:00.1: PCI Express bandwidth of 32GT/s available                                         
[  +0,015164] ixgbe 0000:04:00.1: (Speed:5.0GT/s, Width: x8, Encoding Loss:20%)                                     
[  +0,015971] ixgbe 0000:04:00.1 eth0: MAC: 2, PHY: 1, PBA No: G73131-003                                           
[  +0,014932] ixgbe 0000:04:00.1: a0:36:9f:61:d0:ae       
[  +0,012759] ixgbe 0000:04:00.1 eth0: Enabled Features: RxQ: 8 TxQ: 8 FdirHash DCA                                 
[  +0,018259] ixgbe 0000:04:00.1 eth0: Intel(R) 10 Gigabit Network Connection                                       
[  +0,015463] net eth0: netmap queues/slots: TX 8/512, RX 8/512                                                     
[  +0,015578] ixgbe 0000:04:00.0 p2p1: changing MTU from 1500 to 9710                                               
[  +0,231538] ixgbe 0000:04:00.1 p2p2: renamed from eth0  
[  +0,021408] IPv6: ADDRCONF(NETDEV_UP): p2p2: link is not ready                                                    
[  +0,047942] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,014435] ixgbe 0000:04:00.1: registered PHC device on p2p2                                                     
[  +0,122532] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX                                  
[  +0,001510] IPv6: ADDRCONF(NETDEV_UP): p2p2: link is not ready                                                    
[  +0,028110] 8021q: adding VLAN 0 to HW filter on device p2p2                                                      
[  +1,799626] device p2p1 entered promiscuous mode        
[  +0,296200] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,286999] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,232283] device p2p2 entered promiscuous mode        
[  +0,041257] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,249989] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                   
[  +2,441151] 481.436496 [1324] netmap_config_obj_allocator XXX aligning object by 18 bytes                         
[  +0,275759] 481.712253 [ 769] netmap_update_config      configuration changed for eth0: txring 8 x 512, rxring 8 x 512, rxbufsz 1500
[  +0,028644] 481.740899 [2122] netmap_do_regif           mtu 9710 rx_buf_maxsize 9728 netmap_buf_size 9728         
[  +0,288665] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,215817] 482.245382 [2122] netmap_do_regif           mtu 9710 rx_buf_maxsize 9728 netmap_buf_size 9728         
[  +0,059239] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,435857] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,190896] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                   
[  +1,874995] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +1,357073] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,247929] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                   

After vale-ctl command:

[Abr11 14:25] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,146904] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                   
[  +7,735367] 541.218735 [ 769] netmap_update_config      configuration changed for vale0:p2p1: txring 8 x 512, rxring 8 x 512, rxbufsz 4096
[  +0,030073] 541.248808 [2122] netmap_do_regif           mtu 9710 rx_buf_maxsize 9728 netmap_buf_size 9728         
[  +0,295791] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3   
[  +0,146754] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                   
[ +11,845731] 553.537101 [ 769] netmap_update_config      configuration changed for eth0: txring 8 x 512, rxring 8 x 512, rxbufsz 1500
[  +0,030662] 553.567764 [ 769] netmap_update_config      configuration changed for vale0:p2p2: txring 8 x 512, rxring 8 x 512, rxbufsz 4096
[  +0,031859] 553.599623 [2122] netmap_do_regif           mtu 1500 rx_buf_maxsize 0 netmap_buf_size 9728            
[  +0,019708] error: large MTU (1500) needed but p2p2 does not support NS_MOREFRAG                                  

In fact, p2p1 has a 10GbE twinax cable attached, while p2p2 wasn't connected yet. Do I actually need the interface to have link?

Also, if I try to open the vale0 interface, the application fails and I get the following message in dmesg:

[Abr11 14:29] 792.056269 [ 363] nm_find_bridge            invalid bridge name vale0                                                                  

Is something missing? I'm a bit wary of the VALE switch, because I just need two interfaces to be readable through the same ring, so that I can listen to 20 Gbps on a single virtual interface. Can you please help me with this?

Thanks!!!

vmaffione commented 6 years ago

There is a bug here: rx_buf_maxsize 0, while it should never be zero. You have multiple interfaces, so I don't really understand what is going on. Can you try the latest master with this additional patch:

diff --git a/sys/dev/netmap/netmap.c b/sys/dev/netmap/netmap.c
index fa7ad0d2..5fcfd70b 100644
--- a/sys/dev/netmap/netmap.c
+++ b/sys/dev/netmap/netmap.c
@@ -2118,7 +2118,7 @@ netmap_do_regif(struct netmap_priv_d *priv, struct netmap_adapter *na,
                        unsigned nbs = netmap_mem_bufsize(na->nm_mem);
                        unsigned mtu = nm_os_ifnet_mtu(na->ifp);

-                       ND("%s: mtu %d rx_buf_maxsize %d netmap_buf_size %d",
+                       D("%s: mtu %d rx_buf_maxsize %d netmap_buf_size %d",
                                        na->name, mtu, na->rx_buf_maxsize, nbs);

                        if (mtu <= na->rx_buf_maxsize) {

?

What you want to do is a sort of aggregated unidirectional port. You could do that with VALE, but the default learning switch algorithm won't work for non-broadcast traffic (or would be inefficient). You would need to program a VALE switch by replacing the forwarding function with a function that just statically forwards anything arriving on ports B and C towards port A. That is feasible, but you need to learn how to write a kernel module to program VALE.
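The static forwarding decision described above can be sketched as a small user-space model. This is not the VALE kernel API: the real lookup callback has a different signature that depends on the netmap version, and the names `PORT_A`, `PORT_NONE`, and `static_lookup` are hypothetical. The sketch only shows the rule "everything arriving on B or C goes to A, and nothing arriving on A is forwarded".

```c
#include <assert.h>
#include <stdint.h>

#define PORT_A    0u    /* hypothetical index of the aggregating (listening) port */
#define PORT_NONE 255u  /* hypothetical "drop this packet" destination */

/*
 * Given the index of the port a packet arrived on, return the index of
 * the port it should be delivered to. No MAC learning is involved.
 */
static uint32_t
static_lookup(uint32_t src_port, uint32_t nports)
{
    assert(src_port < nports);

    /* Traffic injected on the listening port is never forwarded,
     * which keeps the physical ports effectively read-only. */
    if (src_port == PORT_A)
        return PORT_NONE;

    /* Anything from the other ports is aggregated onto port A. */
    return PORT_A;
}
```

With three ports attached (A = 0, B = 1, C = 2), `static_lookup(1, 3)` and `static_lookup(2, 3)` both return `PORT_A`, and `static_lookup(0, 3)` returns `PORT_NONE`.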

An easier approach would be to reuse the fe example program from the netmap tutorial: https://github.com/netmap-unipi/netmap-tutorial. You can simplify the program with the attached patch, fe-patch.txt. Then, by running this

   $ sudo ./fe -i netmap:lo{1 -i netmap:p2p1 -i netmap:p2p2

traffic from p2p1 and p2p2 will be aggregated into the lo{1 pipe. You can then read the traffic from the other end of the pipe, lo}1:

   $ tshark -i netmap:lo}1 -c1

The fe program is just an example, and does a copy when forwarding. You could extend it with zerocopy and save CPU cycles.

leleobhz commented 6 years ago

Hello @vmaffione

About the bug, compiling with

        cat <<EOF > issue474.diff
diff --git a/sys/dev/netmap/netmap.c b/sys/dev/netmap/netmap.c
index fa7ad0d..5fcfd70 100644
--- a/sys/dev/netmap/netmap.c
+++ b/sys/dev/netmap/netmap.c
@@ -2118,7 +2118,7 @@ netmap_do_regif(struct netmap_priv_d *priv, struct netmap_adapter *na,
                        unsigned nbs = netmap_mem_bufsize(na->nm_mem);
                        unsigned mtu = nm_os_ifnet_mtu(na->ifp);

-                       ND("%s: mtu %d rx_buf_maxsize %d netmap_buf_size %d",
+                       D("%s: mtu %d rx_buf_maxsize %d netmap_buf_size %d",
                                        na->name, mtu, na->rx_buf_maxsize, nbs);

                        if (mtu <= na->rx_buf_maxsize) {
EOF

git apply issue474.diff || exit 1

Loading netmap, starting a capture, stopping the capture, attaching to the VALE switch, and detaching from the VALE switch results in the following dmesg output:

[173455.082298] netmap: unloaded module.                  
[173455.110188] 648.750742 [3883] netmap_init               run mknod /dev/netmap c 10 57 # returned 0              
[173455.129619] netmap: loaded module                     
[173455.142904] net nmsink0: netmap queues/slots: TX 1/1024, RX 1/1024                                              
[173455.147981] IPv6: ADDRCONF(NETDEV_UP): nmsink0: link is not ready                                               
[173455.181542] Intel(R) 10GbE PCI Express Linux Network Driver - version 5.3.6                                     
[173455.198963] Copyright(c) 1999 - 2018 Intel Corporation.                                                         
[173455.235165] ixgbe: Interrupt Mode set to 2            
[173455.249158] ixgbe: Direct Cache Access (DCA) set to 1 
[173455.264079] ixgbe: Receive-Side Scaling (RSS) set to 8                                                          
[173455.278952] ixgbe: Virtual Machine Device Queues (VMDQ) set to 0                                                
[173455.294704] ixgbe: I/O Virtualization (IOV) set to 0  
[173455.309090] ixgbe: Interrupt Throttling Rate (ints/sec) set to 64000                                            
[173455.324948] ixgbe: Enabled/Disable FCoE offload Disabled                                                        
[173455.339570] ixgbe: 0000:04:00.0: ixgbe_check_options: FCoE Offload feature disabled                             
[173455.357059] ixgbe: LRO - Large Receive Offload Enabled                                                          
[173455.371669] ixgbe: allow_unsupported_sfp Disabled     
[173455.532722] ixgbe 0000:04:00.0: PCI Express bandwidth of 32GT/s available                                       
[173455.549469] ixgbe 0000:04:00.0: (Speed:5.0GT/s, Width: x8, Encoding Loss:20%)                                   
[173455.566911] ixgbe 0000:04:00.0 eth0: MAC: 2, PHY: 14, SFP+: 3, PBA No: G73131-003                               
[173455.584521] ixgbe 0000:04:00.0: a0:36:9f:61:d0:ac     
[173455.598939] ixgbe 0000:04:00.0 eth0: Enabled Features: RxQ: 8 TxQ: 8 FdirHash DCA                               
[173455.618621] ixgbe 0000:04:00.0 eth0: Intel(R) 10 Gigabit Network Connection                                     
[173455.635338] net eth0: netmap queues/slots: TX 8/512, RX 8/512                                                   
[173455.670995] ixgbe: Interrupt Mode set to 2            
[173455.684072] ixgbe: Direct Cache Access (DCA) set to 1 
[173455.698222] ixgbe: Receive-Side Scaling (RSS) set to 8                                                          
[173455.712295] ixgbe: Virtual Machine Device Queues (VMDQ) set to 0                                                
[173455.727240] ixgbe: I/O Virtualization (IOV) set to 0  
[173455.740867] ixgbe: Interrupt Throttling Rate (ints/sec) set to 64000                                            
[173455.755896] ixgbe: Enabled/Disable FCoE offload Disabled                                                        
[173455.769598] ixgbe: 0000:04:00.1: ixgbe_check_options: FCoE Offload feature disabled                             
[173455.786300] ixgbe: LRO - Large Receive Offload Enabled                                                          
[173455.801362] ixgbe: allow_unsupported_sfp Disabled     
[173455.803678] ixgbe 0000:04:00.0 p2p1: renamed from eth0                                                          
[173455.834289] IPv6: ADDRCONF(NETDEV_UP): p2p1: link is not ready                                                  
[173455.882680] ixgbe 0000:04:00.0: registered PHC device on p2p1                                                   
[173456.004283] IPv6: ADDRCONF(NETDEV_UP): p2p1: link is not ready                                                  
[173456.017984] 8021q: adding VLAN 0 to HW filter on device p2p1                                                    
[173456.064874] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173456.210797] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX                                
[173456.226455] IPv6: ADDRCONF(NETDEV_CHANGE): p2p1: link becomes ready                                             
[173456.944836] ixgbe 0000:04:00.1: PCI Express bandwidth of 32GT/s available                                       
[173456.959904] ixgbe 0000:04:00.1: (Speed:5.0GT/s, Width: x8, Encoding Loss:20%)                                   
[173456.975724] ixgbe 0000:04:00.1 eth0: MAC: 2, PHY: 1, PBA No: G73131-003                                         
[173456.990729] ixgbe 0000:04:00.1: a0:36:9f:61:d0:ae     
[173457.003513] ixgbe 0000:04:00.1 eth0: Enabled Features: RxQ: 8 TxQ: 8 FdirHash DCA                               
[173457.021821] ixgbe 0000:04:00.1 eth0: Intel(R) 10 Gigabit Network Connection                                     
[173457.037245] net eth0: netmap queues/slots: TX 8/512, RX 8/512                                                   
[173457.052926] ixgbe 0000:04:00.0 p2p1: changing MTU from 1500 to 9710                                             
[173457.284556] ixgbe 0000:04:00.1 p2p2: renamed from eth0                                                          
[173457.305102] IPv6: ADDRCONF(NETDEV_UP): p2p2: link is not ready                                                  
[173457.352798] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173457.367322] ixgbe 0000:04:00.1: registered PHC device on p2p2                                                   
[173457.484250] IPv6: ADDRCONF(NETDEV_UP): p2p2: link is not ready                                                  
[173457.498203] 8021q: adding VLAN 0 to HW filter on device p2p2                                                    
[173457.690887] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX                                
[173459.322326] device p2p1 entered promiscuous mode      
[173459.616249] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173459.907043] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173460.143888] device p2p2 entered promiscuous mode      
[173460.184420] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173460.330780] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                 
[173462.869061] 656.509624 [1324] netmap_config_obj_allocator XXX aligning object by 18 bytes                       
[173463.146777] 656.787337 [ 769] netmap_update_config      configuration changed for eth0: txring 8 x 512, rxring 8 x 512, rxbufsz 1500
[173463.175324] 656.815885 [2122] netmap_do_regif           eth0: mtu 9710 rx_buf_maxsize 9728 netmap_buf_size 9728 
[173463.472050] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173463.685270] 657.325832 [2122] netmap_do_regif           eth0: mtu 9710 rx_buf_maxsize 9728 netmap_buf_size 9728 
[173463.748536] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173464.167124] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173464.274792] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                 
[173466.248180] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173467.607826] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173467.850811] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                 
[173479.688662] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173479.834797] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                 
[173481.960847] 675.601428 [2122] netmap_do_regif           vale0:p2p1: mtu 9710 rx_buf_maxsize 9728 netmap_buf_size 9728
[173482.264131] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173482.546891] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                 
[173483.256301] 676.896883 [ 769] netmap_update_config      configuration changed for eth0: txring 8 x 512, rxring 8 x 512, rxbufsz 1500
[173483.286449] 676.927032 [ 769] netmap_update_config      configuration changed for vale0:p2p2: txring 8 x 512, rxring 8 x 512, rxbufsz 1500
[173483.317962] 676.958545 [2122] netmap_do_regif           vale0:p2p2: mtu 1500 rx_buf_maxsize 0 netmap_buf_size 9728
[173483.338760] error: large MTU (1500) needed but p2p2 does not support NS_MOREFRAG                                
[173491.280307] ixgbe 0000:04:00.0 p2p1: detected SFP+: 3 
[173491.426774] ixgbe 0000:04:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: None                                 

About the fe app: I'm neither a skilled kernel developer nor an application developer, but I'll try to find a way. I'll test the attached patch and see what happens!

Thanks!

leleobhz commented 6 years ago

About the fe util:

I made the following changes: fe-no-distinct.diff.txt (a bit more aggressive than your original patch, but your fe-patch.txt helped me a lot).

It worked well and as expected, although CPU usage was a bit high and only one core was used. As a non-skilled dev, I suspect the copying overhead comes just from the memcpy(), but I don't know how to place the packet directly in the netmap ring. Is there a function to do this?

Also, if I want to write a multithreaded application, where do I need to start? What do I need to take care of regarding netmap and threading?

Thanks a lot!

vmaffione commented 6 years ago

The rx_buf_maxsize == 0 bug should now be fixed in the current master. Can you please retry? No patches needed.

Regarding the fe example, you can avoid packet copies entirely by swapping the netmap slot of a NIC RX ring with a netmap slot of the pipe TX ring. You can find an example of netmap slot swapping in the solutions/swap.c example; look for the zerocopy option variable. For multithreading, you need a multi-queue NIC (like ixgbe). If you configure the NICs to have 4 RX queues, you can use a separate thread to handle each RX queue (e.g. thread0 handles queue0 of p2p1 and queue0 of p2p2, and so on); each thread can then forward to a different pipe, and you can run 4 tshark instances on the four pipes.

vmaffione commented 6 years ago

In general, I would suggest looking at the netmap tutorial (intro.pdf) and doing the codelab.

leleobhz commented 6 years ago

Hello @vmaffione

I'll test again without patches! Also, I'll read the documents and try the threading idea. I'm really not a skilled developer, so I'm taking baby steps here. Sorry about all the questions.

I'll try to write the program following the guidance you gave in your last reply.

Thank you so much!

jmtilli commented 6 years ago

About the multi-threading: the only option is not to use netmap directly. One possibility is to use a wrapper around netmap. If you do that, you gain support for alternative packet I/O mechanisms such as DPDK and sockets. The drawback is, of course, somewhat reduced performance. However, I have found such packet I/O wrappers easier to use than netmap itself.

Probably the most popular packet I/O wrapper is OpenDataPlane (ODP) at https://github.com/Linaro/odp

I have created my own wrapper, called LDP, which is optimized for use with netmap and has higher performance than ODP: https://github.com/jmtilli/pptk/tree/master/ldp

At least LDP has a very simple multi-threaded example application called ldpfwdmt. Note that for LDP to work, you need to set the queue count to match the thread count by using the ethtool -L command.

vmaffione commented 6 years ago

With netmap you can handle each ring or each couple of TX/RX rings using a separate thread. Typically, each thread calls nm_open("eth0-X", ...) with X in [0..N-1], like pkt-gen does. That's all you need to write a multi-threaded program. I don't see how using "wrappers" is the only option.
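As a sketch of the per-thread setup discussed in this thread, the helper below builds the port names one worker would open: ring pair N of each NIC (the `netmap:<ifname>-N` form) plus its own pipe endpoint (the `netmap:<ifname>{N` form). The names `struct thread_plan` and `make_thread_plan` are hypothetical, and numbering pipes by queue index is an assumption; the actual nm_open() calls and the threads themselves are left out so the example stays self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PORT_NAME_MAX 64

/* Port names a single worker thread would pass to nm_open(). */
struct thread_plan {
    char rx_a[PORT_NAME_MAX];     /* ring pair N of the first NIC */
    char rx_b[PORT_NAME_MAX];     /* ring pair N of the second NIC */
    char tx_pipe[PORT_NAME_MAX];  /* master end of pipe N */
};

/* True if snprintf() wrote the whole string without truncation. */
static int
fits(int n)
{
    return n >= 0 && n < PORT_NAME_MAX;
}

/* Fill in the plan for worker number `queue`. Returns 0 on success, -1 on error. */
static int
make_thread_plan(struct thread_plan *plan, const char *nic_a,
                 const char *nic_b, const char *pipe_parent, int queue)
{
    assert(plan != NULL && queue >= 0);

    if (strlen(nic_a) == 0 || strlen(nic_b) == 0 || strlen(pipe_parent) == 0)
        return -1;

    int a = snprintf(plan->rx_a, sizeof(plan->rx_a),
                     "netmap:%s-%d", nic_a, queue);
    int b = snprintf(plan->rx_b, sizeof(plan->rx_b),
                     "netmap:%s-%d", nic_b, queue);
    int p = snprintf(plan->tx_pipe, sizeof(plan->tx_pipe),
                     "netmap:%s{%d", pipe_parent, queue);

    return (fits(a) && fits(b) && fits(p)) ? 0 : -1;
}
```

For worker 0 with NICs p2p1 and p2p2 and pipes on lo, this yields `netmap:p2p1-0`, `netmap:p2p2-0`, and `netmap:lo{0`; a capture tool would then read from the other end of that pipe, `netmap:lo}0`.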