luigirizzo / netmap

Automatically exported from code.google.com/p/netmap
BSD 2-Clause "Simplified" License

netmap-libpcap crashes with jumbo packets set in "/sys/module/netmap_lin/parameters/buf_size" #25

Closed GoogleCodeExporter closed 7 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Set "buf_size" to 9000 bytes
echo 9000 > /sys/module/netmap_lin/parameters/buf_size
2. Launch tcpdump using netmap-libpcap
sudo taskset -c 0 sudo tcpdump -s 0 -nei netmap:eth5 -B 2000000 -w /mnt/tmpfs/t.pcap

What is the expected output? What do you see instead?
tcpdump should be able to capture packets when jumbo frames are in use. Instead, 
tcpdump exits and the following errors are logged in dmesg:

dmesg
[85444.661950] 549.216701 [1518] netmap_set_ringid         eth5: tx [0,12) rx [0,12) id 0
[85444.746468] 549.301141 [2466] netmap_reset              eth5 TX0 hwofs 0 -> 0, hwtail 511 -> 511
[85444.751170] 549.305849 [2466] netmap_reset              eth5 TX1 hwofs 0 -> 0, hwtail 511 -> 511
[85444.755895] 549.310561 [2466] netmap_reset              eth5 TX2 hwofs 0 -> 0, hwtail 511 -> 511
[85444.759668] 549.314344 [2466] netmap_reset              eth5 TX3 hwofs 0 -> 0, hwtail 511 -> 511
[85444.763666] 549.318339 [2466] netmap_reset              eth5 TX4 hwofs 0 -> 0, hwtail 511 -> 511
[85444.767676] 549.322346 [2466] netmap_reset              eth5 TX5 hwofs 0 -> 0, hwtail 511 -> 511
[85444.769695] 549.324363 [2466] netmap_reset              eth5 TX6 hwofs 0 -> 0, hwtail 511 -> 511
[85444.771715] 549.326381 [2466] netmap_reset              eth5 TX7 hwofs 0 -> 0, hwtail 511 -> 511
[85444.773735] 549.328399 [2466] netmap_reset              eth5 TX8 hwofs 0 -> 0, hwtail 511 -> 511
[85444.775753] 549.330416 [2466] netmap_reset              eth5 TX9 hwofs 0 -> 0, hwtail 511 -> 511
[85444.777773] 549.332434 [2466] netmap_reset              eth5 TX10 hwofs 0 -> 0, hwtail 511 -> 511
[85444.779793] 549.334453 [2466] netmap_reset              eth5 TX11 hwofs 0 -> 0, hwtail 511 -> 511
[85444.781840] 549.336499 [2466] netmap_reset              eth5 RX0 hwofs 0 -> 0, hwtail 0 -> 0
[85444.783871] 549.338528 [2466] netmap_reset              eth5 RX1 hwofs 0 -> 0, hwtail 0 -> 0
[85444.785904] 549.340560 [2466] netmap_reset              eth5 RX2 hwofs 0 -> 0, hwtail 0 -> 0
[85444.787939] 549.342593 [2466] netmap_reset              eth5 RX3 hwofs 0 -> 0, hwtail 0 -> 0
[85444.789972] 549.344624 [2466] netmap_reset              eth5 RX4 hwofs 0 -> 0, hwtail 0 -> 0
[85444.792007] 549.346657 [2466] netmap_reset              eth5 RX5 hwofs 0 -> 0, hwtail 0 -> 0
[85444.794041] 549.348690 [2466] netmap_reset              eth5 RX6 hwofs 0 -> 0, hwtail 0 -> 0
[85444.796074] 549.350722 [2466] netmap_reset              eth5 RX7 hwofs 0 -> 0, hwtail 0 -> 0
[85444.798107] 549.352753 [2466] netmap_reset              eth5 RX8 hwofs 0 -> 0, hwtail 0 -> 0
[85444.800142] 549.354786 [2466] netmap_reset              eth5 RX9 hwofs 0 -> 0, hwtail 0 -> 0
[85444.802175] 549.356817 [2466] netmap_reset              eth5 RX10 hwofs 0 -> 0, hwtail 0 -> 0
[85444.804210] 549.358851 [2466] netmap_reset              eth5 RX11 hwofs 0 -> 0, hwtail 0 -> 0
[85444.979932] ixgbe 0000:06:00.1 eth5: detected SFP+: 6
[85444.980177] 549.534683 [2516] netmap_common_irq         received TX queue 1
[85445.010700] 549.565183 [2516] netmap_common_irq         received RX queue 1
[85445.039196] 549.593657 [2516] netmap_common_irq         received TX queue 11
[85445.065636] 549.620077 [2516] netmap_common_irq         received RX queue 11
[85445.091867] 549.646287 [2516] netmap_common_irq         received TX queue 0
[85445.185822] tcpdump[3734]: segfault at 17 ip 00007fb9e224d384 sp 00007fffe8f264e0 error 4 in libpcap.so.1.6.0-PRE-GIT[7fb9e223c000+3f000]
[85445.236048] ixgbe 0000:06:00.1 eth5: NIC Link is Up 10 Gbps, Flow Control: RX
[85445.314795] 549.869041 [ 710] netmap_do_unregif         deleting last instance for eth5
[85445.572375] 550.126420 [1362] netmap_mem_global_deref   refcount = 0
[85445.668471] ixgbe 0000:06:00.1 eth5: detected SFP+: 6
[85445.796483] ixgbe 0000:06:00.1 eth5: NIC Link is Up 10 Gbps, Flow Control: RX
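
The segfault line in the log above can be decoded a little further. The sketch below is not part of the original report; it parses that one line and computes the offset of the faulting instruction inside libpcap. It assumes the usual x86 page-fault error-code bits (bit 0 = page present, bit 1 = write access, bit 2 = user mode). The names `LINE`, `PATTERN`, and `decode_segfault` are made up for illustration.

```python
import re

# The segfault line from the dmesg output above, joined onto one line.
LINE = ("tcpdump[3734]: segfault at 17 ip 00007fb9e224d384 sp "
        "00007fffe8f264e0 error 4 in libpcap.so.1.6.0-PRE-GIT[7fb9e223c000+3f000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Extract the fault address, library offset and access type from a segfault line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m["err"])
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),                  # address that was accessed
        "offset": int(m["ip"], 16) - int(m["base"], 16),   # instruction offset in the mapping
        "present": bool(err & 1),                          # assumed x86 error-code bit 0
        "write": bool(err & 2),                            # assumed x86 error-code bit 1
        "user": bool(err & 4),                             # assumed x86 error-code bit 2
    }


info = decode_segfault(LINE)
print(info["object"], hex(info["offset"]), hex(info["fault_addr"]))
```

Under those assumptions the line describes a user-mode read of an unmapped address very close to NULL. With a libpcap build that carries debug symbols, the computed offset could be passed to a tool such as `addr2line` to locate the faulting source line.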

What version of the product are you using? On what operating system?
netmap master
netmap-libpcap master
tcpdump 4.6.2
Ubuntu 14.04

Original issue reported on code.google.com by morph...@gmail.com on 21 Oct 2014 at 10:17

GoogleCodeExporter commented 9 years ago
Hi morphyno,
We fixed a problem with jumbo frames on ixgbe, although not in combination with libpcap.
The fixes are in the "next" branch of netmap. This problem may be related.

Can you try the "next" branch?

Thanks,
Stefano

Original comment by stefanog...@gmail.com on 22 Oct 2014 at 11:45

GoogleCodeExporter commented 9 years ago
Hi Stefano:

Jumbo packets do seem to work with the "next" branch of netmap. However, 
here is what I observed.

I set "buf_size" to 9000, and in dmesg I see that bufsz is set to 9088:

[ 2834.998629] 050.285168 [ 460] ixgbe_netmap_configure_srrctl bufsz: 9088 srrctl: 8

However, the actual MTU appears to be 8196 bytes: as soon as I send packets 
larger than 8196 bytes, I see fragmentation. That is 892 bytes less than 
bufsz, which is a pretty big difference.

Is buf_size a software buffer size that includes objects other than the skb?
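
One possible reading of the log line above, sketched here as an assumption rather than a confirmed diagnosis: if the `srrctl` value is the receive buffer size expressed in 1 KiB units (a shift of 10, which is how the ixgbe SRRCTL packet-buffer field is commonly described), then a `bufsz` of 9088 is truncated to 8 KiB in hardware. The constant and function names below are illustrative and are not taken from the driver.

```python
# Assumption: the logged srrctl value counts whole 1 KiB units of receive buffer.
BSIZEPKT_SHIFT = 10


def srrctl_field(bufsz):
    """Value written to the register field: bufsz in whole KiB, rounded down."""
    return bufsz >> BSIZEPKT_SHIFT


def hw_rx_buffer_bytes(bufsz):
    """Receive buffer size the NIC would actually use under this assumption."""
    return srrctl_field(bufsz) << BSIZEPKT_SHIFT


bufsz = 9088          # from the dmesg line
observed_limit = 8196  # largest unfragmented packet reported above

print("srrctl field:", srrctl_field(bufsz))
print("hardware buffer bytes:", hw_rx_buffer_bytes(bufsz))
print("bufsz minus observed limit:", bufsz - observed_limit)
```

The field value matches the `srrctl: 8` in the log, and the resulting 8192-byte buffer lies within a few bytes of the observed 8196-byte limit, which is at least consistent with the fragmentation described.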

Original comment by morph...@gmail.com on 27 Oct 2014 at 8:06

GoogleCodeExporter commented 9 years ago
Hi morphyno,
bufsz is set to 9088 for alignment reasons.

As for the MTU, it is not clear to me whether you set it to 9000 on both sides 
(tx and rx).
Can you describe the test environment (tools for traffic generation, NIC 
setup, and how the tx and rx NICs are interconnected)?
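
The thread does not say which alignment unit is applied. As a purely arithmetic illustration, the sketch below rounds the requested size up to a few candidate power-of-two units; `round_up` is a hypothetical helper, not a netmap function.

```python
def round_up(value, unit):
    """Round value up to the next multiple of unit (unit must be a power of two)."""
    if unit <= 0 or unit & (unit - 1):
        raise ValueError("unit must be a positive power of two")
    return (value + unit - 1) & ~(unit - 1)


requested = 9000
for unit in (64, 128, 256):
    print(unit, round_up(requested, unit))
```

Of these candidates only the 128-byte unit reproduces the logged 9088. That is an inference from the numbers alone and is not confirmed by anything in this thread.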

Cheers,
Stefano

Original comment by stefanog...@gmail.com on 31 Oct 2014 at 2:17

GoogleCodeExporter commented 9 years ago

Original comment by stefanog...@gmail.com on 31 Oct 2014 at 2:19

GoogleCodeExporter commented 9 years ago
Hi Stefano:

I actually avoided using jumbo packets for some time. I'm using tcpreplay over 
netmap to play pcaps that contain jumbo packets (the largest is 9200 bytes). I'm 
able to do so when I set /sys/module/netmap/parameters/buf_size to 9300. I also 
use tcpdump over netmap-libpcap to capture the packets looped back from my 
device under test (DUT).

tcpreplay -> netmap:ethX -> DUT -> netmap:eth7 -> tcpdump

I'm noticing something very interesting. When I run both tcpreplay and tcpdump 
in netmap mode (they use different interfaces), I see the following errors in 
dmesg:

[2549737.221434] 057.187385 [ 798] generic_netmap_dtor       Restored native NA           (null)
[2549738.286908] 058.252026 [1786] netmap_interp_ringid      eth0: tx [0,12) rx [0,12) id 0
[2549738.350512] 058.315582 [ 356] nm_mem_assign_group       iommu_group 0
[2549738.416472] 058.381491 [ 507] netmap_obj_malloc         no more netmap_buf objects
[2549738.482315] 058.447283 [ 672] netmap_new_bufs           no more buffers after 444 of 512
[2549738.548963] 058.513879 [1360] netmap_mem_rings_create   Cannot allocate buffers for tx_ring
[2549738.617586] 058.582448 [1516] netmap_mem_global_deref   refcount = 1

This only happens if I have the buffer size set to a very large number. Is there 
any other parameter I need to adjust along with buf_size?
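
To put the "no more buffers" messages in perspective, here is a rough estimate of how many buffers the rings ask for and how much memory that represents at different buffer sizes. It is only a sketch: it counts hardware rings alone, takes the 12 tx and 12 rx rings and the 512 slots per ring from the log lines above, assumes the second interface has the same layout, and uses 2048 bytes purely as a small comparison value. The function names are illustrative.

```python
def buffers_for_interface(tx_rings, rx_rings, slots_per_ring):
    """Buffers needed to give every slot of every hardware ring one buffer."""
    return (tx_rings + rx_rings) * slots_per_ring


def pool_bytes(num_buffers, buf_size):
    """Memory needed to back num_buffers buffers of buf_size bytes each."""
    return num_buffers * buf_size


per_if = buffers_for_interface(tx_rings=12, rx_rings=12, slots_per_ring=512)
both = 2 * per_if  # assumption: tcpreplay and tcpdump each open a similar interface

for size in (2048, 9300):
    mib = pool_bytes(both, size) / (1024 * 1024)
    print(f"buf_size={size}: {both} buffers, {mib:.1f} MiB")
```

The netmap(4) manual page documents a `buf_num` parameter next to `buf_size`; whether raising it resolves this particular failure is not established in this thread.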

Original comment by morgan.y...@gmail.com on 4 Feb 2015 at 8:06

GoogleCodeExporter commented 9 years ago
Attached is the dmesg, tcpdump over netmap-libpcap is using eth13, tcpreplay 
over netmap is using eth12

Original comment by morgan.y...@gmail.com on 4 Feb 2015 at 8:08

Attachments:

vmaffione commented 7 years ago

Is the problem still there?