emmericp / MoonGen

MoonGen is a fully scriptable high-speed packet generator built on DPDK and LuaJIT. It can saturate a 10 Gbit/s connection with 64 byte packets on a single CPU core while executing user-provided Lua scripts for each packet. Multi-core support allows for even higher rates. It also features precise and accurate timestamping and rate control.
MIT License

Mellanox 100G interface segmentation fault at the receiver side #313

Open edgar-costa opened 3 years ago

edgar-costa commented 3 years ago

Hi all,

By following the existing issues and docs, I managed to get some 100G Mellanox (MT28800) interfaces apparently working. I installed the OFED LINUX 4.9.2.2.4 drivers and built MoonGen with the --mlx5 flag; as far as I can tell, there were no errors.

I tested the sending side with the libmoon/examples/pktgen.lua script and managed to send at 100 Gbit/s using 2 ports at the same time with quite small packet sizes.

However, the problem comes when I add a receiver. It always segfaults, no matter what I put in the receiving code. I noticed that it receives some packets before crashing, so I wrote a program to count them and to check whether something was wrong with the packet content:

function recv(queue, id)
    print("Listening....")
    local bufs = memory.bufArray()
    local count = 0

    while mg.running() do
        local rx = queue:recv(bufs)
        for i = 1, rx do
            count = count + 1
            local pkt = bufs[i]:getUdp4Packet()
            print(pkt.eth:getString() .. " " .. count)

        end
        bufs:freeAll()
    end
    print(id .. " Total Packets Received: " .. count)
end

I observed that it crashes with a segmentation fault after receiving as many packets as there are RX descriptors (rxDescs). For example, with rxDescs set to 512, the program above prints:

...
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 503
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 504
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 505
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 506
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 507
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 508
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 509
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 510
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 511
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 512
Segmentation fault

Or, if I change it to, say, 4096:

...
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4088
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4089
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4090
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4091
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4092
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4093
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4094
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4095
Segmentation fault
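
For context, rxDescs is the size of the receive descriptor ring passed to device.config when the port is set up. The following is only a minimal sketch of how a receiver like the one above might be configured and started; the port number, the queue counts, and the simplified recv task (which counts whole batches instead of printing each packet) are placeholders for illustration, not values taken from the original script.

```lua
local mg     = require "moongen"
local memory = require "memory"
local device = require "device"

-- Assumption: port 0 is the receiving Mellanox port; adjust to your setup.
local RX_PORT  = 0
-- Size of the RX descriptor ring; the crash was seen after about this many packets.
local RX_DESCS = 512

function master()
    -- Configure the port with one RX queue and an explicit descriptor ring size.
    local dev = device.config{
        port     = RX_PORT,
        rxQueues = 1,
        txQueues = 1,
        rxDescs  = RX_DESCS,
    }
    device.waitForLinks()
    -- Run the receive loop in its own task and wait until it exits.
    mg.startTask("recv", dev:getRxQueue(0), RX_PORT)
    mg.waitForTasks()
end

-- Simplified receive task: counts packets per batch, no per-packet printing.
function recv(queue, id)
    local bufs = memory.bufArray()
    local count = 0
    while mg.running() do
        local rx = queue:recv(bufs)
        count = count + rx
        bufs:freeAll()
    end
    print(id .. " Total Packets Received: " .. count)
end
```

Changing RX_DESCS in this sketch is the equivalent of the 512 and 4096 runs shown above, which makes it easy to check whether the crash point keeps tracking the ring size.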

Any idea what the problem could be?