Open: adw555 opened this issue 8 years ago
@adw555 Yes, I rewrote RawSocket to use ljsyscall in January, although I think it has been patched a few times since. When I did the rewrite I only checked that the selftest still passed, but I didn't try example_spray. I will take a look. Thanks for reporting!
@adw555 I have checked that the number of transmitted packets is significantly lower in the May release than in the October release. I ran example_spray reading from a pcap file, instead of an interface, and writing to a file.
In my tests I used a pcap file of 40K packets:
$ capinfos v4v6.pcap | grep "Number"
Number of packets: 40 k
Then I ran example_spray on each release from October to May.
$ sudo ./snabb example_spray v4v6.pcap /tmp/output.pcap
Here are my results:
v2015.10
link report:
40,000 sent on capture.output -> spray_app.input (loss rate: 0%)
20,000 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2015.11
link report:
40,000 sent on capture.output -> spray_app.input (loss rate: 0%)
20,000 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2015.12
link report:
40,000 sent on capture.output -> spray_app.input (loss rate: 0%)
20,000 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2016.01
link report:
36,465 sent on capture.output -> spray_app.input (loss rate: 0%)
18,232 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2016.02
link report:
39,780 sent on capture.output -> spray_app.input (loss rate: 0%)
19,890 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2016.03
link report:
40,000 sent on capture.output -> spray_app.input (loss rate: 0%)
20,000 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2016.04
link report:
40,000 sent on capture.output -> spray_app.input (loss rate: 0%)
20,000 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2016.05
link report:
9,690 sent on capture.output -> spray_app.input (loss rate: 0%)
4,845 sent on spray_app.output -> output_file.input (loss rate: 0%)
Could you give the April release a try? If it's working OK, then something has changed in v2016.05.
That is interesting, since only the selftest changed between 04 and 05, i.e. it must be a change in a dependency.
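For anyone repeating this comparison, here is a rough sketch for checking the link report automatically instead of by eye. The function names `sent_count` and `link_is_complete` are made up for this example and are not part of Snabb. It assumes the link report is printed on stdout, one link per line, with the packet count as the first field, as in the output above.

```shell
# Hypothetical helpers for reading a Snabb link report; not part of Snabb.

# Print the packet count for link "$1" from a link report read on stdin.
# Assumes one link per line with the count as the first field, e.g.
#   9,690 sent on capture.output -> spray_app.input (loss rate: 0%)
sent_count() {
    grep -F "sent on $1 " | head -n 1 | awk '{ gsub(",", "", $1); print $1 }'
}

# Succeed only if link "$1" carried exactly "$2" packets.
# The exit status makes it usable as a pass/fail check in a script.
link_is_complete() {
    count=$(sent_count "$1")
    [ -n "$count" ] && [ "$count" -eq "$2" ]
}
```

Piping a run into the check, for example `sudo ./snabb example_spray v4v6.pcap /tmp/output.pcap | link_is_complete capture.output 40000`, would then fail on any release that did not read the whole file. Because it only relies on the exit status, the same check could probably serve as the test command for `git bisect run`.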
% git diff v2016.04...v2016.05 apps/socket/
diff --git a/src/apps/socket/raw.lua b/src/apps/socket/raw.lua
index f7e6dc7..a6066b3 100644
--- a/src/apps/socket/raw.lua
+++ b/src/apps/socket/raw.lua
@@ -80,16 +80,20 @@ end
function selftest ()
-- Send a packet over the loopback device and check
-- that it is received correctly.
- -- XXX Beware of a race condition with unrelated traffic over the
- -- loopback device.
local datagram = require("lib.protocol.datagram")
local ethernet = require("lib.protocol.ethernet")
local ipv6 = require("lib.protocol.ipv6")
-
- -- Initialize RawSocket.
- local lo = RawSocket:new("lo")
- lo.input, lo.output = {}, {}
- lo.input.rx, lo.output.tx = link.new("test1"), link.new("test2")
+ local Match = require("apps.test.match").Match
+
+ -- Initialize RawSocket and Match.
+ local c = config.new()
+ config.app(c, "lo", RawSocket, "lo")
+ config.app(c, "match", Match, {fuzzy=true})
+ config.link(c, "lo.tx->match.rx")
+ engine.configure(c)
+ local link_in, link_cmp = link.new("test_in"), link.new("test_cmp")
+ engine.app_table.lo.input.rx = link_in
+ engine.app_table.match.input.comparator = link_cmp
-- Construct packet.
local dg_tx = datagram:new()
local src = ethernet:pton("02:00:00:00:00:01")
@@ -99,22 +103,16 @@ function selftest ()
dst = localhost,
next_header = 59, -- No next header.
hop_limit = 1}))
- dg_tx:push(ethernet:new({src = src,
- dst = dst,
+ dg_tx:push(ethernet:new({src = src,
+ dst = dst,
type = 0x86dd}))
- -- Transmit packet.
- link.transmit(lo.input.rx, dg_tx:packet())
- lo:push()
- -- Receive packet.
- lo:pull()
- local dg_rx = datagram:new(link.receive(lo.output.tx), ethernet)
- -- Assert packet was received OK.
- assert(dg_rx:parse({{ethernet, function(eth)
- return(eth:src_eq(src) and eth:dst_eq(dst) and eth:type() == 0x86dd)
- end }, { ipv6, function(ipv6)
- return(ipv6:src_eq(localhost) and ipv6:dst_eq(localhost))
- end } }), "loopback test failed")
- lo:stop()
+ -- Transmit packets.
+ link.transmit(link_in, dg_tx:packet())
+ link.transmit(link_cmp, packet.clone(dg_tx:packet()))
+ engine.app_table.lo:push()
+ -- Run engine.
+ engine.main({duration = 0.01, report = {showapps=true,showlinks=true}})
+ assert(#engine.app_table.match:errors() == 0)
print("selftest passed")
-- XXX Another useful test would be to feed a pcap file with
For me, working from the latest release backwards, using RawSocket I don't get correct execution until the snabb-2016.02 release. For all releases after February, the packet throughput is minimal for my application.
Just some figures to illustrate. My application is listening on eth0 using a RawSocket and piping the packets to the input of my application. When it works correctly I get just over 1,000 packets in a minute on this test server. You can see all is well in the February release, but not from March onwards:
snabb-2016.02
Main Report: link report: 1,038 sent on interface.tx -> amqp_app.input (loss rate: 0%)
snabb-2016.03
Main Report: link report: 0 sent on interface.tx -> amqp_app.input (loss rate: 0%)
snabb-2016.04
Main Report: link report: 0 sent on interface.tx -> amqp_app.input (loss rate: 0%)
snabb-2016.04.1
Main Report: link report: 15 sent on interface.tx -> amqp_app.input (loss rate: 0%)
snabb-2016.05
Main Report: link report: 15 sent on interface.tx -> amqp_app.input (loss rate: 0%)
I can't get example_spray to run on eth0. Maybe I'm doing something wrong. This is how I try to run it:
(v2016.02) $ sudo ./snabb example_spray eth0 /tmp/output.pcap
lib/pcap/pcap.lua:56: Unable to open file: eth0
stack traceback:
core/main.lua:126: in function <core/main.lua:124>
[C]: in function 'error'
lib/pcap/pcap.lua:56: in function 'records'
apps/pcap/pcap.lua:13: in function 'new'
It works though on a tap interface.
$ sudo ip tuntap add tap0 mode tap
$ sudo ./snabb example_spray tap0 /tmp/output.pcap
link report:
21 sent on capture.output -> spray_app.input (loss rate: 0%)
10 sent on spray_app.output -> output_file.input (loss rate: 0%)
Any hints?
So I changed the standard example_spray code to use a RawSocket instead of a PcapReader, since that matches what my real application does. Below, I've commented out the PcapReader and used a RawSocket instead. You can see the difference between the 2016.02 and 2016.03 release results.
module(..., package.seeall)

local pcap = require("apps.pcap.pcap")
local sprayer = require("program.example_spray.sprayer")
local raw = require("apps.socket.raw")

function run (parameters)
   if not (#parameters == 2) then
      print("Usage: example_spray

   local c = config.new()
   --config.app(c, "capture", pcap.PcapReader, input)
   config.app(c, "capture", raw.RawSocket, input)
   config.app(c, "spray_app", sprayer.Sprayer)
   config.app(c, "output_file", pcap.PcapWriter, output)

   config.link(c, "capture.tx -> spray_app.input")
   config.link(c, "spray_app.output -> output_file.input")

   engine.configure(c)
   engine.main({duration=60, report = {showlinks=true}})
end
test:~/Workspace/snabb-2016.02$ sudo src/snabb example_spray eth0 /tmp/test1.pcap
link report:
1,485 sent on capture.tx -> spray_app.input (loss rate: 0%)
742 sent on spray_app.output -> output_file.input (loss rate: 0%)
test:~/Workspace/snabb-2016.03$ sudo src/snabb example_spray eth0 /tmp/test5.pcap
link report:
0 sent on capture.tx -> spray_app.input (loss rate: 0%)
0 sent on spray_app.output -> output_file.input (loss rate: 0%)
I found out the reason for the "regression" between v2016.04 and v2016.05. It was introduced in #882, when max_packets was increased from 1e5 to 1e6. 1e5 gets better results in this case.
v2016.04
(v2016.04) $ sudo ./snabb snsh -p example_spray v4v6.pcap /tmp/output.pcap
link report:
40,000 sent on capture.output -> spray_app.input (loss rate: 0%)
20,000 sent on spray_app.output -> output_file.input (loss rate: 0%)
v2016.05
(v2016.05) $ sudo ./snabb snsh -p example_spray v4v6.pcap /tmp/output.pcap
link report:
15,300 sent on capture.output -> spray_app.input (loss rate: 0%)
7,650 sent on spray_app.output -> output_file.input (loss rate: 0%)
v4v6.pcap is a mix of IPv4 and IPv6 packets. It contains 40K packets. The file can be downloaded here: http://people.igalia.com/dpino/v4v6.pcap
I also increased the duration of example_spray to 10 seconds, to give the script time to process the whole file.
Back in October I modified the example_spray program to use a RawSocket as the input rather than the PcapReader. All worked well, but when I run the exact same code with the latest version of snabb switch, I get an initial burst of packets, then none after that. See below for output. I have the program running in a loop and outputting the stats every 60 seconds.
October version of snabb switch - you can see the packet counts increasing every minute.
test:/tmp$ sudo ./snabb example_spray eth0 /tmp/test3.pcap
Main Report:
link report:
1,583 sent on capture.tx -> spray_app.input (loss rate: 0%)
791 sent on spray_app.output -> output_file.input (loss rate: 0%)
Main Report:
link report:
3,106 sent on capture.tx -> spray_app.input (loss rate: 0%)
1,553 sent on spray_app.output -> output_file.input (loss rate: 0%)
Main Report:
link report:
4,451 sent on capture.tx -> spray_app.input (loss rate: 0%)
2,225 sent on spray_app.output -> output_file.input (loss rate: 0%)
Latest version of snabb switch - you can see the counts do not increase and are smaller than above.
test:/tmp$ sudo ./snabb example_spray eth0 /tmp/test4.pcap
Main Report:
link report:
27 sent on capture.tx -> spray_app.input (loss rate: 0%)
13 sent on spray_app.output -> output_file.input (loss rate: 0%)
Main Report:
link report:
27 sent on capture.tx -> spray_app.input (loss rate: 0%)
13 sent on spray_app.output -> output_file.input (loss rate: 0%)
Main Report:
link report:
27 sent on capture.tx -> spray_app.input (loss rate: 0%)
13 sent on spray_app.output -> output_file.input (loss rate: 0%)
Has the way I'm supposed to use RawSocket changed? I can see it was rewritten to use ljsyscall in November.
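Since the reports above are cumulative, the stall is easier to spot as per-interval increases. Here is a small sketch that prints them. The name `sent_deltas` is invented for this example and is not part of Snabb, and it assumes each link appears on its own line with the count as the first field.

```shell
# Hypothetical helper; not part of Snabb.
# Reads successive link reports on stdin and prints, for link "$1",
# how much the cumulative "sent" count grew between consecutive reports.
sent_deltas() {
    grep -F "sent on $1 " |
        awk '{ gsub(",", "", $1); if (NR > 1) print $1 - prev; prev = $1 }'
}
```

Fed the capture.tx lines from the October run (1,583, then 3,106, then 4,451) it prints 1523 and 1345. Fed the lines from the latest run (27 three times) it prints 0 and 0, which is the stall described above.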