mfontanini / libtins

High-level, multiplatform C++ network packet sniffing and crafting library.
http://libtins.github.io/
BSD 2-Clause "Simplified" License

TCP stream capture and packet loss #351

Open yeahlow opened 5 years ago

yeahlow commented 5 years ago

Hello everyone. While coding with libtins, I ran into a problem with TCP stream capture and packet loss. The test scenario is as follows:

  1. Prepare two test hosts.

Server: IP 172.16.31.59, port 38685, OS Solaris 10 SPARC 64-bit.
Client: IP 172.16.31.134, OS CentOS 7.5.1804 64-bit.

A batch of TCP test messages (actually Diameter messages) is sent from the server to the client.

  2. The capture test program is deployed on the 172.16.31.134 host and captures all TCP messages on port 38685 on eth0 (the name of the 172.16.31.134 network interface).

  3. The capture test code is based on the libtins example 'stream_dump.cpp' (master, version 343). It is very simple: it captures the TCP stream directly and parses it into Diameter messages one by one.
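To make the "parse it into Diameter messages one by one" step concrete, here is a minimal sketch of how such framing could work. It is not the actual test code. It assumes the payload chunks delivered for one direction of the stream are appended to a byte buffer, and it relies only on the fixed 20-byte Diameter header from RFC 6733 (version in byte 0, 24-bit message length in bytes 1 to 3, R flag 0x80 in byte 4). The names `DiameterCounts` and `consume_diameter` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed Diameter header size in bytes (RFC 6733).
constexpr std::size_t kDiameterHeaderLen = 20;

// Hypothetical counters, named for illustration only.
struct DiameterCounts {
    std::size_t requests = 0;  // messages with the R flag set
    std::size_t answers = 0;   // messages with the R flag cleared
};

// Consumes every complete Diameter message at the front of `buffer` and
// updates `counts`. A trailing partial message stays in the buffer so the
// next chunk of stream data can complete it.
// Returns false if the bytes at the current position do not look like a
// Diameter header (for example after a gap in the captured stream).
bool consume_diameter(std::vector<std::uint8_t>& buffer, DiameterCounts& counts) {
    std::size_t offset = 0;
    bool framing_ok = true;
    while (buffer.size() - offset >= kDiameterHeaderLen) {
        const std::uint8_t* p = buffer.data() + offset;
        // Bytes 1..3 carry the total message length, header included.
        const std::size_t length = (static_cast<std::size_t>(p[1]) << 16) |
                                   (static_cast<std::size_t>(p[2]) << 8) |
                                   static_cast<std::size_t>(p[3]);
        if (p[0] != 1 || length < kDiameterHeaderLen) {
            framing_ok = false;  // not a valid header: framing was lost
            break;
        }
        if (buffer.size() - offset < length) {
            break;  // message is incomplete, wait for more data
        }
        // Byte 4 holds the command flags; the top bit marks a request.
        if (p[4] & 0x80) {
            ++counts.requests;
        } else {
            ++counts.answers;
        }
        offset += length;
    }
    // Drop the bytes of the messages that were fully consumed.
    buffer.erase(buffer.begin(),
                 buffer.begin() + static_cast<std::ptrdiff_t>(offset));
    return framing_ok;
}
```

With framing like this, a message split across two TCP segments is only counted once the second segment arrives, and a `false` return would indicate that bytes are missing from the reassembled stream.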

But during performance testing, I found a strange packet loss phenomenon:

  1. At a load of 500 messages per second (500 sent, 500 received), the statistics are normal, and the log shows only one new-connection record:

[+] New connection: 172.16.31.134:10399-172.16.31.59:38685

At the same time, both the captured stream data and the parsed packet data are accurate:

2019-05-21 09:41:48 Server Stream [172.16.31.59:38685]: 1324/0, Client Stream [172.16.31.134:10399]: 5010/0, Packet: 5010/5012/0, SNR: 5010, SNA: 5012/0

  2. When the load is increased to 1000 messages per second (1000 sent, 1000 received), packet loss becomes severe and strange:

First, the log contains many new-connection messages, and the server and client endpoints sometimes swap, as shown below (in my opinion, there should be only one connection, right?):

(09:56:05.220555 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:05.782502 [+] New connection : 172.16.31.59:38685 - 172.16.31.134:11011
(09:56:06.371284 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:06.977457 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:07.571982 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:08.161813 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:08.728867 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:09.322021 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:09.921321 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:10.509554 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:11.128241 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:11.700519 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:12.286710 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:12.784303 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:13.387077 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:13.987350 [+] New connection : 172.16.31.134:11011 - 172.16.31.59:38685
(09:56:14.558632 [+] New connection : 172.16.31.59:38685 - 172.16.31.134:11011
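All of these records involve the same two endpoints, only sometimes with client and server swapped. One way to confirm that in the test program would be to tally new-connection events under a direction-independent key. The following is a minimal sketch, not part of the actual test code; `Endpoint`, `ConnectionKey`, `canonical_key` and `ConnectionTally` are hypothetical names, and the endpoints are assumed to be available as an address string plus a port.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// An endpoint is an (address, port) pair; std::pair gives us ordering for free.
using Endpoint = std::pair<std::string, std::uint16_t>;
using ConnectionKey = std::pair<Endpoint, Endpoint>;

// Orders the two endpoints so that A-B and B-A map to the same key.
ConnectionKey canonical_key(const Endpoint& a, const Endpoint& b) {
    return (a < b) ? ConnectionKey{a, b} : ConnectionKey{b, a};
}

// Counts how often a "new connection" event was reported per endpoint pair.
class ConnectionTally {
public:
    // Records one event and returns the running count for that pair.
    std::size_t record(const Endpoint& client, const Endpoint& server) {
        return ++seen_[canonical_key(client, server)];
    }

    // Number of distinct endpoint pairs seen so far.
    std::size_t distinct() const { return seen_.size(); }

private:
    std::map<ConnectionKey, std::size_t> seen_;
};
```

If `distinct()` stays at 1 while `record()` keeps returning larger values, the same TCP connection is being reported as new over and over, rather than several separate connections being opened.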

Most importantly, the performance test results show serious packet loss: about 90% of the request messages are lost, while the reply messages are almost all captured (nearly 100%). For example:

2019-05-21 09:56:08 Server Stream [172.16.31.59:38685]: 2362/0, Client Stream [172.16.31.134:11011]: 1133/0, Packet: 924/9989/0, SNR: 924, SNA: 9989

The 924 above is the number of request messages (normally it should be 10,000), while 9989 is the number of reply messages (almost 100%).

One more thing: each request message (sent) is about 260 bytes and each reply message is about 156 bytes, so the message sizes are normal and none is too long.

Can you help me resolve this problem?