Open dujiang426 opened 3 years ago
1500+/s sounds much!
would you compare the amount of data(requests) being captured by two versions?
try some buffer tweaks like -input-raw-buffer-size 32mb
, input-raw-buffer-timeout 1m
@dujiang426 it will better if you provide metrics of the server processing that loads and the specification of the machine.
1500+/s sounds much! would you compare the amount of data(requests) being captured by two versions? try some buffer tweaks like
-input-raw-buffer-size 32mb
,input-raw-buffer-timeout 1m
I use the same environment and stress test script, the request is very simple, there is no other traffic interference
@dujiang426 I meant the number of requests/data captured by v1.1.0 versus master branch? Usually, the amount of data being processed is proportional to CPU/memory usage.
That's defo not expected to have such CPU difference. We will investigate asap. Can you give exact goreplay command you use, and tell us about traffic pattern you have. Is it mostly a GET requests? Maybe a few example requests?
Thanks!
@urbanishimwe @buger below is my execution environment and related information.
compiler environment
goreplay master 40a3c51c9c052afa7927aaebb383f1108a5d6bfd
go version go1.15.3 linux/amd64
libpcap 1.9.1
go build -o gor -ldflags "-s -w"
tested application environment
cpu 4c, mem 8g
gor start command
gor --input-raw :8080 --http-allow-url /gor --output-kafka-host '192.168.0.1:9092' --output-kafka-topic 'kafka-log' --http-set-header 'key:value'
jmeter script
Number of Threads(users): 100
Ramp-up period(seconds): 1
Loop Count: 5000
$curl http://127.0.0.1:8080/gor/test
success
my self-compiled version before the start
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
45159 root 20 0 958084 28168 18576 S 0.7 0.3 0:00.44 gor
after the start
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
45159 root 20 0 1166616 38960 20752 S 33.3 0.5 0:26.18 gor
jmeter output
Starting standalone test @ Thu Dec 03 11:40:36 CST 2020 (1606966836657)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
summary + 31895 in 00:00:23 = 1389.6/s Avg: 70 Min: 56 Max: 739 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary + 43302 in 00:00:30 = 1443.4/s Avg: 69 Min: 57 Max: 363 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 75197 in 00:00:53 = 1420.1/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43449 in 00:00:30 = 1448.3/s Avg: 69 Min: 57 Max: 363 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 118646 in 00:01:23 = 1430.3/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43007 in 00:00:30 = 1433.6/s Avg: 69 Min: 57 Max: 379 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 161653 in 00:01:53 = 1431.2/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43270 in 00:00:30 = 1442.3/s Avg: 69 Min: 57 Max: 346 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 204923 in 00:02:23 = 1433.5/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43278 in 00:00:30 = 1442.6/s Avg: 69 Min: 56 Max: 395 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 248201 in 00:02:53 = 1435.1/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43754 in 00:00:30 = 1458.5/s Avg: 68 Min: 57 Max: 358 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 291955 in 00:03:23 = 1438.5/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 44050 in 00:00:30 = 1468.3/s Avg: 68 Min: 56 Max: 361 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 336005 in 00:03:53 = 1442.4/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43776 in 00:00:30 = 1459.2/s Avg: 68 Min: 57 Max: 359 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 379781 in 00:04:23 = 1444.3/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 43263 in 00:00:30 = 1442.1/s Avg: 69 Min: 57 Max: 375 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 423044 in 00:04:53 = 1444.1/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 42337 in 00:00:30 = 1411.2/s Avg: 70 Min: 57 Max: 367 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 465381 in 00:05:23 = 1441.0/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 33501 in 00:00:30 = 1116.7/s Avg: 69 Min: 56 Max: 319 Err: 0 (0.00%) Active: 23 Started: 100 Finished: 77
summary = 498882 in 00:05:53 = 1413.5/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
summary + 1118 in 00:00:11 = 104.0/s Avg: 70 Min: 60 Max: 196 Err: 0 (0.00%) Active: 0 Started: 100 Finished: 100
summary = 500000 in 00:06:04 = 1374.7/s Avg: 69 Min: 56 Max: 739 Err: 0 (0.00%)
Tidying up ... @ Thu Dec 03 11:46:40 CST 2020 (1606967200754)
... end of run
eventually capture traffic:498477
1.1.0 version before the start
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
51664 root 20 0 1478648 21424 11924 S 0.3 0.3 0:00.10 gor
after the start
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
51664 root 20 0 1682500 37596 15700 S 5.0 0.4 0:14.26 gor
jmeter output
Starting standalone test @ Thu Dec 03 11:53:24 CST 2020 (1606967604944)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
summary + 5636 in 00:00:05 = 1216.5/s Avg: 71 Min: 58 Max: 618 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary + 44133 in 00:00:30 = 1471.1/s Avg: 67 Min: 57 Max: 382 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 49769 in 00:00:35 = 1437.0/s Avg: 68 Min: 57 Max: 618 Err: 0 (0.00%)
summary + 44128 in 00:00:30 = 1470.8/s Avg: 67 Min: 56 Max: 338 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 93897 in 00:01:05 = 1452.7/s Avg: 68 Min: 56 Max: 618 Err: 0 (0.00%)
summary + 43881 in 00:00:30 = 1462.7/s Avg: 68 Min: 56 Max: 351 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 137778 in 00:01:35 = 1455.9/s Avg: 68 Min: 56 Max: 618 Err: 0 (0.00%)
summary + 43009 in 00:00:30 = 1433.7/s Avg: 69 Min: 57 Max: 361 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 180787 in 00:02:05 = 1450.6/s Avg: 68 Min: 56 Max: 618 Err: 0 (0.00%)
summary + 42602 in 00:00:30 = 1420.1/s Avg: 70 Min: 57 Max: 1120 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 223389 in 00:02:35 = 1444.6/s Avg: 68 Min: 56 Max: 1120 Err: 0 (0.00%)
summary + 42854 in 00:00:30 = 1428.4/s Avg: 69 Min: 57 Max: 293 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 266243 in 00:03:05 = 1442.0/s Avg: 69 Min: 56 Max: 1120 Err: 0 (0.00%)
summary + 43700 in 00:00:30 = 1456.7/s Avg: 68 Min: 57 Max: 1067 Err: 1 (0.00%) Active: 99 Started: 100 Finished: 1
summary = 309943 in 00:03:35 = 1444.1/s Avg: 69 Min: 56 Max: 1120 Err: 1 (0.00%)
summary + 42938 in 00:00:30 = 1431.3/s Avg: 69 Min: 57 Max: 347 Err: 0 (0.00%) Active: 99 Started: 100 Finished: 1
summary = 352881 in 00:04:05 = 1442.5/s Avg: 69 Min: 56 Max: 1120 Err: 1 (0.00%)
summary + 43322 in 00:00:30 = 1444.1/s Avg: 68 Min: 57 Max: 341 Err: 0 (0.00%) Active: 99 Started: 100 Finished: 1
summary = 396203 in 00:04:35 = 1442.7/s Avg: 68 Min: 56 Max: 1120 Err: 1 (0.00%)
summary + 43448 in 00:00:30 = 1448.3/s Avg: 68 Min: 56 Max: 368 Err: 0 (0.00%) Active: 99 Started: 100 Finished: 1
summary = 439651 in 00:05:05 = 1443.2/s Avg: 68 Min: 56 Max: 1120 Err: 1 (0.00%)
summary + 43550 in 00:00:30 = 1451.7/s Avg: 67 Min: 57 Max: 360 Err: 0 (0.00%) Active: 89 Started: 100 Finished: 11
summary = 483201 in 00:05:35 = 1444.0/s Avg: 68 Min: 56 Max: 1120 Err: 1 (0.00%)
summary + 14800 in 00:00:24 = 626.5/s Avg: 68 Min: 57 Max: 235 Err: 0 (0.00%) Active: 0 Started: 100 Finished: 100
summary = 498001 in 00:05:58 = 1390.1/s Avg: 68 Min: 56 Max: 1120 Err: 1 (0.00%)
Tidying up ... @ Thu Dec 03 11:59:23 CST 2020 (1606967963625)
... end of run
eventually capture traffic:497787
this is very unexpected, for such a low amount of data.
Hi!
@dujiang426 can you also specify your Linux version (distribution and kernel version)?
Thank you!
my gor compiler environment
Linux version 3.10.0-327.ali2018.alios7.x86_64 (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Jul 23 15:08:28 CST 2018
test application environment
Linux version 4.9.151-015.ali3000.alios7.x86_64 (gcc version 6.4.0 20170704 (Red Hat 6.4.0-1) (GCC) ) #1 SMP Tue Mar 12 19:10:26 CST 2019
The CPU usage of the version I compiled based on the latest code of master is about 4 times higher than the previous version 1.1.0
My libpcap is using version 1.9.1, the qps of the application is 1500+/s, the CPU occupancy of version 1.10 is about 10%, and the version I compiled is about 40%