As a reference, here is a simple performance comparison of the same gRPC server interacting with the same gRPC client. Both tests use grpcurl as the client: the first sends a single message per stream (one grpcurl call per request, similar to what we observe from the Envoy filter), while the second streams many messages over a single call.
grpcurl -max-time "0.1" -plaintext -d '
{
"requestHeaders": {
"attributes": {},
"headers": {
"headers": [
{
"key": ":authority",
"value": "'localhost:9090'"
},
{
"key": ":path",
"value": "/example"
},
{
"key": ":scheme",
"value": "http"
},
{
"key": ":method",
"value": "GET"
},
{
"key": "User-Agent",
"value": "'bot'"
}
]
}
}
}
' "localhost:9090" envoy.service.ext_proc.v3.ExternalProcessor.Process
rm *.log
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health01.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health02.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health03.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health04.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health05.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health06.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health07.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health08.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health09.log" &
timeout 200 bash -c "seq 1000000000 | xargs -I -- ./scripts/grpc-health.sh >> grpc-health10.log" &
seq 30 | xargs -I -- bash -c "sleep 10 && cat grpc-health*.log | grep PROCESSED | wc -l && date"
msgs | date |
---|---|
2274 | Wed 19 Jun 2024 15:19:49 AEST |
4540 | Wed 19 Jun 2024 15:19:59 AEST |
6713 | Wed 19 Jun 2024 15:20:09 AEST |
8837 | Wed 19 Jun 2024 15:20:19 AEST |
10784 | Wed 19 Jun 2024 15:20:29 AEST |
12799 | Wed 19 Jun 2024 15:20:39 AEST |
14495 | Wed 19 Jun 2024 15:20:49 AEST |
16399 | Wed 19 Jun 2024 15:20:59 AEST |
18285 | Wed 19 Jun 2024 15:21:10 AEST |
20096 | Wed 19 Jun 2024 15:21:20 AEST |
22173 | Wed 19 Jun 2024 15:21:30 AEST |
24303 | Wed 19 Jun 2024 15:21:40 AEST |
26251 | Wed 19 Jun 2024 15:21:50 AEST |
28175 | Wed 19 Jun 2024 15:22:01 AEST |
29851 | Wed 19 Jun 2024 15:22:11 AEST |
31919 | Wed 19 Jun 2024 15:22:21 AEST |
33756 | Wed 19 Jun 2024 15:22:31 AEST |
35807 | Wed 19 Jun 2024 15:22:42 AEST |
37931 | Wed 19 Jun 2024 15:22:52 AEST |
39311 | Wed 19 Jun 2024 15:23:02 AEST |
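The per-interval deltas in the next table follow directly from these samples. A quick hypothetical helper (not part of the issue, assuming the table above is saved verbatim to `samples.txt`) reproduces them:

```bash
# Skip the two header rows, then print the difference between consecutive counts.
awk -F'|' 'NR > 2 { n = $1 + 0; if (prev != "") print n - prev; prev = n }' samples.txt
```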
msgs | diff | date |
---|---|---|
2274 | | 15:19:49 |
4540 | 2266 | 15:19:59 |
6713 | 2173 | 15:20:09 |
8837 | 2124 | 15:20:19 |
10784 | 1947 | 15:20:29 |
12799 | 2015 | 15:20:39 |
14495 | 1696 | 15:20:49 |
16399 | 1904 | 15:20:59 |
18285 | 1886 | 15:21:10 |
20096 | 1811 | 15:21:20 |
22173 | 2077 | 15:21:30 |
24303 | 2130 | 15:21:40 |
26251 | 1948 | 15:21:50 |
28175 | 1924 | 15:22:01 |
29851 | 1676 | 15:22:11 |
31919 | 2068 | 15:22:21 |
33756 | 1837 | 15:22:31 |
35807 | 2051 | 15:22:42 |
37931 | 2124 | 15:22:52 |
39311 | 1380 | 15:23:02 |
total msgs | total duration |
---|---|
37037 | 00:03:13 |
msg/second |
---|
192 |
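As a sanity check on that rate (hypothetical one-liner, not part of the issue; the 00:03:13 window is 193 seconds):

```bash
# messages counted in the sampled window divided by the window length
echo "scale=1; (39311 - 2274) / 193" | bc   # prints 191.9, i.e. ~192 msg/s
```

The second test streams many messages over a single grpcurl call. The generator script below (presumably `scripts/grpcurl-load-messages.sh`, invoked further down) emits the same request JSON in an endless loop, one object per line: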
#!/usr/bin/env bash
# Emit the ext_proc request JSON in an endless loop, one object per line, for piping into grpcurl.
message='
{
"requestHeaders": {
"attributes": {},
"headers": {
"headers": [
{
"key": ":authority",
"value": "localhost:9090"
},
{
"key": ":path",
"value": "/example"
},
{
"key": ":scheme",
"value": "http"
},
{
"key": ":method",
"value": "GET"
},
{
"key": "User-Agent",
"value": "'bot'"
}
]
}
}
}
'
# Ensure each JSON object stays on a single line;
# this avoids parse errors on partial messages when the timeout cuts the stream.
line=$(echo "$message" | jq -c)
while true; do
echo "$line"
done
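Because the streaming test passes `-d @`, grpcurl reads its request messages from standard input, so every line the generator emits becomes another message on the same gRPC stream. The generator can be spot-checked first (hypothetical check, not part of the issue):

```bash
# Each emitted line should be one complete JSON object.
./scripts/grpcurl-load-messages.sh | head -n 3
```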
rm *.log
timeout 200 ./scripts/grpcurl-load-messages.sh | \
grpcurl -plaintext -d @ "localhost:9090" envoy.service.ext_proc.v3.ExternalProcessor.Process >> \
stream_result.log &
seq 30 | xargs -I -- bash -c "sleep 10 && cat stream_result.log | grep 'PROCESSED' | wc -l && date"
msgs | date |
---|---|
268251 | Wed 19 Jun 2024 15:51:45 AEST |
555624 | Wed 19 Jun 2024 15:51:57 AEST |
869601 | Wed 19 Jun 2024 15:52:10 AEST |
1210794 | Wed 19 Jun 2024 15:52:24 AEST |
1576059 | Wed 19 Jun 2024 15:52:40 AEST |
1983550 | Wed 19 Jun 2024 15:52:57 AEST |
2404755 | Wed 19 Jun 2024 15:53:16 AEST |
2879429 | Wed 19 Jun 2024 15:53:35 AEST |
3386229 | Wed 19 Jun 2024 15:53:58 AEST |
3918339 | Wed 19 Jun 2024 15:54:21 AEST |
4508434 | Wed 19 Jun 2024 15:54:48 AEST |
msgs | diff | date |
---|---|---|
268251 | | 15:51:45 |
555624 | 287373 | 15:51:57 |
869601 | 313977 | 15:52:10 |
1210794 | 341193 | 15:52:24 |
1576059 | 365265 | 15:52:40 |
1983550 | 407491 | 15:52:57 |
2404755 | 421205 | 15:53:16 |
2879429 | 474674 | 15:53:35 |
3386229 | 506800 | 15:53:58 |
3918339 | 532110 | 15:54:21 |
4508434 | 590095 | 15:54:48 |
total msgs | total duration | |
---|---|---|
4542905 | 00:03:03 | 122x more messages |
msg/second | |
---|---|
24825 | 129x faster! |
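The same check for the streaming run (00:03:03 is 183 seconds; hypothetical one-liners, not part of the issue):

```bash
echo "scale=1; 4542905 / 183" | bc   # prints 24824.6, i.e. ~24825 msg/s
echo "scale=1; 24825 / 192" | bc     # prints 129.2, i.e. ~129x the single-message rate
```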
I have tested different combinations of the number of client processes, but the results are quite similar: the streaming approach is much faster than the single-message-per-stream approach.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
Title: Envoy Filter Only Sends One Message Per Stream
Description: We are using Envoy filters.
That filter calls our gRPC server, which measures some metrics, including the number of messages per stream. When we load test our gRPC server locally (using grpcurl, for example), we see around 10k messages on a single stream during a 5-second load test. But when we measure the stream size of the actual gRPC calls coming from the Envoy filter, every stream has size 2, and the number of EOFs matches the number of calls to the gRPC server. So there are only two messages in each stream, one of which is the EOF.
It looks like these calls are not making the best use of gRPC streaming, which would let the same gRPC request carry multiple messages; instead, a new stream is created for each message. Using streaming properly could considerably reduce the latency spent on network protocol handshakes.