I used https://ghz.sh to assess the performance of nameko-grpc against the official gRPC implementation, and https://github.com/benfred/py-spy to inspect where the service was spending its time.
The test targets a trivial unary-unary endpoint that does almost nothing in the service method, so the work on the server is almost exclusively the overhead of handling requests.
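For context, the service under test looks roughly like this. It's a minimal sketch modelled on the standard nameko-grpc example; the generated-module names and the reply field are assumptions, since only example.proto's call path appears in the ghz commands below:

```python
# Sketch of the trivial service under test (assumed to follow the standard
# nameko-grpc example; example_pb2/example_pb2_grpc are generated from the
# example.proto referenced in the ghz commands below).
from example_pb2 import ExampleReply
from example_pb2_grpc import exampleStub

from nameko_grpc.entrypoint import Grpc

grpc = Grpc.implementing(exampleStub)


class ExampleService:
    name = "example"

    @grpc
    def unary_unary(self, request, context):
        # Deliberately does almost nothing, so the benchmark measures
        # request-handling overhead rather than application work.
        return ExampleReply(message=request.value)
```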
Results below, but TL;DR:
As of 1.2.0rc, nameko-grpc was about 15x slower than the official Python gRPC implementation
An enormous amount of time was spent re-generating the Inspector object on every request 🙈
Adding a trivial cache (sketched below) improves throughput by a factor of ~4.6, from ~287 to ~1327 requests/sec, bringing nameko-grpc to roughly 3x slower than the official implementation
Two scripts used to run the services under test are included in this PR.
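The fix itself is small. Here is a minimal sketch of the caching approach, assuming the Inspector lives in nameko_grpc.inspection and can be keyed by the stub; the exact call site in this PR's diff may differ:

```python
from functools import lru_cache

from nameko_grpc.inspection import Inspector  # assumed import path


@lru_cache(maxsize=None)
def cached_inspector(stub):
    # Constructing an Inspector walks the generated stub to work out method
    # cardinalities and serializers, which is expensive; memoising it per
    # stub means the work happens once instead of on every request.
    return Inspector(stub)
```

Because a given entrypoint serves a single stub for its whole lifetime, the cache is effectively a one-entry memo: the Inspector is built on the first request and reused thereafter.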
Official gRPC server:
╰─ ghz -n 10000 --insecure --proto ./example.proto --call nameko.example.unary_unary -d '{"value": "A"}' localhost:50052
Summary:
Count: 10000
Total: 2.42 s
Slowest: 23.14 ms
Fastest: 6.23 ms
Average: 12.06 ms
Requests/sec: 4124.18
Response time histogram:
6.231 [1] |
7.922 [8] |
9.613 [644] |∎∎∎∎∎∎∎
11.304 [3455] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
12.995 [3157] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
14.686 [1684] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
16.377 [759] |∎∎∎∎∎∎∎∎∎
18.068 [184] |∎∎
19.759 [29] |
21.450 [64] |∎
23.141 [15] |
Latency distribution:
10 % in 9.80 ms
25 % in 10.58 ms
50 % in 11.71 ms
75 % in 13.18 ms
90 % in 14.77 ms
95 % in 15.62 ms
99 % in 18.22 ms
Status code distribution:
[OK] 10000 responses
Nameko, without the changes in this PR:
╰─ ghz -n 10000 --insecure --proto ./example.proto --call nameko.example.unary_unary -d '{"value": "A"}' localhost:50051
Summary:
Count: 10000
Total: 34.89 s
Slowest: 410.23 ms
Fastest: 90.82 ms
Average: 174.09 ms
Requests/sec: 286.59
Response time histogram:
90.822 [1] |
122.762 [245] |∎∎∎
154.703 [3831] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
186.643 [1613] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
218.584 [3244] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
250.524 [736] |∎∎∎∎∎∎∎∎
282.465 [199] |∎∎
314.405 [51] |∎
346.346 [50] |∎
378.286 [18] |
410.227 [12] |
Latency distribution:
10 % in 126.87 ms
25 % in 135.98 ms
50 % in 181.42 ms
75 % in 199.73 ms
90 % in 220.07 ms
95 % in 236.44 ms
99 % in 297.01 ms
Status code distribution:
[OK] 10000 responses
Nameko, with the cached Inspector from this PR:
╰─ ghz -n 10000 --insecure --proto ./example.proto --call nameko.example.unary_unary -d '{"value": "A"}' localhost:50051
Summary:
Count: 10000
Total: 7.54 s
Slowest: 67.94 ms
Fastest: 15.51 ms
Average: 37.41 ms
Requests/sec: 1326.62
Response time histogram:
15.506 [1] |
20.750 [12] |
25.993 [98] |∎
31.237 [81] |∎
36.480 [4513] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
41.724 [4449] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
46.967 [459] |∎∎∎∎
52.211 [280] |∎∎
57.454 [49] |
62.698 [0] |
67.941 [58] |∎
Latency distribution:
10 % in 34.13 ms
25 % in 35.41 ms
50 % in 36.66 ms
75 % in 38.55 ms
90 % in 41.04 ms
95 % in 46.55 ms
99 % in 52.53 ms
Status code distribution:
[OK] 10000 responses