Closed Bigwen-1 closed 1 year ago
IMO, grpc-transcode involves full protocol conversion plus encoding and decoding of the body, which can degrade performance. cc @spacewander @tokers Do you have any ideas?
I have a question: using grpc-transcode means the request path is HTTP(S) -> APISIX -> gRPC server, which is different from a gRPC client requesting the gRPC server directly. Is that the comparison behind the QPS figures?
First of all, protocol conversion takes some time, which is unavoidable. There is a trade-off.
What we can do is improve the implementation of gRPC transcoding and make it faster. I don't have any further insights yet; we could inspect it with a flame graph and see whether the hot code path can be optimized, or even change the way gRPC transcoding is implemented (e.g., using FFI).
I tried to analyze the transcoding performance with a flame graph, but perf always failed to obtain stack information. Opening the SVG displays: ERROR: No Valid Input provided to flamegraph.pl. Can you give me a hand?
These are the steps I ran:
root@xxali-ecs-xxx-xx-xx-xx:/data/server/FlameGraph/FlameGraph# ls
aix-perf.pl files.pl record-test.sh stackcollapse-jstack.pl stackcollapse-vsprof.pl
demos flamegraph.pl stackcollapse-aix.pl stackcollapse-ljp.awk stackcollapse-vtune.pl
dev jmaps stackcollapse-bpftrace.pl stackcollapse-perf.pl stackcollapse-wcp.pl
difffolded.pl perf.data stackcollapse-chrome-tracing.py stackcollapse-perf-sched.awk stackcollapse-xdebug.php
docs perf.data.old stackcollapse-elfutils.pl stackcollapse.pl test
example-dtrace-stacks.txt pkgsplit-perf.pl stackcollapse-gdb.pl stackcollapse-pmc.pl test.sh
example-dtrace.svg pmCount.svg stackcollapse-go.pl stackcollapse-recursive.pl
example-perf-stacks.txt.gz range-perf.pl stackcollapse-instruments.pl stackcollapse-sample.awk
example-perf.svg README.md stackcollapse-java-exceptions.pl stackcollapse-stap.pl
root@xxali-ecs-xxx-xx-xx-xx:/data/server/FlameGraph/FlameGraph# perf record -F 99 -a -g -p 10853 -- sleep 30
Warning: PID/TID switch overriding SYSTEM
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data ]
root@xxali-ecs-xxx-xx-xx-xx:/data/server/FlameGraph/FlameGraph# perf script > out.perf
root@xxali-ecs-xxx-xx-xx-xx:/data/server/FlameGraph/FlameGraph# ./stackcollapse-perf.pl out.perf > out.folded
root@xxali-ecs-xxx-xx-xx-xx:/data/server/FlameGraph/FlameGraph# ./flamegraph.pl out.folded > pmCount.svg
ERROR: No stack counts found
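One way to narrow down "ERROR: No stack counts found" is to check whether out.folded contains any samples at all before it is passed to flamegraph.pl. The following is only a minimal sketch, assuming the folded format written by stackcollapse-perf.pl (one line per stack: semicolon-separated frames, a space, then an integer sample count). The function name count_folded_samples is hypothetical and not part of the FlameGraph tools.

```python
def count_folded_samples(lines):
    """Return (stacks, samples) for an iterable of folded-stack lines.

    Assumed line format: "frame1;frame2;frame3 <count>".
    Lines that do not match this shape are ignored.
    """
    stacks = 0
    samples = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field on the line.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # not a "frames count" line
        stacks += 1
        samples += int(count)
    return stacks, samples
```

For example, `count_folded_samples(open("out.folded"))` can be run on the file from the transcript above. If it reports zero samples, the recording step itself probably captured nothing useful (the very small perf.data would be consistent with that), so the place to look is the perf record invocation rather than flamegraph.pl.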
Here is the process for generating a flame graph in CI: https://github.com/apache/apisix/blob/master/.github/workflows/performance.yml
Here is the corresponding script: https://github.com/apache/apisix/blob/master/ci/performance_test.sh
You can check against the script to see if you've missed some steps.
ERROR: No stack counts found
It's likely that the worker process you attached perf to is not the one handling the requests. You can limit the number of workers to 1.
To do so, override the setting at https://github.com/apache/apisix/blob/cffa4b69f9080707d100fbe856f251054cc8cba4/conf/config-default.yaml#L165 in config.yaml.
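If limiting the workers is not convenient, another option is to find out which worker is actually busy during the load test and attach perf to that PID. The following is a rough sketch under several assumptions: the worker processes show up as "nginx: worker process" in the process list, and `ps -eo pid,pcpu,args` is available (Linux procps). The helper names busiest_worker and list_processes are hypothetical. Note that the %CPU column of ps is an average over the process lifetime, so the result is only a hint.

```python
import subprocess


def list_processes():
    """Return the raw text of `ps -eo pid,pcpu,args` (assumes Linux procps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def busiest_worker(ps_output, marker="nginx: worker process"):
    """Return (pid, cpu_percent) of the busiest matching process, or None."""
    best = None
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # pid, %cpu, full command line
        if len(parts) < 3 or marker not in parts[2]:
            continue
        try:
            pid, cpu = int(parts[0]), float(parts[1])
        except ValueError:
            continue  # header or malformed line
        if best is None or cpu > best[1]:
            best = (pid, cpu)
    return best
```

Calling `busiest_worker(list_processes())` while the load test is running returns a candidate PID to pass to `perf record -p`. Setting the number of workers to 1, as suggested above, avoids the guesswork entirely.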
Hi, sorry to bother you again. I still have not managed to draw the flame graph. https://github.com/apache/apisix/blob/master/.github/workflows/performance.yml appears to run as a job on every commit. Since I don't know how to run the whole yml, I split it up and executed each step separately.
I executed 4 steps in total.
Except for the third step, the other three steps reported errors. The log is as follows. The generated flamegraph.svg displays: ERROR: No Valid input provided to flamegraph.pl.
In addition, I ran a concurrent load test against grpc-transcode myself and executed:
/usr/local/stapxx/samples/lj-lua-stacks.sxx --arg time=30 --skip-badvars -x $(pgrep -P $(cat logs/nginx.pid) -n -f worker) > /tmp/tmp.bt
Response: /usr/bin/env: 'stap++': No such file or directory
Here are the logs of the four steps:
2. sudo ./ci/performance_test.sh install_wrk2
.....
AR libluajit.a
ar: `u' modifier ignored since `D' is the default (see `U')
CC luajit.o
BUILDVM jit/vmdef.lua
LINK luajit
OK Successfully built LuaJIT
make[1]: Leaving directory '/data/server/wrk2/deps/luajit/src'
CC src/wrk.c
src/wrk.c: In function ‘response_complete’:
src/wrk.c:529:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘int64_t {aka long int}’ [-Wformat=]
printf(" expected_latency_timing = %lld\n", expected_latency_timing);
^
src/wrk.c:530:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" now = %lld\n", now);
^
src/wrk.c:531:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" expected_latency_start = %lld\n", expected_latency_start);
^
src/wrk.c:532:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" c->thread_start = %lld\n", c->thread_start);
^
src/wrk.c:533:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" c->complete = %lld\n", c->complete);
^
src/wrk.c:535:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" latest_should_send_time = %lld\n", c->latest_should_send_time);
^
src/wrk.c:536:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" latest_expected_start = %lld\n", c->latest_expected_start);
^
src/wrk.c:537:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" latest_connect = %lld\n", c->latest_connect);
^
src/wrk.c:538:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" latest_write = %lld\n", c->latest_write);
^
src/wrk.c:542:16: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘uint64_t {aka long unsigned int}’ [-Wformat=]
printf(" next expected_latency_start = %lld\n", expected_latency_start);
^
src/wrk.c: At top level:
src/wrk.c:834:13: warning: ‘print_stats_latency’ defined but not used [-Wunused-function]
static void print_stats_latency(stats *stats) {
^
CC src/net.c
CC src/ssl.c
CC src/aprintf.c
CC src/stats.c
CC src/script.c
CC src/units.c
CC src/ae.c
CC src/zmalloc.c
CC src/http_parser.c
CC src/tinymt64.c
CC src/hdr_histogram.c
LUAJIT src/wrk.lua
LINK wrk
3. sudo ./ci/performance_test.sh install_stap_tools: success
4. ./ci/performance_test.sh run_performance_test
/local/openresty-debug/bin:/usr/local/openresty-debug/luajit/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
/usr/local/stapxx/samples/lj-lua-stacks.sxx --arg time=30 --skip-badvars -x 9056
Found exact match for libluajit: /usr/local/openresty/luajit/lib/libluajit-5.1.so.2.1.0
WARNING: cannot find module /usr/local/openresty/nginx/sbin/nginx debuginfo: No DWARF information found [man warning::debuginfo]
WARNING: cannot find module /usr/local/openresty/luajit/lib/libluajit-5.1.so.2.1.0 debuginfo: No DWARF information found [man warning::debuginfo]
semantic error: type definition 'lua_State' not found in '/usr/local/openresty/luajit/lib/libluajit-5.1.so.2.1.0': operator '@cast' at stapxx-4Sjl3xCx/luajit.stp:174:12
source: return @cast(L, "lua_State", "/usr/local/openresty/luajit/lib/libluajit-5.1.so.2.1.0")->glref->ptr64
^
semantic error: unable to find global 'ngx_cycle' in /usr/local/openresty/nginx/sbin/nginx: operator '@var' at stapxx-4Sjl3xCx/nginx.lua.stp:9:43
source: return ngx_cycle_get_module_main_conf(@var("ngx_cycle", "/usr/local/openresty/nginx/sbin/nginx"),
^
semantic error: unable to find global 'ngx_cycle' in /usr/local/openresty/nginx/sbin/nginx: operator '@var' at :9:43
source: return ngx_cycle_get_module_main_conf(@var("ngx_cycle", "/usr/local/openresty/nginx/sbin/nginx"),
^
semantic error: unresolved type : identifier 'gco' at stapxx-4Sjl3xCx/luajit.stp:461:5
source: gco = @cast(pt, "GCproto", "/usr/local/openresty/luajit/lib/libluajit-5.1.so.2.1.0")->chunkname->gcptr64
^
Pass 2: analysis failed. [man error::pass2]
Number of similar error messages suppressed: 136.
Number of similar warning messages suppressed: 13104.
Rerun with -v to see them.
It looks like you downloaded the wrong package type.
That script is for the CI environment and may not apply to your server environment.
You can refer to this script for the steps, but you need to ensure each step is successful.
ref: https://anjia0532.github.io/2017/09/12/stap/ https://moonbingbing.gitbooks.io/openresty-best-practices/content/flame_graph/install.html
This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@apisix.apache.org list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
Description
Hi, we plan to use the grpc-transcode plugin in our production environment and have been load testing it. Compared with a gRPC client requesting the gRPC server directly, the QPS through grpc-transcode is more than 5 times lower. Can this be improved? Or is there a way to do tracing analysis on APISIX grpc-transcode?
Load test conclusion:
Environment
apisix version: 2.14.1
uname -a: 101-Ubuntu SMP Fri
openresty -V or nginx -V: nginx version: openresty/1.21.4.1
curl http://127.0.0.1:9090/v1/server_info: 3.4.0
luarocks --version: