josepdcs / kubectl-prof

kubectl-prof is a kubectl plugin to profile applications on kubernetes with minimum overhead
Apache License 2.0

Using async-profiler to generate raw output for Java always fails with a "no such file or directory" error #36

Closed: fairchilddb closed this 2 months ago

fairchilddb commented 4 months ago

./kubectl-prof quickstart-es-default-0 -n default --pgrep Elasticsearch -e cpu -l java -t 500s -o raw --log-level debug
Default profiling tool async-profiler will be used ... ✔
Verifying target pod ... ✔
Launching profiler ... ✔
Profiling ... ✔
Error: open /tmp/agent-raw-2037586-1.txt: no such file or directory ❌

{"type":"log","data":{"time":"2024-04-25T08:39:05.102930398Z","level":"debug","msg":"{\"Duration\":500000000000,\"Interval\":500000000000,\"UID\":\"d4e71b1e-3537-402d-a4bd-57981c1aeb3e\",\"ContainerRuntime\":\"containerd\",\"ContainerRuntimePath\":\"/run/containerd\",\"ContainerID\":\"2661c207673062ac9b40389fb8d25fbc00f1e3a1cbb12f3157f37bc5ac5bad1c\",\"PodUID\":\"d37565e8-e463-4f9d-b43a-31f1fe68aaa2\",\"Language\":\"java\",\"Event\":\"cpu\",\"Compressor\":\"gzip\",\"Tool\":\"async-profiler\",\"OutputType\":\"raw\",\"FileName\":\"\",\"HeapDumpSplitInChunkSize\":\"\",\"PID\":\"\",\"Pgrep\":\"Elasticsearch\",\"AdditionalArguments\":null,\"Iteration\":0}"}}
{"type":"progress","data":{"time":"2024-04-25T08:39:05.103293142Z","stage":"started"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.103421331Z","level":"debug","msg":"The target filesystem is: /run/containerd/io.containerd.runtime.v2.task/k8s.io/2661c207673062ac9b40389fb8d25fbc00f1e3a1cbb12f3157f37bc5ac5bad1c/rootfs"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.103716862Z","level":"debug","msg":"pgrep -P 2037504"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.117411991Z","level":"debug","msg":"pgrep -P 2037516"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.128687515Z","level":"debug","msg":"pgrep -P 2037586"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.139730059Z","level":"debug","msg":"/app/get-ps-command.sh 2037586"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.164718085Z","level":"debug","msg":"ps command output: /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -Djava.security.manager=allow -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j2.formatMsgNoLookups=true -Djava.locale.providers=SPI,COMPAT --add-opens=java.base/java.io=org.elasticsearch.preallocate --enable-native-access=org.elasticsearch.nativeaccess -Des.cgroups.hierarchy.override=/ -XX:ReplayDataFile=logs/replay_pid%p.log -Des.distribution.type=docker -XX:+UseG1GC -Djava.io.tmpdir=/tmp/elasticsearch-9751218612527351347 --add-modules=jdk.incubator.vector -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,level,pid,tags:filecount=32,filesize=64m -Xms1024m -Xmx1024m -XX:MaxDirectMemorySize=536870912 -XX:G1HeapRegionSize=4m -XX:InitiatingHeapOccupancyPercent=30 -XX:G1ReservePercent=15 --module-path /usr/share/elasticsearch/lib --add-modules=jdk.net --add-modules=ALL-MODULE-PATH -m org.elasticsearch.server/org.elasticsearch.bootstrap.Elasticsearch"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.164846621Z","level":"debug","msg":"The PIDs to be profiled: [2037586]"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.164878472Z","level":"debug","msg":"cp -r /app/async-profiler /tmp"}}
{"type":"log","data":{"time":"2024-04-25T08:39:05.167856531Z","level":"debug","msg":"/tmp/async-profiler/profiler.sh -o collapsed -d 500 -f /tmp/agent-raw-2037586-1.txt -e cpu --fdtransfer 2037586"}}
{"type":"error","data":{"reason":"open /tmp/agent-raw-2037586-1.txt: no such file or directory"}}
{"type":"log","data":{"time":"2024-04-25T08:47:25.837430888Z","level":"debug","msg":"Received signal: terminated"}}
{"type":"log","data":{"time":"2024-04-25T08:47:25.83748281Z","level":"debug","msg":"/tmp/async-profiler/profiler.sh stop 2037586"}}
{"type":"log","data":{"time":"2024-04-25T08:47:25.84230416Z","level":"debug","msg":"Profiling finished properly. Bye!"}}
rpc error: code = NotFound desc = an error occurred when try to find container "87727459221fec3493c440103aab9f80390663a3cfc3672a770b5f64f1037aa9": not found

How can I debug this problem? Please help!

josepdcs commented 4 months ago

I think it's not able to find the correct PID based on the pattern 'Elasticsearch'. If you could pass the correct PID to profile, it should work for you. I understand it's a container with multiple processes...
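
To check which PID the agent actually resolved, it may help to pull the relevant events out of the debug output posted above. The following is only a minimal sketch: it assumes the agent output has been saved as text in the JSON event shape shown in this issue (`type`, `data.msg`, `data.reason`), and the helper names `iter_events` and `summarize` are hypothetical, not part of kubectl-prof.

```python
import json


def iter_events(text):
    """Yield each top-level JSON object found in text.

    Objects may be separated by spaces or newlines; any non-JSON text
    between them (for example a trailing "rpc error: ..." line) is skipped.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not a valid object at this brace; move on to the next one.
            pos = start + 1
            continue
        yield obj
        pos = end


def summarize(text):
    """Return (pid_messages, error_reasons) extracted from the agent output."""
    pid_messages, error_reasons = [], []
    for event in iter_events(text):
        if not isinstance(event, dict):
            continue
        data = event.get("data")
        if not isinstance(data, dict):
            continue
        if event.get("type") == "log" and "PIDs to be profiled" in data.get("msg", ""):
            pid_messages.append(data["msg"])
        elif event.get("type") == "error":
            error_reasons.append(data.get("reason", ""))
    return pid_messages, error_reasons
```

Calling `summarize` on the saved output returns the "The PIDs to be profiled" messages alongside any error reasons, so the resolved PID can be compared with the process you intended to profile.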