Moving this out of To Test, as there is more work to be done to enable pprof on RKE2. Currently seeing:
root@ip-172-31-31-12:/home/ubuntu# curl --insecure https://localhost:6443/debug/pprof/profile
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
Clearly RKE2 locks down the port more than K3s, so more work is needed.
This does work: you need to hit port 9345 (the supervisor port) instead of 6443. Also note that the default profile duration is quite long at 30 seconds, so pass ?seconds= to shorten it.
curl --insecure https://localhost:9345/debug/pprof/profile?seconds=10 > cpu.pprof
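For repeated captures, the URL can be built by a small helper. This is only a sketch: `pprof_url` and `fetch_profile` are hypothetical names, and the defaults (localhost, port 9345) are taken from the command above. Quoting the URL keeps the shell from treating `?` as a glob character, and `--fail` makes curl exit non-zero on an HTTP error such as the 401 seen on port 6443.

```shell
# Hypothetical helper: print a pprof URL on the RKE2 supervisor port.
# Usage: pprof_url <kind> [seconds] [host] [port]
pprof_url() {
  # "?seconds=N" is appended only when a duration is given.
  printf 'https://%s:%s/debug/pprof/%s%s\n' \
    "${3:-localhost}" "${4:-9345}" "$1" "${2:+?seconds=$2}"
}

# Hypothetical wrapper: fetch_profile <kind> <seconds> <outfile>
fetch_profile() {
  curl --insecure --fail --silent --show-error \
    -o "$3" "$(pprof_url "$1" "$2")"
}
```

For example, `fetch_profile profile 10 cpu.pprof` requests the same 10-second CPU profile as the curl command above.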
Cluster Configuration:
1 server

Also confirmed this flag does not apply to agents.

Config.yaml:
Configuration 1:
write-kubeconfig-mode: 644
enable-pprof: true

Configuration 2:
write-kubeconfig-mode: 644

Validated with:
RKE2 Version/Commit:
commit 5453a333aabe0d4d0fb067823bc4f7970a45a24b

sudo curl -sfL https://get.rke2.io | INSTALL_RKE2_COMMIT=5453a333aabe0d4d0fb067823bc4f7970a45a24b sh -
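The config file has to be in place before RKE2 starts. A minimal sketch of writing either configuration, assuming the default RKE2 config location /etc/rancher/rke2/config.yaml; `write_rke2_config` is a hypothetical helper, and the directory argument exists only so the sketch can be tried in a scratch directory without root.

```shell
# Hypothetical helper: write_rke2_config [dir] [enable_pprof]
#   dir          config directory (default: /etc/rancher/rke2)
#   enable_pprof "true" writes the pprof setting, anything else omits it
write_rke2_config() {
  local dir="${1:-/etc/rancher/rke2}"
  local enable_pprof="${2:-true}"
  mkdir -p "$dir"
  {
    printf 'write-kubeconfig-mode: 644\n'
    if [ "$enable_pprof" = "true" ]; then
      printf 'enable-pprof: true\n'
    fi
  } > "$dir/config.yaml"
}
```

Calling it with `true` gives the pprof-enabled configuration and with `false` the pprof-disabled one; writing under /etc requires root.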
Steps To Validate:
1. Set up config.yaml using configuration 1.
2. Installed RKE2 using the curl command.
3. Grabbed the profile by curling the correct endpoint: curl --insecure https://localhost:9345/debug/pprof/profile > profile.pprof
4. Grabbed the trace output by curling the correct endpoint: curl --insecure https://localhost:9345/debug/pprof/trace > trace.out
5. Access the trace output.
6. Access the profile.
7. Perform the same steps 1-6 using configuration 2 -- pprof should be disabled.

Results:
When accessing the trace output (step 5 above): note that I am actually unable to access the endpoint here, and I don't have a browser available in my VM, so I'm not 100% sure this part is working as expected. It appears to be doing what is expected, so I am considering it okay.
$ go tool trace trace.out
2022/06/16 18:18:08 Parsing trace...
2022/06/16 18:18:08 Splitting trace...
2022/06/16 18:18:08 Opening browser. Trace viewer is listening on http://127.0.0.1:40013
When accessing the profile (step 6 above): Some pprof commands may fail because the utilities they depend on are not installed on the host. This is intentional, and utilities should be installed by the operator as needed.
$ go tool pprof profile.pprof
File: rke2
Type: cpu
Time: Aug 1, 2022 at 6:57pm (UTC)
Duration: 30s, Total samples = 180ms ( 0.6%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
(pprof) top
Showing nodes accounting for 160ms, 88.89% of 180ms total
Showing top 10 nodes out of 54
      flat  flat%   sum%        cum   cum%
      60ms 33.33% 33.33%       60ms 33.33%  runtime.epollwait
      20ms 11.11% 44.44%       80ms 44.44%  runtime.netpoll
      10ms  5.56% 50.00%       10ms  5.56%  runtime.adjusttimers
      10ms  5.56% 55.56%       10ms  5.56%  runtime.findObject
      10ms  5.56% 61.11%      100ms 55.56%  runtime.findrunnable
      10ms  5.56% 66.67%       10ms  5.56%  runtime.futex
      10ms  5.56% 72.22%       10ms  5.56%  runtime.getitab
      10ms  5.56% 77.78%       10ms  5.56%  runtime.newobject
      10ms  5.56% 83.33%       10ms  5.56%  runtime.pageIndexOf
      10ms  5.56% 88.89%      120ms 66.67%  runtime.park_m
(pprof) png
failed to execute dot. Is Graphviz installed?
Error: exec: "dot": executable file not found in $PATH
$ sudo apt update && sudo apt install graphviz
(pprof) png
Generating report in profile001.png
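As a sanity check on the top output, each percentage is the sample time divided by the 180ms total. A small sketch of that arithmetic; `pct` is a hypothetical helper, not part of pprof.

```shell
# pct <part_ms> <total_ms>: percentage with two decimals.
pct() {
  awk -v part="$1" -v total="$2" \
    'BEGIN { printf "%.2f%%\n", 100 * part / total }'
}

pct 60 180    # flat% of runtime.epollwait   -> 33.33%
pct 160 180   # share of the top 10 nodes    -> 88.89%
pct 120 180   # cum% of runtime.park_m       -> 66.67%
```

The same ratio explains the header line: 180ms of samples over a 30s window is 0.6%, so the server was almost idle during this capture.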
When pprof is disabled, the outputs cannot be accessed, as expected:
$ go tool pprof profile.pprof
profile.pprof: parsing profile: unrecognized profile format
failed to fetch any source profiles

Additional context / logs:
When trying pprof on the endpoint directly, it fails as expected:
Fetching profile over HTTP from https://localhost:6443/debug/pprof/profile
cmdline2.pprof: parsing profile: unrecognized profile format
failed to fetch any source profiles
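The "unrecognized profile format" error means the saved file is not a profile at all, only whatever the server returned instead. Go writes CPU profiles as gzip-compressed protobuf, so a quick check of the first two bytes can catch this before running go tool pprof. A rough sketch; `looks_like_pprof` is a hypothetical helper, and it applies to profile files only, not to trace output.

```shell
# looks_like_pprof <file>: succeeds if the file starts with the
# gzip magic bytes 1f 8b, fails otherwise.
looks_like_pprof() {
  [ "$(od -An -tx1 -N2 "$1" | tr -d ' \n')" = "1f8b" ]
}

# Example use after a fetch:
#   looks_like_pprof profile.pprof || echo "not a profile; is enable-pprof set?"
```

This only checks the container format; a file that passes can still be rejected by go tool pprof if its contents are not a valid profile.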
Help text is sufficient:
rke2 server --help | grep -i pprof
   --enable-pprof    (experimental) Enable pprof endpoint on supervisor port
This is a backport issue for https://github.com/rancher/rke2/issues/3055, automatically created via rancherbot by @rancher-max
Original issue description:
Pullthrough from https://github.com/k3s-io/k3s/issues/1635