Open yingxin-gh opened 6 months ago
During testing we repeatedly create and then delete many PVCs and pods (about 180); each pod mounts one PVC. We kept this up for about 5 hours and found that the pids.current value of the daemonset csi-rbdplugin pod keeps rising: it was 47 before the test and had increased to 99 after the test was stopped.
@yingxin-gh After the test, you have deleted all the applications, right? Did you get a chance to check whether any processes are still using those pid values?
Yes. I deleted all the deployed pods and PVCs after the test. There are no other workloads on the test node.
The information from the ceph-csi pod is as below.
#cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode1aea94b_2820_418f_8b33_fd51e946a442.slice/cri-containerd-608df5831ddae1b508434f20edee9b18d58fb76771c5fc4c785f99b024dc56c5.scope/pids.current
100
#cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode1aea94b_2820_418f_8b33_fd51e946a442.slice/cri-containerd-608df5831ddae1b508434f20edee9b18d58fb76771c5fc4c785f99b024dc56c5.scope/tasks | xargs
2081 2855 4574 4596 5450 5451 5570 5593 5596 5598 5825 7607 7608 7610 10749 13324 13355 13356 13357 13358 13359 13360 13459 14101 14991 14993 14994 16190 16680 16681 16682 16683 16687 16688 16689 16690 16691 16692 16693 16694 16695 16696 16697 17468 17469 18455 19501 19531 19532 19533 19636 19640 19641 19642 19643 19644 19646 19647 19648 19652 19653 19656 20655 20657 20659 20660 20661 20662 20663 20664 20666 20668 20669 20670 20672 20674 20677 20679 20680 20707 22930 22931 23063 23067 23091 23579 23580 24161 24164 24259 27320 27321 27322 27456 27671 27677 28399 31043 31083 31233 32046
# pstree -p 13324
cephcsi(13324)─┬─{cephcsi}(13355)
├─{cephcsi}(13356)
├─{cephcsi}(13357)
├─{cephcsi}(13358)
├─{cephcsi}(13359)
...
# pstree -p 19656 -tasul
systemd,1 --switched-root --system --deserialize 28
└─containerd-shim,12157 -namespace k8s.io -id 1c317195b5d998726a3b974a669168695b54412152d0225c476bbb3b3415f5ac -address /run/containerd/containerd.sock
└─cephcsi,13324 --nodeid=node-xxx --pluginpath=/var/lib/kubelet/plugins --stagingpath=/var/lib/kubelet/plugins/kubernetes.io/csi/ --type=rbd --nodeserver=true --pidlimit=-1 --endpoint=unix:///csi/csi.sock --csi-addons-endpoint=unix:///csi/csi-addons.sock --v=5 --drivername=rbd.csi.ceph.com --enableprofiling=false
└─{cephcsi},19656
# cd /proc/13324/task/19656
# ls
arch_status children cpuset fail-nth io make-it-fail mounts oom_adj patch_state sched smaps statm wchan attr clear_refs cwd fd latency maps net oom_score personality schedstat smaps_rollup status auxv cmdline environ fdinfo limits mem ns oom_score_adj projid_map sessionid stack syscall cgroup comm exe gid_map loginuid mountinfo numa_maps pagemap root setgroups stat uid_map
# cat stat
19656 (cephcsi) S 12157 13324 13324 0 -1 1077936448 2 306388502 0 178881 0 0 391466 215491 20 0 94 0 483954069 3778420736 18646 18446744073709551615 4194304 57137288 140733108496528 0 0 0 0 0 2143420159 0 0 0 -1 1 0 0 0 0 0 59235744 59968600 90431488 140733108501966 140733108502299 140733108502299 140733108506593 0
# cat status
Name: cephcsi
Umask: 0022
State: S (sleeping)
Tgid: 13324
Ngid: 0
Pid: 19656
PPid: 12157
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 1024
Groups: 0
NStgid: 13324
NSpid: 19656
NSpgid: 13324
NSsid: 13324
VmPeak: 3689864 kB
VmSize: 3689864 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 132292 kB
VmRSS: 74584 kB
RssAnon: 74584 kB
RssFile: 0 kB
RssShmem: 0 kB
VmData: 898520 kB
VmStk: 728 kB
VmExe: 51704 kB
VmLib: 33324 kB
VmPTE: 1060 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 94
SigQ: 1/29886
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: fffffffdffc1feff
CapInh: 0000000000000000
CapPrm: 000000ffffffffff
CapEff: 000000ffffffffff
CapBnd: 000000ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: f
Cpus_allowed_list: 0-3
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 612
nonvoluntary_ctxt_switches: 48
@Rakshith-R can you please take a look at it?
Ceph-CSI supports golang profiling, see #1935 for details. Profiling can help point out which functions/routines are still running. Common causes include open connections to the Ceph cluster, go routines that were started on first use but never exited once their work was done, ...
I enabled profiling and re-ran the test. The pids.current value increased to 63 after the test was stopped. The goroutine info is as below.
crictl exec -it 2dd7af7189eb4 cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podb3ef3ee9_f571_4a8c_b34e_a8f73ac606be.slice/cri-containerd-2dd7af7189eb483309d95bdbb06b5113971a19a45be9d242d10828709197a162.scope/pids.current
63
goroutine 20 [running]:
runtime/pprof.writeGoroutineStacks({0xb9b0c0, 0xc000234e00})
/usr/local/go/src/runtime/pprof/pprof.go:703 +0x70
runtime/pprof.writeGoroutine({0xb9b0c0?, 0xc000234e00?}, 0x0?)
/usr/local/go/src/runtime/pprof/pprof.go:692 +0x2b
runtime/pprof.(*Profile).WriteTo(0x825060?, {0xb9b0c0?, 0xc000234e00?}, 0xc?)
/usr/local/go/src/runtime/pprof/pprof.go:329 +0x14b
net/http/pprof.handler.ServeHTTP({0xc00002c371, 0x9}, {0xb9d830, 0xc000234e00}, 0xd?)
/usr/local/go/src/net/http/pprof/pprof.go:259 +0x4a5
net/http/pprof.Index({0xb9d830?, 0xc000234e00}, 0xc00022e600?)
/usr/local/go/src/net/http/pprof/pprof.go:376 +0xf3
net/http.HandlerFunc.ServeHTTP(0xc000170000?, {0xb9d830?, 0xc000234e00?}, 0x40f188?)
/usr/local/go/src/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc00002c37b?, {0xb9d830, 0xc000234e00}, 0xc00022e600)
/usr/local/go/src/net/http/server.go:2500 +0x149
net/http.serverHandler.ServeHTTP({0xc0000240f0?}, {0xb9d830, 0xc000234e00}, 0xc00022e600)
/usr/local/go/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc000096000, {0xb9dc38, 0xc000215290})
/usr/local/go/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:3089 +0x5ed
goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7efc00cadb88, 0x72)
/usr/local/go/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc0001cc480?, 0x8?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0001cc480)
/usr/local/go/src/internal/poll/fd_unix.go:614 +0x2bd
net.(*netFD).accept(0xc0001cc480)
/usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc00011c378)
/usr/local/go/src/net/tcpsock_posix.go:148 +0x25
net.(*TCPListener).Accept(0xc00011c378)
/usr/local/go/src/net/tcpsock.go:297 +0x3d
net/http.(*Server).Serve(0xc00022a000, {0xb9d650, 0xc00011c378})
/usr/local/go/src/net/http/server.go:3059 +0x385
net/http.Serve(...)
/usr/local/go/src/net/http/server.go:2581
main.main()
/usr/local/go/src/cmd/trace/main.go:136 +0x56d
goroutine 50 [IO wait]:
internal/poll.runtime_pollWait(0x7efc00cad9a8, 0x72)
/usr/local/go/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc00005c000?, 0xc0001d8000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc00005c000, {0xc0001d8000, 0x1000, 0x1000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
net.(*netFD).Read(0xc00005c000, {0xc0001d8000?, 0x0?, 0xc0001c3618?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc00008e000, {0xc0001d8000?, 0x0?, 0xc000113118?})
/usr/local/go/src/net/net.go:183 +0x45
net/http.(*connReader).Read(0xc000113110, {0xc0001d8000, 0x1000, 0x1000})
/usr/local/go/src/net/http/server.go:782 +0x171
bufio.(*Reader).fill(0xc0001a6000)
/usr/local/go/src/bufio/bufio.go:106 +0xff
bufio.(*Reader).ReadSlice(0xc0001a6000, 0x0?)
/usr/local/go/src/bufio/bufio.go:372 +0x2f
bufio.(*Reader).ReadLine(0xc0001a6000)
/usr/local/go/src/bufio/bufio.go:401 +0x27
net/textproto.(*Reader).readLineSlice(0xc000113170)
/usr/local/go/src/net/textproto/reader.go:56 +0x99
net/textproto.(*Reader).ReadLine(...)
/usr/local/go/src/net/textproto/reader.go:39
net/http.readRequest(0x7efc2a778ac8?)
/usr/local/go/src/net/http/request.go:1042 +0xba
net/http.(*conn).readRequest(0xc000090000, {0xb9db90, 0xc000196000})
/usr/local/go/src/net/http/server.go:990 +0x24a
net/http.(*conn).serve(0xc000090000, {0xb9dc38, 0xc000215290})
/usr/local/go/src/net/http/server.go:1920 +0x34b
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:3089 +0x5ed
goroutine 43 [IO wait]:
internal/poll.runtime_pollWait(0x7efc00cada98, 0x72)
/usr/local/go/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc000170000?, 0xc000024101?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000170000, {0xc000024101, 0x1, 0x1})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
net.(*netFD).Read(0xc000170000, {0xc000024101?, 0x7face0?, 0xc000047f48?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc00011e020, {0xc000024101?, 0xc0001d2110?, 0xc00007e5f0?})
/usr/local/go/src/net/net.go:183 +0x45
net/http.(*connReader).backgroundRead(0xc0000240f0)
/usr/local/go/src/net/http/server.go:674 +0x3f
created by net/http.(*connReader).startBackgroundRead
/usr/local/go/src/net/http/server.go:670 +0xca
Can you share all of the goroutines? The ones that you listed here are pprof and HTTP-service related; these are expected to keep running until a request comes in.
Do you have goroutines that include source files with github.com/ceph/ceph-csi/ in the directory structure?
How do I get all the goroutines? I got them with the steps below; is that right?
go tool pprof http://xxx:8080/debug/pprof/goroutine?debug=1
Fetching profile over HTTP from http://xxx:8080/debug/pprof/goroutine?debug=1
Saved profile in /root/pprof/pprof.goroutine.001.pb.gz
Type: goroutine
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) goroutine
(pprof) top 55
Showing nodes accounting for 9, 100% of 9 total
flat flat% sum% cum cum%
8 88.89% 88.89% 8 88.89% runtime.gopark
1 11.11% 100% 1 11.11% runtime.goroutineProfileWithLabels
0 0% 100% 1 11.11% bufio.(*Reader).Read
0 0% 100% 1 11.11% github.com/ceph/ceph-csi/internal/csi-addons/server.(*CSIAddonsServer).serve
0 0% 100% 1 11.11% github.com/ceph/ceph-csi/internal/csi-common.(*nonBlockingGRPCServer).Wait
0 0% 100% 1 11.11% github.com/ceph/ceph-csi/internal/csi-common.(*nonBlockingGRPCServer).serve
0 0% 100% 1 11.11% github.com/ceph/ceph-csi/internal/rbd/driver.(*Driver).Run
0 0% 100% 1 11.11% github.com/ceph/ceph-csi/internal/util.StartMetricsServer
0 0% 100% 1 11.11% golang.org/x/net/http2.(*Framer).ReadFrame
0 0% 100% 1 11.11% golang.org/x/net/http2.readFrameHeader
0 0% 100% 2 22.22% google.golang.org/grpc.(*Server).Serve
0 0% 100% 1 11.11% google.golang.org/grpc.(*Server).handleRawConn.func1
0 0% 100% 1 11.11% google.golang.org/grpc.(*Server).serveStreams
0 0% 100% 1 11.11% google.golang.org/grpc/internal/transport.(*controlBuffer).get
0 0% 100% 1 11.11% google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams
0 0% 100% 1 11.11% google.golang.org/grpc/internal/transport.(*http2Server).keepalive
0 0% 100% 1 11.11% google.golang.org/grpc/internal/transport.(*loopyWriter).run
0 0% 100% 1 11.11% google.golang.org/grpc/internal/transport.NewServerTransport.func2
0 0% 100% 3 33.33% internal/poll.(*FD).Accept
0 0% 100% 2 22.22% internal/poll.(*FD).Read
0 0% 100% 5 55.56% internal/poll.(*pollDesc).wait
0 0% 100% 5 55.56% internal/poll.(*pollDesc).waitRead
0 0% 100% 5 55.56% internal/poll.runtime_pollWait
0 0% 100% 1 11.11% io.ReadAtLeast
0 0% 100% 1 11.11% io.ReadFull
0 0% 100% 1 11.11% main.main
0 0% 100% 1 11.11% net.(*TCPListener).Accept
0 0% 100% 1 11.11% net.(*TCPListener).accept
0 0% 100% 2 22.22% net.(*UnixListener).Accept
0 0% 100% 2 22.22% net.(*UnixListener).accept
0 0% 100% 2 22.22% net.(*conn).Read
0 0% 100% 2 22.22% net.(*netFD).Read
0 0% 100% 3 33.33% net.(*netFD).accept
0 0% 100% 1 11.11% net/http.(*ServeMux).ServeHTTP
0 0% 100% 1 11.11% net/http.(*Server).ListenAndServe
0 0% 100% 1 11.11% net/http.(*Server).Serve
0 0% 100% 1 11.11% net/http.(*conn).serve
0 0% 100% 1 11.11% net/http.(*connReader).backgroundRead
0 0% 100% 1 11.11% net/http.HandlerFunc.ServeHTTP
0 0% 100% 1 11.11% net/http.ListenAndServe
0 0% 100% 1 11.11% net/http.serverHandler.ServeHTTP
0 0% 100% 1 11.11% net/http/pprof.Index
0 0% 100% 1 11.11% net/http/pprof.handler.ServeHTTP
0 0% 100% 9 100% runtime.goexit
0 0% 100% 1 11.11% runtime.goparkunlock
0 0% 100% 1 11.11% runtime.main
0 0% 100% 5 55.56% runtime.netpollblock
0 0% 100% 2 22.22% runtime.selectgo
0 0% 100% 1 11.11% runtime.semacquire1
0 0% 100% 1 11.11% runtime/pprof.(*Profile).WriteTo
0 0% 100% 1 11.11% runtime/pprof.runtime_goroutineProfileWithLabels
0 0% 100% 1 11.11% runtime/pprof.writeGoroutine
0 0% 100% 1 11.11% runtime/pprof.writeRuntimeProfile
0 0% 100% 1 11.11% sync.(*WaitGroup).Wait
0 0% 100% 1 11.11% sync.runtime_Semacquire
Describe the bug
During testing we repeatedly create and then delete many PVCs and pods (about 180); each pod mounts one PVC. We kept this up for about 5 hours and found that the pids.current value of the daemonset csi-rbdplugin pod keeps rising: it was 47 before the test and had increased to 99 after the test was stopped.
The ceph-csi version is v3.9.0, with pidlimit=-1 set.
Environment details
Mounter used for mounting PVC (for cephFS its fuse or kernel. for rbd its krbd or rbd-nbd) : krbd
Steps to reproduce
Steps to reproduce the behavior:
  accessModes:
    - ReadWriteOnce
  storageClassName: "network-block"
  resources:
    requests:
      storage: 100M
apiVersion: v1
kind: Pod
metadata:
  name: test-rbd-$i
spec:
  containers:
Actual results
The pids.current value of the csi-rbdplugin pod keeps rising during the test and does not drop back after all the test pods and PVCs are deleted.
Expected behavior
The pids should be released after the test pods and PVCs are deleted.