ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0

daemonset csi-rbdplugin pod pids.current value keep rising #4520

Open yingxin-gh opened 6 months ago

yingxin-gh commented 6 months ago

Describe the bug

During the test, we create and then delete many PVCs and pods (about 180); every pod mounts one PVC. We repeated this cycle for about 5 hours and found that the csi-rbdplugin daemonset pod's pids.current value keeps rising: it was 47 before the test and had increased to 99 after the test was stopped.

The ceph-csi version is v3.9.0, and pidlimit=-1 is set.

Steps to reproduce the behavior:

  1. Create a script that creates (with action=create) or deletes (with action=delete) 180 pods, each mounting its own PVC:

       action=$1
       for i in {1..180}
       do
       cat <<EOF | kubectl $action -f -
       apiVersion: v1
       kind: PersistentVolumeClaim
       metadata:
         name: rbd-pvc-$i
       spec:
         accessModes:
           - ReadWriteOnce
         storageClassName: "network-block"
         resources:
           requests:
             storage: 100M
       ---
       apiVersion: v1
       kind: Pod
       metadata:
         name: test-rbd-$i
       spec:
         containers:
           - name: www
             image: nginx:alpine
             ports:
               - containerPort: 80
                 name: www
             volumeMounts:
               - name: www-persistent-storage
                 mountPath: /usr/share/nginx/html
         volumes:
           - name: www-persistent-storage
             persistentVolumeClaim:
               claimName: rbd-pvc-$i
         nodeName: node-xxx
       EOF
       done

  2. Run the script to create and then delete the pods and PVCs repeatedly for about 5 hours.
  3. Check the plugin container's pids.current before and after the test:

       # before the test
       crictl exec -it 608df5831ddae cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode1aea94b_2820_418f_8b33_fd51e946a442.slice/cri-containerd-608df5831ddae1b508434f20edee9b18d58fb76771c5fc4c785f99b024dc56c5.scope/pids.current
       47
       # after the test
       crictl exec -it 608df5831ddae cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode1aea94b_2820_418f_8b33_fd51e946a442.slice/cri-containerd-608df5831ddae1b508434f20edee9b18d58fb76771c5fc4c785f99b024dc56c5.scope/pids.current
       99

Actual results

The csi-rbdplugin pod's pids.current keeps rising, and the additional pids are not released even after all the test pods and PVCs are deleted.

Expected behavior

Some pids should be released.


Madhu-1 commented 6 months ago

> During test, we create then delete many pvc and pods(about 180) , every pods will mount one pvc. We repeat the process about 5 hours. We find that the daemonset csi-rbdplugin pod pids.current value keep rising. During our test, the pids.current value is 47 before test, the value increased to 99 after stopping the test.

@yingxin-gh After the test, you deleted all the applications, right? Did you get a chance to check whether any processes are still using those pids?

yingxin-gh commented 6 months ago

Yes, I deleted all the deployed pods and PVCs after the test. There are no other workloads on the test node.
The information from the ceph-csi pod is as below.

#cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode1aea94b_2820_418f_8b33_fd51e946a442.slice/cri-containerd-608df5831ddae1b508434f20edee9b18d58fb76771c5fc4c785f99b024dc56c5.scope/pids.current 
100

#cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode1aea94b_2820_418f_8b33_fd51e946a442.slice/cri-containerd-608df5831ddae1b508434f20edee9b18d58fb76771c5fc4c785f99b024dc56c5.scope/tasks | xargs
2081 2855 4574 4596 5450 5451 5570 5593 5596 5598 5825 7607 7608 7610 10749 13324 13355 13356 13357 13358 13359 13360 13459 14101 14991 14993 14994 16190 16680 16681 16682 16683 16687 16688 16689 16690 16691 16692 16693 16694 16695 16696 16697 17468 17469 18455 19501 19531 19532 19533 19636 19640 19641 19642 19643 19644 19646 19647 19648 19652 19653 19656 20655 20657 20659 20660 20661 20662 20663 20664 20666 20668 20669 20670 20672 20674 20677 20679 20680 20707 22930 22931 23063 23067 23091 23579 23580 24161 24164 24259 27320 27321 27322 27456 27671 27677 28399 31043 31083 31233 32046

# pstree -p 13324
cephcsi(13324)─┬─{cephcsi}(13355)
               ├─{cephcsi}(13356)
               ├─{cephcsi}(13357)
               ├─{cephcsi}(13358)
               ├─{cephcsi}(13359)
               ...
# pstree -p 19656 -tasul
systemd,1 --switched-root --system --deserialize 28
  └─containerd-shim,12157 -namespace k8s.io -id 1c317195b5d998726a3b974a669168695b54412152d0225c476bbb3b3415f5ac -address /run/containerd/containerd.sock
      └─cephcsi,13324 --nodeid=node-xxx --pluginpath=/var/lib/kubelet/plugins --stagingpath=/var/lib/kubelet/plugins/kubernetes.io/csi/ --type=rbd --nodeserver=true --pidlimit=-1 --endpoint=unix:///csi/csi.sock --csi-addons-endpoint=unix:///csi/csi-addons.sock --v=5 --drivername=rbd.csi.ceph.com --enableprofiling=false
          └─{cephcsi},19656
# cd /proc/13324/task/19656
# ls
arch_status  attr  auxv  cgroup  children  clear_refs  cmdline  comm  cpuset  cwd  environ  exe  fail-nth  fd  fdinfo  gid_map  io  latency  limits  loginuid  make-it-fail  maps  mem  mountinfo  mounts  net  ns  numa_maps  oom_adj  oom_score  oom_score_adj  pagemap  patch_state  personality  projid_map  root  sched  schedstat  sessionid  setgroups  smaps  smaps_rollup  stack  stat  statm  status  syscall  uid_map  wchan

# cat stat
19656 (cephcsi) S 12157 13324 13324 0 -1 1077936448 2 306388502 0 178881 0 0 391466 215491 20 0 94 0 483954069 3778420736 18646 18446744073709551615 4194304 57137288 140733108496528 0 0 0 0 0 2143420159 0 0 0 -1 1 0 0 0 0 0 59235744 59968600 90431488 140733108501966 140733108502299 140733108502299 140733108506593 0

# cat status
Name:   cephcsi
Umask:  0022
State:  S (sleeping)
Tgid:   13324
Ngid:   0
Pid:    19656
PPid:   12157
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 1024
Groups: 0 
NStgid: 13324
NSpid:  19656
NSpgid: 13324
NSsid:  13324
VmPeak:  3689864 kB
VmSize:  3689864 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    132292 kB
VmRSS:     74584 kB
RssAnon:           74584 kB
RssFile:               0 kB
RssShmem:              0 kB
VmData:   898520 kB
VmStk:       728 kB
VmExe:     51704 kB
VmLib:     33324 kB
VmPTE:      1060 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
CoreDumping:    0
THP_enabled:    1
Threads:        94
SigQ:   1/29886
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: fffffffdffc1feff
CapInh: 0000000000000000
CapPrm: 000000ffffffffff
CapEff: 000000ffffffffff
CapBnd: 000000ffffffffff
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Speculation_Store_Bypass:       thread vulnerable
Cpus_allowed:   f
Cpus_allowed_list:      0-3
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        612
nonvoluntary_ctxt_switches:     48
Madhu-1 commented 6 months ago

@Rakshith-R can you please take a look at it?

nixpanic commented 6 months ago

Ceph-CSI supports golang profiling, see #1935 for details. Profiling can help point out which functions/goroutines are still running. Common causes include open connections to the Ceph cluster, goroutines that were started on first use but never exited once their work was done, and so on.

yingxin-gh commented 6 months ago

I enabled profiling and re-ran the test; pids.current had increased to 63 when I stopped the test. The goroutine info is as below.

crictl exec -it 2dd7af7189eb4 cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podb3ef3ee9_f571_4a8c_b34e_a8f73ac606be.slice/cri-containerd-2dd7af7189eb483309d95bdbb06b5113971a19a45be9d242d10828709197a162.scope/pids.current 
63
goroutine 20 [running]:
runtime/pprof.writeGoroutineStacks({0xb9b0c0, 0xc000234e00})
    /usr/local/go/src/runtime/pprof/pprof.go:703 +0x70
runtime/pprof.writeGoroutine({0xb9b0c0?, 0xc000234e00?}, 0x0?)
    /usr/local/go/src/runtime/pprof/pprof.go:692 +0x2b
runtime/pprof.(*Profile).WriteTo(0x825060?, {0xb9b0c0?, 0xc000234e00?}, 0xc?)
    /usr/local/go/src/runtime/pprof/pprof.go:329 +0x14b
net/http/pprof.handler.ServeHTTP({0xc00002c371, 0x9}, {0xb9d830, 0xc000234e00}, 0xd?)
    /usr/local/go/src/net/http/pprof/pprof.go:259 +0x4a5
net/http/pprof.Index({0xb9d830?, 0xc000234e00}, 0xc00022e600?)
    /usr/local/go/src/net/http/pprof/pprof.go:376 +0xf3
net/http.HandlerFunc.ServeHTTP(0xc000170000?, {0xb9d830?, 0xc000234e00?}, 0x40f188?)
    /usr/local/go/src/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc00002c37b?, {0xb9d830, 0xc000234e00}, 0xc00022e600)
    /usr/local/go/src/net/http/server.go:2500 +0x149
net/http.serverHandler.ServeHTTP({0xc0000240f0?}, {0xb9d830, 0xc000234e00}, 0xc00022e600)
    /usr/local/go/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc000096000, {0xb9dc38, 0xc000215290})
    /usr/local/go/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:3089 +0x5ed

goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7efc00cadb88, 0x72)
    /usr/local/go/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc0001cc480?, 0x8?, 0x0)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0001cc480)
    /usr/local/go/src/internal/poll/fd_unix.go:614 +0x2bd
net.(*netFD).accept(0xc0001cc480)
    /usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc00011c378)
    /usr/local/go/src/net/tcpsock_posix.go:148 +0x25
net.(*TCPListener).Accept(0xc00011c378)
    /usr/local/go/src/net/tcpsock.go:297 +0x3d
net/http.(*Server).Serve(0xc00022a000, {0xb9d650, 0xc00011c378})
    /usr/local/go/src/net/http/server.go:3059 +0x385
net/http.Serve(...)
    /usr/local/go/src/net/http/server.go:2581
main.main()
    /usr/local/go/src/cmd/trace/main.go:136 +0x56d

goroutine 50 [IO wait]:
internal/poll.runtime_pollWait(0x7efc00cad9a8, 0x72)
    /usr/local/go/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc00005c000?, 0xc0001d8000?, 0x0)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc00005c000, {0xc0001d8000, 0x1000, 0x1000})
    /usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
net.(*netFD).Read(0xc00005c000, {0xc0001d8000?, 0x0?, 0xc0001c3618?})
    /usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc00008e000, {0xc0001d8000?, 0x0?, 0xc000113118?})
    /usr/local/go/src/net/net.go:183 +0x45
net/http.(*connReader).Read(0xc000113110, {0xc0001d8000, 0x1000, 0x1000})
    /usr/local/go/src/net/http/server.go:782 +0x171
bufio.(*Reader).fill(0xc0001a6000)
    /usr/local/go/src/bufio/bufio.go:106 +0xff
bufio.(*Reader).ReadSlice(0xc0001a6000, 0x0?)
    /usr/local/go/src/bufio/bufio.go:372 +0x2f
bufio.(*Reader).ReadLine(0xc0001a6000)
    /usr/local/go/src/bufio/bufio.go:401 +0x27
net/textproto.(*Reader).readLineSlice(0xc000113170)
    /usr/local/go/src/net/textproto/reader.go:56 +0x99
net/textproto.(*Reader).ReadLine(...)
    /usr/local/go/src/net/textproto/reader.go:39
net/http.readRequest(0x7efc2a778ac8?)
    /usr/local/go/src/net/http/request.go:1042 +0xba
net/http.(*conn).readRequest(0xc000090000, {0xb9db90, 0xc000196000})
    /usr/local/go/src/net/http/server.go:990 +0x24a
net/http.(*conn).serve(0xc000090000, {0xb9dc38, 0xc000215290})
    /usr/local/go/src/net/http/server.go:1920 +0x34b
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:3089 +0x5ed

goroutine 43 [IO wait]:
internal/poll.runtime_pollWait(0x7efc00cada98, 0x72)
    /usr/local/go/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc000170000?, 0xc000024101?, 0x0)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000170000, {0xc000024101, 0x1, 0x1})
    /usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
net.(*netFD).Read(0xc000170000, {0xc000024101?, 0x7face0?, 0xc000047f48?})
    /usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc00011e020, {0xc000024101?, 0xc0001d2110?, 0xc00007e5f0?})
    /usr/local/go/src/net/net.go:183 +0x45
net/http.(*connReader).backgroundRead(0xc0000240f0)
    /usr/local/go/src/net/http/server.go:674 +0x3f
created by net/http.(*connReader).startBackgroundRead
    /usr/local/go/src/net/http/server.go:670 +0xca
nixpanic commented 6 months ago

Can you share all of the goroutines? The ones you listed here are pprof and HTTP service related; these are expected to keep running until a request comes in.

Do you have goroutines that include source files with github.com/ceph/ceph-csi/ in the directory structure?
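One way to answer that from a full debug=2 dump (goroutines.txt is a hypothetical file name for a saved dump): each goroutine is a blank-line-separated record, so awk's paragraph mode can print only the stacks that mention the repository path.

```shell
# Each goroutine in a debug=2 dump is one blank-line-separated record;
# print only the records whose stack frames reference ceph-csi sources.
awk -v RS= -v ORS='\n\n' '/github\.com\/ceph\/ceph-csi\//' goroutines.txt
```

If this prints nothing, the remaining goroutines are all runtime, gRPC, or HTTP plumbing, which points the leak investigation at OS threads rather than goroutines.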

yingxin-gh commented 6 months ago

How do I get all the goroutines? I obtained them with the steps below; is that right?

go tool pprof http://xxx:8080/debug/pprof/goroutine?debug=1
Fetching profile over HTTP from http://xxx:8080/debug/pprof/goroutine?debug=1
Saved profile in /root/pprof/pprof.goroutine.001.pb.gz
Type: goroutine
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) goroutine
(pprof) top 55
Showing nodes accounting for 9, 100% of 9 total
      flat  flat%   sum%        cum   cum%
         8 88.89% 88.89%          8 88.89%  runtime.gopark
         1 11.11%   100%          1 11.11%  runtime.goroutineProfileWithLabels
         0     0%   100%          1 11.11%  bufio.(*Reader).Read
         0     0%   100%          1 11.11%  github.com/ceph/ceph-csi/internal/csi-addons/server.(*CSIAddonsServer).serve
         0     0%   100%          1 11.11%  github.com/ceph/ceph-csi/internal/csi-common.(*nonBlockingGRPCServer).Wait
         0     0%   100%          1 11.11%  github.com/ceph/ceph-csi/internal/csi-common.(*nonBlockingGRPCServer).serve
         0     0%   100%          1 11.11%  github.com/ceph/ceph-csi/internal/rbd/driver.(*Driver).Run
         0     0%   100%          1 11.11%  github.com/ceph/ceph-csi/internal/util.StartMetricsServer
         0     0%   100%          1 11.11%  golang.org/x/net/http2.(*Framer).ReadFrame
         0     0%   100%          1 11.11%  golang.org/x/net/http2.readFrameHeader
         0     0%   100%          2 22.22%  google.golang.org/grpc.(*Server).Serve
         0     0%   100%          1 11.11%  google.golang.org/grpc.(*Server).handleRawConn.func1
         0     0%   100%          1 11.11%  google.golang.org/grpc.(*Server).serveStreams
         0     0%   100%          1 11.11%  google.golang.org/grpc/internal/transport.(*controlBuffer).get
         0     0%   100%          1 11.11%  google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams
         0     0%   100%          1 11.11%  google.golang.org/grpc/internal/transport.(*http2Server).keepalive
         0     0%   100%          1 11.11%  google.golang.org/grpc/internal/transport.(*loopyWriter).run
         0     0%   100%          1 11.11%  google.golang.org/grpc/internal/transport.NewServerTransport.func2
         0     0%   100%          3 33.33%  internal/poll.(*FD).Accept
         0     0%   100%          2 22.22%  internal/poll.(*FD).Read
         0     0%   100%          5 55.56%  internal/poll.(*pollDesc).wait
         0     0%   100%          5 55.56%  internal/poll.(*pollDesc).waitRead
         0     0%   100%          5 55.56%  internal/poll.runtime_pollWait
         0     0%   100%          1 11.11%  io.ReadAtLeast
         0     0%   100%          1 11.11%  io.ReadFull
         0     0%   100%          1 11.11%  main.main
         0     0%   100%          1 11.11%  net.(*TCPListener).Accept
         0     0%   100%          1 11.11%  net.(*TCPListener).accept
         0     0%   100%          2 22.22%  net.(*UnixListener).Accept
         0     0%   100%          2 22.22%  net.(*UnixListener).accept
         0     0%   100%          2 22.22%  net.(*conn).Read
         0     0%   100%          2 22.22%  net.(*netFD).Read
         0     0%   100%          3 33.33%  net.(*netFD).accept
         0     0%   100%          1 11.11%  net/http.(*ServeMux).ServeHTTP
         0     0%   100%          1 11.11%  net/http.(*Server).ListenAndServe
         0     0%   100%          1 11.11%  net/http.(*Server).Serve
         0     0%   100%          1 11.11%  net/http.(*conn).serve
         0     0%   100%          1 11.11%  net/http.(*connReader).backgroundRead
         0     0%   100%          1 11.11%  net/http.HandlerFunc.ServeHTTP
         0     0%   100%          1 11.11%  net/http.ListenAndServe
         0     0%   100%          1 11.11%  net/http.serverHandler.ServeHTTP
         0     0%   100%          1 11.11%  net/http/pprof.Index
         0     0%   100%          1 11.11%  net/http/pprof.handler.ServeHTTP
         0     0%   100%          9   100%  runtime.goexit
         0     0%   100%          1 11.11%  runtime.goparkunlock
         0     0%   100%          1 11.11%  runtime.main
         0     0%   100%          5 55.56%  runtime.netpollblock
         0     0%   100%          2 22.22%  runtime.selectgo
         0     0%   100%          1 11.11%  runtime.semacquire1
         0     0%   100%          1 11.11%  runtime/pprof.(*Profile).WriteTo
         0     0%   100%          1 11.11%  runtime/pprof.runtime_goroutineProfileWithLabels
         0     0%   100%          1 11.11%  runtime/pprof.writeGoroutine
         0     0%   100%          1 11.11%  runtime/pprof.writeRuntimeProfile
         0     0%   100%          1 11.11%  sync.(*WaitGroup).Wait
         0     0%   100%          1 11.11%  sync.runtime_Semacquire