ctrox / zeropod

pod that scales down to zero
Apache License 2.0

unable to detach no such process #29

Open DragonHunter274 opened 1 month ago

DragonHunter274 commented 1 month ago

After fixing the zeropod deployment I am facing this issue:

Error (compel/src/lib/infect.c:418): Unable to detach from 4050438: No such process
Error (criu/cr-dump.c:2098): Dumping FAILED.

The process exists, though:

root 4050438 0.0 0.0 11416 8028 ? Ss 19:08 0:00 nginx: master process nginx -g daemon off;

ctrox commented 1 month ago

Looks like CRIU fails to checkpoint the container here. Can you post a few details about your system? OS, kernel version, container image and pod config used (is this just the nginx example?). Can you also try to increase the scaledown duration a bit, something like: zeropod.ctrox.dev/scaledown-duration: 30s. Just to give the pod a bit longer after startup before it tries to first checkpoint it.

DragonHunter274 commented 1 month ago

OS is Ubuntu 22.04.4 LTS in a proxmox LXC container. The kernel is 5.15.107-2-pve and the pod is the unmodified nginx example.

increasing the scaledown duration didn't help

DragonHunter274 commented 1 month ago

I found some more debug output that I missed last time, most notably:

(00.101208) Error (compel/src/lib/ptrace.c:27): suspending seccomp failed: Operation not permitted

criu check returns:

Error (criu/config.c:1031): Invalid value for --network-lock: skip

Not sure if that's relevant.

Full debug output:

```
time="2024-09-20T11:57:24.683527344Z" level=error msg="error checkpointing container: runc did not terminate successfully: exit status 1: criu failed: type NOTIFY errno 0 log file: /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/5088608131f3d4f6880fdd600a6cd8971ac610bbe1bb2424f94da9f816763ad2/work/snapshots/work/dump.log" runtime=io.containerd.zeropod.v2
time="2024-09-20T11:57:24.683585813Z" level=error msg="dump.log:
(00.000000) Parsing config file /etc/criu/default.conf
(00.000000) Unable to get $HOME directory, local configuration file will not be used.
(00.000022) Version: 3.19 (gitid v3.19)
(00.000028) Running on kube-test Linux 5.15.107-2-pve #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z) x86_64
(00.000030) Would overwrite RPC settings with values from /etc/criu/runc.conf
(00.000045) Loaded kdat cache from /run/criu.kdat
(00.000057) Hugetlb size 2 Mb is supported but cannot get dev's number
(00.000063) Hugetlb size 1024 Mb is supported but cannot get dev's number
(00.000299) Will dump/restore TCP connections
(00.000303) Will skip in-flight TCP connections
(00.000305) Will drop all TCP connections on restore
(00.000311) ========================================
(00.000322) Dumping processes (pid: 3755205 comm: nginx)
(00.000324) ========================================
(00.000327) rlimit: RLIMIT_NOFILE unlimited for self
(00.000334) Running pre-dump scripts
(00.000336) \tRPC
(00.000393) irmap: Searching irmap cache in work dir
(00.000402) No irmap-cache image
(00.000404) irmap: Searching irmap cache in parent
(00.000408) No parent images directory provided
(00.000410) irmap: No irmap cache
(00.000415) cpu: x86_family 6 x86_vendor_id GenuineIntel x86_model_id Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
(00.000418) cpu: fpu: xfeatures_mask 0x15 xsave_size 1088 xsave_size_max 1088 xsaves_size 960
(00.000421) cpu: fpu: x87 floating point registers xstate_offsets 0 / 0 xstate_sizes 160 / 160
(00.000423) cpu: fpu: AVX registers xstate_offsets 576 / 576 xstate_sizes 256 / 256
(00.000425) cpu: fpu: MPX CSR xstate_offsets 1024 / 832 xstate_sizes 64 / 64
(00.000428) cpu: fpu:1 fxsr:1 xsave:1 xsaveopt:1 xsavec:1 xgetbv1:1 xsaves:1
(00.000497) cg-prop: Parsing controller \"cpu\"
(00.000500) cg-prop: \tStrategy \"replace\"
(00.000502) cg-prop: \tProperty \"cpu.shares\"
(00.000504) cg-prop: \tProperty \"cpu.cfs_period_us\"
(00.000506) cg-prop: \tProperty \"cpu.cfs_quota_us\"
(00.000508) cg-prop: \tProperty \"cpu.rt_period_us\"
(00.000510) cg-prop: \tProperty \"cpu.rt_runtime_us\"
(00.000511) cg-prop: Parsing controller \"memory\"
(00.000513) cg-prop: \tStrategy \"replace\"
(00.000515) cg-prop: \tProperty \"memory.limit_in_bytes\"
(00.000517) cg-prop: \tProperty \"memory.memsw.limit_in_bytes\"
(00.000519) cg-prop: \tProperty \"memory.swappiness\"
(00.000520) cg-prop: \tProperty \"memory.soft_limit_in_bytes\"
(00.000522) cg-prop: \tProperty \"memory.move_charge_at_immigrate\"
(00.000524) cg-prop: \tProperty \"memory.oom_control\"
(00.000526) cg-prop: \tProperty \"memory.use_hierarchy\"
(00.000527) cg-prop: \tProperty \"memory.kmem.limit_in_bytes\"
(00.000529) cg-prop: \tProperty \"memory.kmem.tcp.limit_in_bytes\"
(00.000531) cg-prop: Parsing controller \"cpuset\"
(00.000533) cg-prop: \tStrategy \"replace\"
(00.000534) cg-prop: \tProperty \"cpuset.cpus\"
(00.000536) cg-prop: \tProperty \"cpuset.mems\"
(00.000538) cg-prop: \tProperty \"cpuset.memory_migrate\"
(00.000540) cg-prop: \tProperty \"cpuset.cpu_exclusive\"
(00.000541) cg-prop: \tProperty \"cpuset.mem_exclusive\"
(00.000543) cg-prop: \tProperty \"cpuset.mem_hardwall\"
(00.000545) cg-prop: \tProperty \"cpuset.memory_spread_page\"
(00.000547) cg-prop: \tProperty \"cpuset.memory_spread_slab\"
(00.000548) cg-prop: \tProperty \"cpuset.sched_load_balance\"
(00.000550) cg-prop: \tProperty \"cpuset.sched_relax_domain_level\"
(00.000552) cg-prop: Parsing controller \"blkio\"
(00.000554) cg-prop: \tStrategy \"replace\"
(00.000556) cg-prop: \tProperty \"blkio.weight\"
(00.000558) cg-prop: Parsing controller \"freezer\"
(00.000559) cg-prop: \tStrategy \"replace\"
(00.000561) cg-prop: Parsing controller \"perf_event\"
(00.000563) cg-prop: \tStrategy \"replace\"
(00.000565) cg-prop: Parsing controller \"net_cls\"
(00.000567) cg-prop: \tStrategy \"replace\"
(00.000568) cg-prop: \tProperty \"net_cls.classid\"
(00.000570) cg-prop: Parsing controller \"net_prio\"
(00.000572) cg-prop: \tStrategy \"replace\"
(00.000574) cg-prop: \tProperty \"net_prio.ifpriomap\"
(00.000576) cg-prop: Parsing controller \"pids\"
(00.000577) cg-prop: \tStrategy \"replace\"
(00.000579) cg-prop: \tProperty \"pids.max\"
(00.000585) cg-prop: Parsing controller \"devices\"
(00.000587) cg-prop: \tStrategy \"replace\"
(00.000588) cg-prop: \tProperty \"devices.list\"
(00.000605) Preparing image inventory (version 1)
(00.000623) Add pid ns 1 pid 3757927
(00.000629) Add net ns 2 pid 3757927
(00.000634) Add ipc ns 3 pid 3757927
(00.000640) Add uts ns 4 pid 3757927
(00.000648) Add time ns 5 pid 3757927
(00.000659) Add mnt ns 6 pid 3757927
(00.000665) Add user ns 7 pid 3757927
(00.000670) Add cgroup ns 8 pid 3757927
(00.000672) cg: Dumping cgroups for thread 3757927
(00.000686) cg: `- New css ID 1
(00.000689) cg: `- [] -> [/system.slice/k3s.service] [0]
(00.000691) cg: Set 1 is criu one
(00.000704) Detected cgroup V2 freezer
(00.000706) freezing processes: 100000 attempts with 100 ms steps
(00.000719) cgroup.freeze=0
(00.000742) cgroup.freeze=1
(00.100851) cgroup.freeze=1
(00.100880) freezing processes: 1 attempts done
(00.100909) SEIZE 3755205 (comm nginx): success
(00.100922) SEIZE 3755249 (comm nginx): success
(00.100934) SEIZE 3755251 (comm nginx): success
(00.100943) SEIZE 3755252 (comm nginx): success
(00.100954) SEIZE 3755253 (comm nginx): success
(00.100965) SEIZE 3755254 (comm nginx): success
(00.100975) SEIZE 3755255 (comm nginx): success
(00.100986) SEIZE 3755256 (comm nginx): success
(00.100997) SEIZE 3755257 (comm nginx): success
(00.101008) SEIZE 3755258 (comm nginx): success
(00.101017) SEIZE 3755259 (comm nginx): success
(00.101028) SEIZE 3755260 (comm nginx): success
(00.101037) SEIZE 3755261 (comm nginx): success
(00.101208) Error (compel/src/lib/ptrace.c:27): suspending seccomp failed: Operation not permitted
(00.101220) net: Unlock network
(00.101223) Unfreezing tasks into 1
(00.101225) \tUnseizing 3755205 into 1
(00.101227) Error (compel/src/lib/infect.c:418): Unable to detach from 3755205: No such process
(00.101234) Error (criu/cr-dump.c:2098): Dumping FAILED." runtime=io.containerd.zeropod.v2
```

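
In a log this long the failing step is easy to miss, so it can help to filter for the `Error` entries only. The following is a minimal Python sketch and not part of zeropod or CRIU. The function name `extract_criu_errors` and the regular expression are assumptions, based only on the `(timestamp) Error (file:line): message` shape visible in the log above.

```python
import re

# Assumed entry shape, taken from the log in this thread:
#   (SS.UUUUUU) Error (source/file.c:LINE): message
# A message ends at the next "(SS.UUUUUU)" timestamp or at the end of input.
ERROR_RE = re.compile(
    r"\((\d+\.\d+)\)\s+Error \(([^)]+)\): (.*?)(?=\s+\(\d+\.\d+\)|\s*$)"
)


def extract_criu_errors(log_text):
    """Return (timestamp, source_location, message) for each Error entry."""
    return [m.groups() for m in ERROR_RE.finditer(log_text)]


if __name__ == "__main__":
    # Excerpt from the flattened dump.log posted above.
    sample = (
        "(00.101037) SEIZE 3755261 (comm nginx): success "
        "(00.101208) Error (compel/src/lib/ptrace.c:27): suspending seccomp "
        "failed: Operation not permitted "
        "(00.101220) net: Unlock network "
        "(00.101234) Error (criu/cr-dump.c:2098): Dumping FAILED."
    )
    for timestamp, location, message in extract_criu_errors(sample):
        print(timestamp, location, message)
```

Applied to the log above, the earliest entry is the seccomp failure in `ptrace.c`. The later "Unable to detach" message is logged while CRIU unseizes the tasks after that failure.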
ctrox commented 1 month ago

Thanks for the full CRIU log, that helps a lot.

> OS is Ubuntu 22.04.4 LTS in a proxmox LXC container

So the k3s node is running inside an LXC container? That might complicate things here but I'm not sure.

Did you by any chance configure a seccomp default profile that enables seccomp for all containers?

Regardless, can you try explicitly disabling seccomp for the nginx pod?

spec:
  template:
    spec:
      containers:
        - image: nginx
          name: nginx
          ports:
            - containerPort: 80
          # add this
          securityContext:
            seccompProfile:
              type: Unconfined

> criu check returns Error (criu/config.c:1031): Invalid value for --network-lock: skip, not sure if that's relevant

That is probably just because you ran an older version of criu from the OS which does not know about this option yet. You can run the check with the criu binary that zeropod installs like this:

LD_LIBRARY_PATH=/opt/zeropod/lib/ /opt/zeropod/bin/criu check
DragonHunter274 commented 1 month ago

Yes, the k3s node is running inside an LXC container.

Setting seccomp to Unconfined still results in the same error message, and criu check returns Looks good. EDIT: criu check --all returns:

Error (criu/cr-check.c:759): couldn't suspend seccomp: Operation not permitted
Error (criu/cr-check.c:802): Dumping seccomp filters not supported: Permission denied
Error (criu/tun.c:66): tun: Can't check tun support: No such file or directory
Warn  (criu/cr-check.c:1346): Nftables based locking requires libnftables and set concatenations support
Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.
ctrox commented 1 month ago

I'm pretty sure this is caused by LXC applying seccomp filters to all processes running within the container. CRIU does not have the ability to ignore seccomp filters during checkpoint/restore (https://github.com/checkpoint-restore/criu/issues/2143), so I'm afraid the only way to get zeropod running within that LXC container (barring other roadblocks) would be to disable seccomp for the LXC container itself. I have never really used or configured LXC before, but it looks like disabling seccomp is not that straightforward: https://lists.linuxcontainers.org/pipermail/lxc-users/2020-June/015265.html
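
One way to check whether a process is still running under a seccomp filter, regardless of the pod's securityContext, is to read the `Seccomp` field in `/proc/<pid>/status`. The following is a minimal Python sketch, assuming a Linux host. The helper name `seccomp_state` is hypothetical, and the `Seccomp_filters` field is only reported by newer kernels, so it is treated as optional here.

```python
import sys

# Seccomp mode values as reported in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_state(status_text):
    """Parse the text of /proc/<pid>/status into (mode_name, filter_count).

    mode_name is None if no Seccomp line is present; filter_count is None
    if the kernel does not report Seccomp_filters.
    """
    mode, filters = None, None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Seccomp":
            mode = SECCOMP_MODES.get(int(value), "unknown")
        elif key == "Seccomp_filters":
            filters = int(value)
    return mode, filters


if __name__ == "__main__":
    # Pass a PID as the first argument, or inspect this process by default.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/status") as f:
        print(seccomp_state(f.read()))
```

Run on the node with the PID of the nginx master process, a result of `filter` even with `seccompProfile: Unconfined` set on the pod would be consistent with a filter applied from outside the pod, for example by LXC.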

DragonHunter274 commented 1 month ago

I tried disabling seccomp by providing an empty blacklist as described in the link above, but it didn't change anything. I agree it's probably LXC doing something weird here.