Closed NoamNakash closed 3 weeks ago
Could you use strace to find out which syscall is failing?
There should not be any difference: the seccomp configuration comes from the container engine, and the OCI runtime (crun or runc) just applies it.
Sure, here is the strace output:
[pid 2029049] writev(1, [{iov_base="-rw-r--r-- 818/818 82 2"..., iov_len=69}, {iov_base="\n", iov_len=1}], 2) = 70
[pid 2029049] openat(4, "data/last_archive.log", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_CLOEXEC, 0644) = 5
[pid 2029049] write(5, "/CASSANDRA_DD/cassandra/data/arc"..., 82) = 82
[pid 2029049] fstat(5, {st_mode=S_IFREG|0644, st_size=82, ...}) = 0
[pid 2029049] utimensat(5, NULL, [{tv_sec=1730906704, tv_nsec=342102079} /* 2024-11-06T10:25:04.342102079-0500 */, {tv_sec=1730869218, tv_nsec=0} /* 2024-11-06T00:00:18-0500 */], 0) = 0
[pid 2029049] close(5) = 0
[pid 2029049] writev(1, [{iov_base="-rw-r--r-- 818/818 0 2"..., iov_len=88}, {iov_base="\n", iov_len=1}], 2) = 89
[pid 2029049] openat(4, "data/MyCenter__1730869213546__SCHEMA.cql", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_CLOEXEC, 0644) = 5
[pid 2029049] fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 2029049] utimensat(5, NULL, [{tv_sec=1730906704, tv_nsec=342102079} /* 2024-11-06T10:25:04.342102079-0500 */, {tv_sec=1730869220, tv_nsec=0} /* 2024-11-06T00:00:20-0500 */], 0) = 0
[pid 2029049] close(5) = 0
[pid 2029049] close(3) = 0
[pid 2029049] wait4(222, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 222
[pid 2029049] munmap(0x7f7e2b51c000, 16384) = 0
[pid 2029049] utimensat(4, "data", [UTIME_OMIT, {tv_sec=1730869220, tv_nsec=0} /* 2024-11-06T00:00:20-0500 */], AT_SYMLINK_NOFOLLOW) = 0
[pid 2029049] syscall_0x1c4(0x4, 0x7f7e2b528b40, 0x5fd, 0x100, 0x100, 0x7f7e2b528b40) = -1 EPERM (Operation not permitted)
[pid 2029049] fcntl(1, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 2029049] writev(2, [{iov_base="tar: ", iov_len=5}, {iov_base=NULL, iov_len=0}], 2) = 5
[pid 2029049] writev(2, [{iov_base="data: Cannot change mode to rwxr"..., iov_len=37}, {iov_base=NULL, iov_len=0}], 2) = 37
[pid 2029049] writev(2, [{iov_base=": Operation not permitted", iov_len=25}, {iov_base=NULL, iov_len=0}], 2) = 25
[pid 2029049] writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2) = 1
[pid 2029049] fcntl(1, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 2029049] writev(2, [{iov_base="tar: ", iov_len=5}, {iov_base=NULL, iov_len=0}], 2) = 5
[pid 2029049] writev(2, [{iov_base="Exiting with failure status due "..., iov_len=50}, {iov_base=NULL, iov_len=0}], 2) = 50
[pid 2029049] writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2) = 1
[pid 2029049] close(1) = 0
[pid 2029049] close(2) = 0
[pid 2029049] exit_group(2) = ?
The error seems to be
[pid 2029049] syscall_0x1c4(0x4, 0x7f7e2b528b40, 0x5fd, 0x100, 0x100, 0x7f7e2b528b40) = -1 EPERM (Operation not permitted)
strace does not recognize this syscall, so it prints only the raw number (0x1c4).
We confirmed with crictl inspect that both containers have the same seccomp profile, but with crun this call fails unless we change the seccompProfile type, which was never necessary with runc.
Hello @giuseppe, Thank you for checking the issue!
I think crun mounts the volume slightly differently than runc, and this interacts with tar in a way that breaks our general backup-restore procedures. (This is probably true for the rootfs too, since even the prompt differs between the two containers, which surprised me. That is not a problem in itself, but these are two containers with the same image and config, and the only difference is the runtime binary. Anyway, let's move on.)
I also managed to reproduce this with an empty directory and captured strace output of the tar and untar steps for both the runc and the crun cases. Please find the outputs below:
First the setup:
uname -a
Linux gate-fi607-03-controller-01.tesla.com 5.14.0-427.33.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 16 10:56:24 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="9.4 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux 9.4 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"
mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel)
crun --version
crun version 1.14.3
commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
rundir: /run/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
runc version 1.1.4
commit: v1.1.4-0-g5fd4c4d
spec: 1.0.2-dev
go: go1.18.7
libseccomp: 2.5.2
The container running in ccas1 namespace uses crun, the other one in ccas2 namespace uses runc.
The full output from the two containers follows. Information from the container in the crun (ccas1) case:
kubectl exec -it -n ccas1 ccas-apache-0 -c cbura-sidecar -- sh
/ $
/ $ tar --version
tar (GNU tar) 1.35
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
/ $ pwd
/
/ $ ls -la
total 76
drwxr-xr-x 1 root root 4096 Nov 6 12:54 .
drwxr-xr-x 1 root root 4096 Nov 6 12:54 ..
drwxr-xr-x 1 root root 4096 Aug 9 21:52 bin
drwxrwsr-x 2 root 818 4096 Nov 7 09:20 data
drwxr-xr-x 5 root root 360 Nov 6 12:54 dev
drwxr-xr-x 1 root root 4096 Nov 6 12:54 etc
drwxr-xr-x 1 root root 4096 Aug 9 21:52 home
drwxr-xr-x 1 root root 4096 Aug 9 21:52 lib
drwxr-xr-x 5 root root 4096 Jul 22 14:34 media
drwxr-xr-x 2 root root 4096 Jul 22 14:34 mnt
drwxr-xr-x 2 root root 4096 Jul 22 14:34 opt
dr-xr-xr-x 709 root root 0 Nov 6 12:54 proc
drwx------ 2 root root 4096 Jul 22 14:34 root
drwxr-xr-x 2 root root 4096 Jul 22 14:34 run
drwxr-xr-x 2 root root 4096 Jul 22 14:34 sbin
drwxr-xr-x 2 root root 4096 Jul 22 14:34 srv
dr-xr-xr-x 13 root root 0 Nov 3 11:40 sys
drwxrwsrwx 3 root 818 71 Nov 7 11:10 tmp
drwxr-xr-x 1 root root 4096 Aug 9 21:52 usr
drwxr-xr-x 1 root root 4096 Jul 22 14:34 var
/ $ id
uid=818 gid=818 groups=818
/ $ cd /tmp/
/tmp $ ls -lah /data
total 12K
drwxrwsr-x 2 root 818 4.0K Nov 7 09:20 .
drwxr-xr-x 1 root root 4.0K Nov 6 12:54 ..
/tmp $ mount
overlay on / type overlay (ro,context="system_u:object_r:container_file_t:s0:c841,c892",relatime,lowerdir=/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/234/fs:/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/233/fs,upperdir=/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/385/fs,workdir=/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/385/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,context="system_u:object_r:container_file_t:s0:c841,c892",nosuid,size=65536k,mode=755,inode64)
devpts on /dev/pts type devpts (rw,context="system_u:object_r:container_file_t:s0:c841,c892",nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
mqueue on /dev/mqueue type mqueue (rw,seclabel,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (ro,seclabel,nosuid,nodev,noexec,relatime)
cgroup2 on /sys/fs/cgroup type cgroup2 (ro,seclabel,nosuid,nodev,noexec,relatime)
/dev/vdn on /data type ext4 (rw,seclabel,relatime)
/dev/vda1 on /tmp type xfs (rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vdb on /etc/resolv.conf type ext4 (ro,seclabel,relatime)
/dev/vda1 on /etc/hosts type xfs (rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vda1 on /dev/termination-log type xfs (rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vdb on /etc/hostname type ext4 (ro,seclabel,relatime)
shm on /dev/shm type tmpfs (rw,seclabel,relatime,size=65536k,inode64)
/dev/vda1 on /etc/unified-logging/cpp-api type xfs (ro,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
tmpfs on /proc/acpi type tmpfs (ro,context="system_u:object_r:container_file_t:s0:c841,c892",relatime,size=0k,inode64)
devtmpfs on /proc/kcore type devtmpfs (ro,seclabel,size=4096k,nr_inodes=4099235,mode=755,inode64)
devtmpfs on /proc/keys type devtmpfs (ro,seclabel,size=4096k,nr_inodes=4099235,mode=755,inode64)
devtmpfs on /proc/timer_list type devtmpfs (ro,seclabel,size=4096k,nr_inodes=4099235,mode=755,inode64)
tmpfs on /proc/scsi type tmpfs (ro,context="system_u:object_r:container_file_t:s0:c841,c892",relatime,size=0k,inode64)
tmpfs on /sys/firmware type tmpfs (ro,context="system_u:object_r:container_file_t:s0:c841,c892",relatime,size=0k,inode64)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
Information from the container in the runc - ccas2 case:
kubectl exec -it -n ccas2 ccas-apache-0 -c cbura-sidecar -- sh
[root@gate-fi607-03-controller-01 cloud-admin]# kubectl exec -it -n ccas2 ccas-apache-0 -c cbura-sidecar -- sh
~ $
~ $ tar --version
tar (GNU tar) 1.35
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
~ $ ls -la
total 76
drwxr-xr-x 1 root root 4096 Nov 5 12:29 .
drwxr-xr-x 1 root root 4096 Nov 5 12:29 ..
drwxr-xr-x 1 root root 4096 Aug 9 21:52 bin
drwxrwsr-x 2 root 818 4096 Nov 7 09:19 data
drwxr-xr-x 5 root root 360 Nov 5 12:29 dev
drwxr-xr-x 1 root root 4096 Nov 5 12:29 etc
drwxr-xr-x 1 root root 4096 Aug 9 21:52 home
drwxr-xr-x 1 root root 4096 Aug 9 21:52 lib
drwxr-xr-x 5 root root 4096 Jul 22 14:34 media
drwxr-xr-x 2 root root 4096 Jul 22 14:34 mnt
drwxr-xr-x 2 root root 4096 Jul 22 14:34 opt
dr-xr-xr-x 713 root root 0 Nov 5 12:29 proc
drwx------ 2 root root 4096 Jul 22 14:34 root
drwxr-xr-x 2 root root 4096 Jul 22 14:34 run
drwxr-xr-x 2 root root 4096 Jul 22 14:34 sbin
drwxr-xr-x 2 root root 4096 Jul 22 14:34 srv
dr-xr-xr-x 13 root root 0 Nov 3 11:40 sys
drwxrwsrwx 3 root 818 71 Nov 7 11:09 tmp
drwxr-xr-x 1 root root 4096 Aug 9 21:52 usr
drwxr-xr-x 1 root root 4096 Jul 22 14:34 var
~ $ id
uid=818 gid=818 groups=818
~ $
/ $ cd /tmp/
/tmp $ ls -lah /data
total 12K
drwxrwsr-x 2 root 818 4.0K Nov 7 09:20 .
drwxr-xr-x 1 root root 4.0K Nov 6 12:54 ..
/tmp $ mount
overlay on / type overlay (ro,context="system_u:object_r:container_file_t:s0:c255,c548",relatime,lowerdir=/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/234/fs:/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/233/fs,upperdir=/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/358/fs,workdir=/data0/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/358/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,context="system_u:object_r:container_file_t:s0:c255,c548",nosuid,size=65536k,mode=755,inode64)
devpts on /dev/pts type devpts (rw,context="system_u:object_r:container_file_t:s0:c255,c548",nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
mqueue on /dev/mqueue type mqueue (rw,seclabel,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (ro,seclabel,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup type cgroup2 (ro,seclabel,nosuid,nodev,noexec,relatime)
/dev/vdm on /data type ext4 (rw,seclabel,relatime)
/dev/vda1 on /tmp type xfs (rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vdb on /etc/resolv.conf type ext4 (ro,seclabel,relatime)
/dev/vda1 on /etc/hosts type xfs (rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vda1 on /dev/termination-log type xfs (rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vdb on /etc/hostname type ext4 (ro,seclabel,relatime)
shm on /dev/shm type tmpfs (rw,seclabel,nosuid,nodev,noexec,relatime,size=65536k,inode64)
/dev/vda1 on /etc/unified-logging/cpp-api type xfs (ro,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/acpi type tmpfs (ro,context="system_u:object_r:container_file_t:s0:c255,c548",relatime,inode64)
tmpfs on /proc/kcore type tmpfs (rw,context="system_u:object_r:container_file_t:s0:c255,c548",nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/keys type tmpfs (rw,context="system_u:object_r:container_file_t:s0:c255,c548",nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/timer_list type tmpfs (rw,context="system_u:object_r:container_file_t:s0:c255,c548",nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/scsi type tmpfs (ro,context="system_u:object_r:container_file_t:s0:c255,c548",relatime,inode64)
tmpfs on /sys/firmware type tmpfs (ro,context="system_u:object_r:container_file_t:s0:c255,c548",relatime,inode64)
The interesting directory is /data with the permissions:
drwxrwsr-x 2 root 818 4096 Nov 7 09:19 data
It's empty for both of the tests.
The tar and untar output in the test cases: crun:
/tmp $ tar -p --use-compress-program="gzip -6" -cvf example-backup.tar.gz --exclude=/data/lost+found /data
tar: Removing leading `/' from member names
/data/
/tmp $ mkdir untar_dir
/tmp $ ls -la
total 16
drwxrwsrwx 4 root 818 64 Nov 8 08:45 .
drwxr-xr-x 1 root root 4096 Nov 6 12:54 ..
-rw-r--r-- 1 818 818 110 Nov 7 13:00 example-backup.tar.gz
drwxr-sr-x 3 818 818 18 Nov 7 13:17 untar_dir
/tmp $ tar -p --use-compress-program="gzip -d" -xvf example-backup.tar.gz -C untar_dir
data/
tar: data: Cannot change mode to rwxrwsr-x: Operation not permitted
tar: Exiting with failure status due to previous errors
/tmp $ cd untar_dir
/tmp/untar_dir $ ls -la
total 0
drwxr-sr-x 3 818 818 18 Nov 7 13:17 .
drwxrwsrwx 4 root 818 64 Nov 7 13:00 ..
drwx--S--- 2 818 818 6 Nov 7 09:20 data
runc:
/tmp $ ls -la
total 12
drwxrwsrwx 4 root 818 64 Nov 7 13:07 .
drwxr-xr-x 1 root root 4096 Nov 5 12:29 ..
drwxrwsr-x 3 818 818 142 Nov 7 12:24 cbur
-rw-r--r-- 1 818 818 110 Nov 7 13:07 example-backup.tar.gz
drwxr-sr-x 2 818 818 6 Nov 7 12:51 untar_dir
/tmp $ tar -p --use-compress-program="gzip -d" -xvf example-backup.tar.gz -C untar_dir
data/
/tmp $ cd untar_dir
/tmp/untar_dir $ ls -lah
total 0
drwxr-sr-x 3 818 818 18 Nov 7 13:13 .
drwxrwsrwx 4 root 818 64 Nov 7 13:07 ..
drwxrwsr-x 2 818 818 6 Nov 7 09:19 data
/tmp/untar_dir $
The output directory's permissions became
drwx--S--- 2 818 818 6 Nov 7 09:20 data
for the crun case and the tar command failed. Probably tar compressed the folder with bad permissions in the first place. What should I check next? Do you have any suggestions?
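The two mode strings are easier to compare as octal bits. The traces in this thread show tar creating the directory with mkdirat(..., 0700) and applying the final mode only afterwards, so drwx--S--- looks like the intermediate state left behind when that last step fails, not a mode stored in the archive. A small sketch (the describe helper is only illustrative):

```python
import stat

def describe(mode_bits: int) -> str:
    """Render directory permission bits the way `ls -l` prints them."""
    return stat.filemode(stat.S_IFDIR | mode_bits)

wanted = 0o2775  # mode tar tries to apply ("rwxrwsr-x")
left = 0o2700    # mode found after the failed extraction

print(describe(wanted))
print(describe(left))

# Bits that never got applied because the final chmod step failed.
missing = wanted & ~left
print(oct(missing))
```

The missing bits (group rwx and other r-x) are exactly what the failed mode change would have added.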
Attached strace outputs: crun-tar-output.txt crun-untar-output.txt runc-tar-output.txt runc-untar-output.txt
Oh, I realized that the strace output is quite telling.
Syscall 0x1c4 returns a different error in each of the two containers.
The strace output for the runc case:
791322 mkdirat(4, "data", 0700) = 0
791322 close(3) = 0
791322 wait4(293, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 293
791322 munmap(0x7f0e693db000, 16384) = 0
791322 utimensat(4, "data", [UTIME_OMIT, {tv_sec=1730971192, tv_nsec=0} /* 2024-11-07T04:19:52-0500 */], AT_SYMLINK_NOFOLLOW) = 0
791322 syscall_0x1c4(0x4, 0x7f0e693e7b40, 0x5fd, 0x100, 0x100, 0x7f0e693e7b40) = -1 ENOSYS (Function not implemented)
791322 newfstatat(4, "data", {st_mode=S_IFDIR|S_ISGID|0700, st_size=6, ...}, AT_SYMLINK_NOFOLLOW) = 0
791322 openat(4, "data", O_RDONLY|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 3
791322 stat("/proc/self/fd/3", {st_mode=S_IFDIR|S_ISGID|0700, st_size=6, ...}) = 0
791322 fchmodat(AT_FDCWD, "/proc/self/fd/3", 02775) = 0
791322 close(3) = 0
791322 close(1) = 0
791322 close(2) = 0
791322 exit_group(0) = ?
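The runc trace above shows the pattern that makes the difference: after ENOSYS the process opens the directory with O_PATH and calls fchmodat on /proc/self/fd/3, and the mode is applied. A minimal model of that decision, assuming the caller (libc or tar) falls back only on ENOSYS; the function names are hypothetical and nothing here issues a real syscall:

```python
import errno
import os

def failing(code):
    """Build a stand-in syscall that always fails with the given errno."""
    def call():
        raise OSError(code, os.strerror(code))
    return call

def chmod_nofollow(new_call, legacy_call):
    """Try the newer syscall; use the legacy path only on ENOSYS."""
    try:
        new_call()
        return "new syscall"
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # EPERM is treated as a real failure, no fallback
    legacy_call()
    return "legacy fallback"

applied = []  # records modes set through the legacy path

# runc-like case: the unknown syscall is answered with ENOSYS.
runc_like = chmod_nofollow(failing(errno.ENOSYS), lambda: applied.append(0o2775))

# crun-like case: the same syscall is answered with EPERM.
try:
    chmod_nofollow(failing(errno.EPERM), lambda: applied.append(0o2775))
    crun_like = "ok"
except OSError as exc:
    crun_like = errno.errorcode[exc.errno]

print(runc_like, crun_like)
```

In this model the legacy path runs once, for the ENOSYS case only, which matches the fchmodat call seen in the runc trace and its absence in the crun one.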
The crun one:
801373 mkdirat(4, "data", 0700) = 0
801373 close(3) = 0
801373 wait4(342, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 342
801373 munmap(0x7f2b4fc9b000, 16384) = 0
801373 utimensat(4, "data", [UTIME_OMIT, {tv_sec=1730971240, tv_nsec=0} /* 2024-11-07T04:20:40-0500 */], AT_SYMLINK_NOFOLLOW) = 0
801373 syscall_0x1c4(0x4, 0x7f2b4fca7b40, 0x5fd, 0x100, 0x100, 0x7f2b4fca7b40) = -1 EPERM (Operation not permitted)
801373 fcntl(1, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
801373 writev(2, [{iov_base="tar: ", iov_len=5}, {iov_base=NULL, iov_len=0}], 2) = 5
801373 writev(2, [{iov_base="data: Cannot change mode to rwxr"..., iov_len=37}, {iov_base=NULL, iov_len=0}], 2) = 37
801373 writev(2, [{iov_base=": Operation not permitted", iov_len=25}, {iov_base=NULL, iov_len=0}], 2) = 25
801373 writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2) = 1
801373 fcntl(1, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
801373 writev(2, [{iov_base="tar: ", iov_len=5}, {iov_base=NULL, iov_len=0}], 2) = 5
801373 writev(2, [{iov_base="Exiting with failure status due "..., iov_len=50}, {iov_base=NULL, iov_len=0}], 2) = 50
801373 writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2) = 1
801373 close(1) = 0
801373 close(2) = 0
801373 exit_group(2)
And also I realized that this syscall is not present on the system:
ausyscall --dump
Using x86_64 syscall table:
0 read
1 write
...
450 set_mempolicy_home_node
451 cachestat
ausyscall expects the number in decimal; 0x1c4 is 452, one past the last entry in the table above.
This also explains why the Unconfined seccompProfile solved it... But why does this differ between the two container runtimes?
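The raw arguments in the strace line can be decoded the same way. The sketch below only converts the numbers quoted in the trace. For context, upstream Linux assigns x86_64 syscall 452 to fchmodat2 (added in kernel 6.6), which fits these arguments; the thread itself does not name the syscall, so treat that reading as an assumption.

```python
# Values copied from the strace line:
# syscall_0x1c4(0x4, <path pointer>, 0x5fd, 0x100, ...)
AT_SYMLINK_NOFOLLOW = 0x100  # Linux value from <fcntl.h>

nr, dirfd, mode, flags = 0x1c4, 0x4, 0x5fd, 0x100
last_known = 451  # highest number listed by `ausyscall --dump` above

print(nr)                            # syscall number in decimal
print(oct(mode))                     # third argument as an octal mode
print(flags == AT_SYMLINK_NOFOLLOW)  # fourth argument is a known flag
print(nr > last_known)               # newer than anything in the table
```

The third argument is octal 2775, the same rwxrwsr-x that tar reports it cannot set.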
Ah, I think that is because runc "monkey patches" the seccomp profile to return ENOSYS by default for syscalls the profile does not know, while crun expects the profile to be correct and applies it as given.
I assume you are using containerd for your cluster? Because CRI-O would just specify ENOSYS as the default action for the seccomp profile: https://github.com/containers/common/blob/main/pkg/seccomp/seccomp.json#L4
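A simplified model of the three behaviors described here, assuming that runc's patch answers ENOSYS for any syscall number above the highest one named in the profile. That rule and the tiny allow-list are assumptions for illustration; real filters are BPF programs, not Python:

```python
import errno

def filter_result(nr, allowed, default_errno, enosys_above=None):
    """Return 0 if the syscall is allowed, else the errno the filter injects."""
    if nr in allowed:
        return 0
    if enosys_above is not None and nr > enosys_above:
        return errno.ENOSYS  # "too new for this profile" stub
    return default_errno

allowed = {0, 1, 257, 268}  # hypothetical allow-list
newest_known = 451          # assumed highest number the profile mentions
nr = 0x1c4                  # the syscall from the traces (452)

crun_like = filter_result(nr, allowed, errno.EPERM)
runc_like = filter_result(nr, allowed, errno.EPERM, enosys_above=newest_known)
crio_like = filter_result(nr, allowed, errno.ENOSYS)

for name, res in [("crun", crun_like), ("runc", runc_like), ("cri-o", crio_like)]:
    print(name, errno.errorcode[res])
```

Only the first case yields EPERM, which the caller treats as a hard failure.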
Yes, we are using containerd!
Thank you @giuseppe, I think we can close this one then. Do you happen to know whether we can configure this for containerd too? I found that the default action is errno: https://github.com/containerd/containerd/blob/f0a32c66dad1e9de716c9960af806105d691cd78/contrib/seccomp/seccomp_default.go#L456 But I could not find a corresponding setting in any of the configs.
Yes, I think we can close it, as it is a known difference. I disagree with the way runc does it, and I would rather not change crun, which simply expects the correct configuration to be passed in.
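One possible direction, sketched under assumptions: the OCI runtime spec defines an optional defaultErrnoRet field next to defaultAction in the seccomp section, so a custom profile could request ENOSYS in place of the default EPERM. Whether a given containerd version passes this field through, and how the profile would be attached (for example as a Localhost seccompProfile), is not verified here. The base profile below is a placeholder, not a usable profile.

```python
import errno
import json

def with_enosys_default(profile: dict) -> dict:
    """Return a copy of an OCI-style seccomp profile whose default errno is ENOSYS."""
    patched = dict(profile)
    patched["defaultErrnoRet"] = errno.ENOSYS
    return patched

# Placeholder: a real profile would carry the full allow-list in "syscalls".
base = {"defaultAction": "SCMP_ACT_ERRNO", "syscalls": []}

patched = with_enosys_default(base)
print(json.dumps(patched, indent=2))
```

The helper copies the input, so the original profile stays unchanged and can be kept for comparison.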
Hello
We are trying to replace runc with crun in our k8s clusters, and in our tests we encountered a permission issue with an internal backup and restore process of our products.
The specific error encountered was with a sidecar container attempting to untar an archive onto shared storage between the main and sidecar containers during restore:
'tar: data: Cannot change mode to rwxrwsr-x: Operation not permitted'
This seems to arise from running this command inside the sidecar:
tar -p --use-compress-program=\"gzip -d\" -xvf 20241028102632_LOCAL_ccas_ccas-apache-0_volume.tar.gz -C untar_dir
We have confirmed that the issue is not related to SELinux, that the data directory already has rwxrwsr-x permissions, and that both pods have the same runAsUser and runAsGroup values (818).
We are not running rootless containers, and we confirmed that run.oci.keep_original_groups=1 does not resolve the issue.
After further investigation, we found that after changing the seccompProfile.type value from RuntimeDefault to Unconfined, the restore process completes successfully.
Is there an intended difference in how crun and runc handle the RuntimeDefault seccompProfile.type? Is this expected behavior in that case?
Here is the relevant statefulset