ocaml / infrastructure

Wiki to hold the information about the machine resources available to OCaml.org

Unable to extract tar files on Ubuntu Noble #121

Closed mtelvers closed 1 month ago

mtelvers commented 1 month ago

On POWER9 and RISCV, we are unable to extract files from a tar file when using an Ubuntu Noble container. Running the same commands using ubuntu:jammy works fine.

root@orithia:~# uname -a
Linux orithia 5.15.0-107-generic #117-Ubuntu SMP Fri Apr 26 12:26:49 UTC 2024 ppc64le ppc64le ppc64le GNU/Linux
root@orithia:~# docker run --rm -it ubuntu:noble
root@cf3491db4abd:/# cd  
root@cf3491db4abd:~# mkdir foo 
root@cf3491db4abd:~# tar -cf bar.tar foo
root@cf3491db4abd:~# rmdir foo
root@cf3491db4abd:~# tar -xf bar.tar 
tar: foo: Cannot change mode to rwxr-xr-x: Operation not permitted
tar: Exiting with failure status due to previous errors
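
To compare images without retyping the steps, the reproduction above could be wrapped in a small function. This is only a sketch: the function name tar_roundtrip and the file name check.sh are made up for illustration, and it assumes mktemp and tar are present in the image.

```shell
#!/bin/sh
# Round-trip an empty directory through tar in a scratch directory.
# Prints PASS if extraction succeeds, FAIL (and returns 1) otherwise.
tar_roundtrip() {
    workdir=$(mktemp -d) || return 2
    (
        cd "$workdir" &&
        mkdir foo &&
        tar -cf bar.tar foo &&
        rmdir foo &&
        tar -xf bar.tar 2>/dev/null
    )
    status=$?
    rm -rf "$workdir"
    if [ "$status" -eq 0 ]; then
        echo PASS
    else
        echo FAIL
        return 1
    fi
}

# Possible usage, with the function saved as check.sh (hypothetical name):
#   docker run --rm -v "$PWD/check.sh:/check.sh:ro" ubuntu:noble \
#       sh -c '. /check.sh && tar_roundtrip'
```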

This issue prevents the merging of https://github.com/ocurrent/docker-base-images/pull/275, and the RISCV64 image builder here: https://github.com/mtelvers/docker-base-images/pull/1.

Running strace shows the issue.

root@orithia:~# docker run --rm -it --cap-add=SYS_PTRACE ubuntu:noble
root@50b58f5cd463:/# apt update && apt install strace -y
root@50b58f5cd463:/# cd  
root@50b58f5cd463:~# mkdir foo 
root@50b58f5cd463:~# tar -cf bar.tar foo
root@50b58f5cd463:~# rmdir foo
root@50b58f5cd463:~# strace -f tar -xf bar.tar

The relevant part of the output shows fchmodat2 returned EPERM.

mkdirat(AT_FDCWD, "foo", 0700)          = 0
close(3)                                = 0
utimensat(AT_FDCWD, "foo", [UTIME_OMIT, {tv_sec=1716452406, tv_nsec=0} /* 2024-05-23T08:20:06+0000 */], AT_SYMLINK_NOFOLLOW) = 0
fchownat(AT_FDCWD, "foo", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
fchmodat2(AT_FDCWD, "foo", 0755, AT_SYMLINK_NOFOLLOW) = -1 EPERM (Operation not permitted)
write(2, "tar: ", 5tar: )                    = 5
write(2, "foo: Cannot change mode to rwxr-"..., 36foo: Cannot change mode to rwxr-xr-x) = 36
write(2, ": Operation not permitted", 25: Operation not permitted) = 25
write(2, "\n", 1
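
In a longer trace the failing call is easy to miss. A rough filter along these lines can list the syscalls that returned a given errno. It assumes the trace was saved to a file (for example with strace -f -o trace.log, where trace.log is an arbitrary name), and failed_syscalls is a helper invented here, not part of strace.

```shell
#!/bin/sh
# List the syscalls in a strace log that failed with a given errno.
# $1: errno name, e.g. EPERM   $2: path to the saved strace output
failed_syscalls() {
    errno_name=$1
    log=$2
    # Strip a leading PID ("1234 " or "[pid 1234] ") if strace added one,
    # keep lines ending in "= -1 <errno> (...)", then reduce each line
    # to the syscall name and de-duplicate.
    sed -E 's/^(\[pid +[0-9]+\] |[0-9]+ +)//' "$log" |
        grep -E "= -1 ${errno_name} \(" |
        sed -E 's/^([a-z_0-9]+)\(.*/\1/' |
        sort -u
}

# Possible usage:
#   strace -f -o trace.log tar -xf bar.tar
#   failed_syscalls EPERM trace.log
```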

If you run strace without Docker you see a different behaviour:

mkdirat(AT_FDCWD, "foo", 0700)          = 0
close(3)                                = 0
utimensat(AT_FDCWD, "foo", [UTIME_OMIT, {tv_sec=1716452893, tv_nsec=0} /* 2024-05-23T08:28:13+0000 */], AT_SYMLINK_NOFOLLOW) = 0
fchownat(AT_FDCWD, "foo", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "foo", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0700, st_size=4096, ...}, AT_EMPTY_PATH) = 0
chmod("/proc/self/fd/3", 0755)          = 0

The problem can be attributed to Docker's seccomp profile. A quick workaround is to invoke Docker without a seccomp profile: docker run --rm -it --security-opt seccomp=unconfined ubuntu:noble. With no profile, fchmodat2 returns ENOSYS, the fallback path (openat followed by chmod on /proc/self/fd/3) is taken, and tar works correctly.

mkdirat(AT_FDCWD, "foo", 0700)          = 0
close(3)                                = 0
utimensat(AT_FDCWD, "foo", [UTIME_OMIT, {tv_sec=1716454178, tv_nsec=0} /* 2024-05-23T08:49:38+0000 */], AT_SYMLINK_NOFOLLOW) = 0
fchownat(AT_FDCWD, "foo", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
fchmodat2(AT_FDCWD, "foo", 0755, AT_SYMLINK_NOFOLLOW) = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "foo", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0700, st_size=4096, ...}, AT_EMPTY_PATH) = 0
chmod("/proc/self/fd/3", 0755)          = 0

Armed with this investigation, we found lots of related posts.

Ultimately, the comment from here

since we're seeing this also on ppc64le -- can we just update the default profile to include fchmodat2 ? It looks like this one just recently got picked up: https://github.com/seccomp/libseccomp/issues/406#issuecomment-1836895522

and the reply, give us a solution.

It's already in the profile (for engine v25.0.3 and up); https://github.com/moby/moby/pull/47344

Viz., the developers have resolved the problem with the release of libseccomp v2.5.5 and Docker 25.0.3+. However, on Ubuntu Noble, we have the right version of libseccomp, 2.5.5-1ubuntu3, but only Docker 24.0.7-0ubuntu4.

We need to run Docker 24.0.7 with the updated seccomp profile from Docker 25.0.3:-

curl https://raw.githubusercontent.com/moby/moby/master/profiles/seccomp/default.json -o /etc/docker/profile.json
sed -i '/^ExecStart=/ s/$/ --seccomp-profile \/etc\/docker\/profile.json/' /lib/systemd/system/docker.*
systemctl daemon-reload
service docker restart
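
Before restarting the daemon it may be worth confirming that the downloaded profile actually names fchmodat2. The helper below is a sketch with an invented name, profile_mentions_syscall. It is a plain text match on the quoted syscall name, so it says nothing about the action or conditions attached to the rule.

```shell
#!/bin/sh
# Succeeds if the seccomp profile at $1 mentions syscall $2 by name.
# Textual check only: the surrounding quotes make "fchmodat" and
# "fchmodat2" distinct, but the rule's action is not evaluated.
profile_mentions_syscall() {
    profile=$1
    syscall=$2
    grep -q "\"${syscall}\"" "$profile"
}

# Possible usage:
#   profile_mentions_syscall /etc/docker/profile.json fchmodat2 \
#       && echo "fchmodat2 is listed"
```
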

mtelvers commented 1 month ago

Celebrations may be premature, as Docker BuildKit ignores the seccomp profile set on the daemon and always uses the default profile. Running the steps manually via docker run now works, but docker build . does not, unless we run DOCKER_BUILDKIT=0 docker build .

shonfeder commented 1 month ago

We happened to talk thru this just now. Sharing the upshot, mostly just notes on stuff @mtelvers explained to me:

Along the way we also looked at Dockerfile_opam.Distro.distro_arched and concluded that this is definitely not the place to remove support for riscv (if we need to), because -- in our reading -- the intended semantics of that function is based on the architectures supported by the distro, without regard to the particular build of docker you happen to be running to build. So if (2) proves necessary, we will filter out the architecture in the build pipeline.

mtelvers commented 1 month ago

Docker's default build, make build, uses Docker and does not support RISCV. However, scripts are provided to build Docker from scratch. In a minimal set of steps:-

apt install golang
git clone https://github.com/moby/moby
cd moby
AUTO_GOPATH=1 ./hack/make.sh binary
mv bundles/binary-daemon/* /usr/bin/
service docker restart

Upgrade libseccomp2 to >= 2.5.5 and Docker to >= 25.0.3.
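
A host can be checked against those two minimums with a small version comparison. The sketch below assumes GNU sort (for the -V option), and version_ge is a name chosen here for illustration. Distribution suffixes such as -1ubuntu3 are stripped in the usage comments before comparing.

```shell
#!/bin/sh
# Succeeds if version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on a Debian/Ubuntu host:
#   docker_ver=$(docker version --format '{{.Server.Version}}')
#   version_ge "${docker_ver%%-*}" 25.0.3 || echo "Docker older than 25.0.3"
#   seccomp_ver=$(dpkg-query -W -f='${Version}' libseccomp2)
#   version_ge "${seccomp_ver%%-*}" 2.5.5 || echo "libseccomp2 older than 2.5.5"
```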