nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0

Pulling image `gcr.io/cloud-foundation-cicd/cft/developer-tools` inside the Sysbox container fails #443

Closed jawnsy closed 2 years ago

jawnsy commented 2 years ago

I'm seeing this error, unsure whether the problem is in sysbox or something else. Sharing details here to triage.

The Google Project Factory Terraform module has a lint step, which runs a public container image to generate docs and do code formatting. This errors out, and I'm not too sure why.

$ make docker_generate_docs
docker run --rm -it \
        -v "/home/coder/projects/terraform-google-project-factory":/workspace \
        gcr.io/cloud-foundation-cicd/cft/developer-tools:1 \
        /bin/bash -c 'source /usr/local/bin/task_helper_functions.sh && generate_docs'
Unable to find image 'gcr.io/cloud-foundation-cicd/cft/developer-tools:1' locally
1: Pulling from cloud-foundation-cicd/cft/developer-tools
9d48c3bd43c5: Pulling fs layer 
9ce9598067e7: Pulling fs layer 
278f4c997324: Pulling fs layer 
bfca09e5fd9a: Waiting 
2612f15b9d22: Waiting 
fb7eea9f0ca5: Waiting 
bcf9d515a31e: Waiting 
c8758f08eca4: Waiting 
22305b9467e5: Waiting 
848ebb935f8a: Waiting 
ded95484e2d0: Waiting 
8ac0ddd72853: Pulling fs layer 
73a0b2e2eaa1: Waiting 
cd161d4c1a08: Waiting 
0f97353a7dfc: Waiting 
3e26cd0e2056: Waiting 
3cb4f5360d60: Waiting 
a42d80540c28: Waiting 
3de689ae0aca: Pulling fs layer 
9854f25cd371: Waiting 
cc4b7291d29e: Waiting 
9c083a6ba0a2: Waiting 
433cb2a63bea: Waiting 
f8125bca9ea8: Pulling fs layer 
9a1bc2f6c7e0: Waiting 
cd161d4c1a08: Extracting [==================================================>]  91.38MB/91.38MB
ae438c07c5e8: Download complete 
ccbf02582392: Download complete 
6a085c416fe6: Download complete 
deae6149b2f9: Download complete 
5a72fbe02eb5: Download complete 
e142143a941e: Download complete 
900c99f9fa3d: Download complete 
8478c1e94908: Download complete 
387a27ed1f91: Download complete 
360c6c6fb5d0: Download complete 
ef03a82b200c: Download complete 
3e99f74a1451: Download complete 
5730ee6db383: Download complete 
508efcc2580e: Download complete 
cc12d5309308: Download complete 
6d5efbc40de7: Download complete 
af4ca2898db2: Download complete 
7871a21c057f: Download complete 
354a081d58d0: Download complete 
5529fa739802: Download complete 
cb38c93fc9de: Download complete 
c2954cb4f443: Download complete 
6d2b501d36fa: Download complete 
7d8b06dfdba5: Download complete 
23541b6e9db5: Download complete 
a0d1957600ba: Download complete 
fe33f1b95794: Download complete 
c80a35f84b9a: Download complete 
b0b2ab940d36: Download complete 
eecd91f90b33: Download complete 
de0a7a848979: Download complete 
f701394f6945: Download complete 
docker: failed to register layer: ApplyLayer exit status 1 stdout:  stderr: lchown /build/terraform-validator: invalid argument.
See 'docker run --help'.
make: *** [Makefile:92: docker_generate_docs] Error 125

Here's my `docker system info` output. Note that I'm running Docker under Sysbox in Coder:

$ docker system info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.3-docker)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.11
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.0-1051-gke
 Operating System: Ubuntu 20.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 251.9GiB
 Name: jawnsy-m
 ID: OTN2:D3ZN:5YUY:B5IA:DJJP:ZPDX:VDX3:JOCY:XCAC:QO3M:PZDO:4JW5
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: jawnsy
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://mirror.gcr.io/
 Live Restore Enabled: false

WARNING: No swap limit support

Available tags

Here are some of the visible tags:

$ gcloud container images list-tags gcr.io/cloud-foundation-cicd/cft/developer-tools
DIGEST        TAGS           TIMESTAMP
9d5f52f98a91  1,1.2,1.2.3    2021-10-20T01:59:21
5e370ded8ecf  1.1,1.1.3      2021-09-10T04:52:25
f38d398dc335  1.1.0          2021-08-25T20:42:45
2f153d182edf  1.0,1.0.16     2021-08-25T12:08:38
932a30ca0e49  1.0.15         2021-08-19T02:30:06
f524e6af4335  1.0.14         2021-08-18T06:18:08
f337ceed9a21  1.0.12         2021-08-05T04:12:23
82b471eedc70  1.0.11         2021-08-04T06:40:44
b5c705a5ceaa  1.0.10         2021-07-28T18:07:21
317e07e4fd82                 2021-07-28T03:02:34
77728d774b1a                 2021-07-28T02:38:05
71dece4aacc6  1.0.9          2021-07-22T05:19:55
76884fd406e7  1.0.8          2021-07-21T02:50:15
799e4d3a1ea3  1.0.7          2021-07-14T04:53:28
f4c814928238  1.0.6          2021-07-13T15:45:19
8fadfddda6b8                 2021-07-13T04:13:31
75b80a78227f                 2021-07-08T02:26:16
22113ceabae0  1.0.5          2021-06-30T03:02:07
76fbb1ec4152  1.0.4          2021-06-25T02:24:39
71f0f4a1f0dc  1.0.3          2021-06-23T07:01:41
dc4ffe9e8f8a  1.0.2          2021-06-16T03:13:22
9c8b001da390  1.0.1          2021-06-11T20:06:58
13347ad2cdfc  1.0.0          2021-06-11T20:03:46

Notes

ctalledo commented 2 years ago

Hi @jawnsy , I was able to repro easily with:

$ docker run --runtime=sysbox-runc -it --rm nestybox/ubuntu-focal-systemd-docker

# Inside the container:

root@85d93eb89f98:~# docker pull gcr.io/cloud-foundation-cicd/cft/developer-tools:1
1: Pulling from cloud-foundation-cicd/cft/developer-tools
9d48c3bd43c5: Pull complete 
9ce9598067e7: Pull complete 
278f4c997324: Pull complete 
...
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: lchown /build/terraform-validator: invalid argument

Will take a look to see what's going on ...

ctalledo commented 2 years ago

FYI: possible duplicate of issue #187, but will investigate further to confirm.

ctalledo commented 2 years ago

Update: I straced the docker pull that fails, and it fails here:

5078  fchownat(AT_FDCWD, "/build/terraform-validator", 806984, 89939, AT_SYMLINK_NOFOLLOW <unfinished ...>
5078  <... fchownat resumed>)           = -1 EINVAL (Invalid argument)

In contrast, when the docker pull is done outside a Sysbox container, that same instruction works:

8621410:203602 fchownat(AT_FDCWD, "/build/terraform-validator", 806984, 89939, AT_SYMLINK_NOFOLLOW <unfinished ...>
8621421:203602 <... fchownat resumed>)          = 0

I don't see mknod (or the lack of it) as causing the problem, so the error looks different from issue #187.

ctalledo commented 2 years ago

I see the problem. In the fchownat syscall:

fchownat(AT_FDCWD, "/build/terraform-validator", 806984, 89939, AT_SYMLINK_NOFOLLOW <unfinished ...>

the 3rd and 4th params are the uid and gid. These look totally incorrect (they should probably have been set to 0:0 instead).

When running inside a Sysbox container, the user-IDs have a range of 65536, so I suspect the chown to a uid:gid outside this range is causing the kernel to return EINVAL.

For this same reason Podman + rootless also fails:

Error processing tar file(exit status 1): potentially insufficient UIDs or GIDs available in user namespace (requested 806984:89939 for /build/terraform-validator): Check /etc/subuid and /etc/subgid: lchown /build/terraform-validator: invalid argument

At host level, the user-IDs have a range of 2^32, so this is not a problem.
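As a rough illustration of the range check described above (a sketch, not Sysbox or kernel code), the following parses text in the `/proc/<pid>/uid_map` format and tests whether an ID is mapped. The container mapping's host offset (165536) is a made-up value for illustration; only the range size of 65536 comes from this thread.

```python
def parse_id_map(text):
    """Parse uid_map-style text into (inside_start, outside_start, count) tuples."""
    ranges = []
    for line in text.strip().splitlines():
        inside, outside, count = (int(field) for field in line.split())
        ranges.append((inside, outside, count))
    return ranges


def is_mapped(ranges, ident):
    """Return True if `ident` falls inside any range mapped into the namespace."""
    return any(start <= ident < start + count for start, _, count in ranges)


# Hypothetical mapping for a container with 65536 IDs (the host offset is an assumption).
CONTAINER_MAP = "0 165536 65536"
# Mapping typically reported in the initial (host) user namespace.
HOST_MAP = "0 0 4294967295"

for label, id_map in (("container", CONTAINER_MAP), ("host", HOST_MAP)):
    ranges = parse_id_map(id_map)
    # Check the uid and gid seen in the strace output.
    print(label, is_mapped(ranges, 806984), is_mapped(ranges, 89939))
```

With these assumed mappings, neither 806984 nor 89939 is mapped inside the container, while both are mapped at host level, which is consistent with the EINVAL seen only inside the container.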

I think we can fix this in Sysbox, by ensuring that chowns that exceed 65535 are capped at user ID 65534 (i.e., nobody).
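A minimal sketch of what such capping could look like, purely as an illustration of the idea and not how Sysbox implements (or would implement) it. The function name `clamp_owner` is hypothetical, and the sketch assumes the conventional Linux overflow ID 65534 for nobody.

```python
ID_RANGE = 65536      # number of IDs available inside the container (from this thread)
OVERFLOW_ID = 65534   # assumption: conventional Linux overflow ID ("nobody")


def clamp_owner(uid, gid, id_range=ID_RANGE, overflow_id=OVERFLOW_ID):
    """Return (uid, gid) with any ID outside [0, id_range) replaced by overflow_id."""
    def clamp(ident):
        if ident == -1:
            # In chown(2), -1 means "leave this ID unchanged", so pass it through.
            return ident
        return ident if 0 <= ident < id_range else overflow_id
    return clamp(uid), clamp(gid)


print(clamp_owner(806984, 89939))  # both IDs fall outside the container's range
print(clamp_owner(0, 0))           # in-range IDs pass through unchanged
```

A real fix would have to apply this rewrite while intercepting the syscall, which is where the cost lies.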

jawnsy commented 2 years ago

@ctalledo Do you have any idea why this fails even before the container is created, at pull time? Is this because the new image was built with a new version of docker?

The image in question doesn't seem to be doing anything particularly special; it's just extracting the binary. But it's possible that images built with an older docker version work fine, and images built with a new one result in this chown happening at pull time, and thus failing?

Is this a bug in docker or runc somewhere? I imagine the former, since it happens at pull time?

ctalledo commented 2 years ago

Hi @jawnsy,

> Do you have any idea why this fails even before the container is created, at pull time?

During the pull, Docker extracts the layers that make up the image. It is during that extraction that we see the fchownat() syscall with weird uid:gid (i.e., 806984:89939).

I don't know where these weird uid:gid values come from; they certainly look incorrect. I don't know if they come from the image layers themselves or if it's a bug in Docker's image extraction code. I suspect it's the former.
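One way to check whether the IDs are recorded in a layer itself would be to scan the layer tarball for entries whose owner falls outside the container's ID range. The sketch below does that on a small in-memory tar built to mimic the ownership seen in the strace output. The helper names and the second file entry are hypothetical, and the real layer was not inspected here.

```python
import io
import tarfile


def find_unmappable_entries(layer_fileobj, id_range=65536):
    """Yield (name, uid, gid) for tar entries whose owner falls outside [0, id_range)."""
    with tarfile.open(fileobj=layer_fileobj, mode="r:*") as layer:
        for member in layer:
            if member.uid >= id_range or member.gid >= id_range:
                yield member.name, member.uid, member.gid


def build_sample_layer():
    """Build a tiny in-memory tar that mimics the ownership seen in the strace output."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as layer:
        for name, uid, gid in (("build/terraform-validator", 806984, 89939),
                               ("build/other-file", 0, 0)):  # second entry is made up
            info = tarfile.TarInfo(name)
            info.uid, info.gid = uid, gid
            layer.addfile(info)  # zero-length file, so no data stream is needed
    buffer.seek(0)
    return buffer


for entry in find_unmappable_entries(build_sample_layer()):
    print(entry)
```

For a real image, the same function could presumably be pointed at the layer tarballs contained in a `docker save` archive; the exact layout of that archive varies between Docker versions.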

> Is this because the new image was built with a new version of docker?

Don't know.

> Is this a bug in docker or runc somewhere? I imagine the former, since it happens at pull time?

runc is not involved at this stage, so likely it's a problem in the image itself or in the Docker extraction code as I mentioned above.

ctalledo commented 2 years ago

> I think we can fix this in Sysbox, by ensuring that chowns that exceed 65535 are capped at user ID 65534 (i.e., nobody).

It's possible to fix this in Sysbox, but it requires trapping the chown syscall (which Sysbox currently does for a different reason). We've learned that this can result in bad performance when programs inside the Sysbox container do lots of chowns.

In addition, given that this problem appears to be specific to pulling the gcr.io/cloud-foundation-cicd/cft/developer-tools:1 image inside the sysbox container (we have no other reports of such an error), I am wondering if it's worth fixing ...

jawnsy commented 2 years ago

I don't know whether the problem is with the image itself or with the tools used to build it, and if the latter, then other images might be affected too. I agree that it may not make sense to fix this, unless it turns out to be a more pervasive problem than just this single image. I'm fine with closing this issue out.

Thanks for taking a look!

ctalledo commented 2 years ago

Thanks @jawnsy. Let's keep the issue open in case someone else hits it, and in case we find a way to fix it without impacting performance. Won't attempt to fix it now though, unless as you mentioned it turns out to be a more pervasive problem (there is no evidence of that currently).

ctalledo commented 2 years ago

Closing as there is no action item to fix this (and in fact the problem is specific to a particular image).