gardenlinux / gardenlinux

Garden Linux - The best Linux for Gardener nodes!
https://gardenlinux.io
MIT License

make kvm-dev results in make: *** [Makefile:138: kvm-dev] Error 1 #1097

Closed morbitzer closed 2 years ago

morbitzer commented 2 years ago

What happened:

make kvm-dev fails with make: *** [Makefile:138: kvm-dev] Error 1. The created rootfs.raw is not copied out of the container.

[...]
The partition table has been altered.
Syncing disks.
1+0 records in
1+0 records out
440 bytes copied, 2.2271e-05 s, 19.8 MB/s
131072+0 records in
131072+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.0814029 s, 824 MB/s
1077248+0 records in
1077248+0 records out
551550976 bytes (552 MB, 526 MiB) copied, 0.644751 s, 855 MB/s
186368+0 records in
186368+0 records out
95420416 bytes (95 MB, 91 MiB) copied, 0.119208 s, 800 MB/s
a6552eecd2c04559cff5ce5a83b01729d2e5978e74698c2848f7952f6a5c036b  output/kvm_dev-amd64-dev-local/rootfs.raw
make: *** [Makefile:138: kvm-dev] Error 1

What you expected to happen:

No Error.

How to reproduce it (as minimally and precisely as possible):

Clone the repository and run make kvm-dev.

Anything else we need to know:

Earlier, I was trying to add a new feature. This worked two weeks ago, but it now fails with Error 1, and the error also occurs when I run make kvm-dev in a new, unmodified clone of the repository. I tried deleting all Docker images, but that didn't help.

Adding apparmor=1 security=apparmor to the kernel command line, as described in https://github.com/gardenlinux/gardenlinux/blob/main/docs/build/troubleshooting.md did not help either.

I also applied PR #1015, hoping it would fix the problem, but the error occurs with and without it.

Unfortunately, I am not able to determine what exactly causes the error. I tried to build with --debug, but that does not produce any further information either. When using the --manual flag, the build seems to work, and the error only occurs when exiting the container.

Environment:

docker version
Client: Docker Engine - Community
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.16.12
 Git commit:        e91ed57
 Built:             Mon Dec 13 11:45:48 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:56 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.5.9
  GitCommit:        1407cab509ff0d96baa4f0eb6ff9980270e6e620
 runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)
  scan: Docker Scan (Docker Inc., v0.12.0)

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 22
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1407cab509ff0d96baa4f0eb6ff9980270e6e620
 runc version: v1.0.3-0-gf46b6ba2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.14.0-rc2-sev-snp-part2-v6
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 48
 Total Memory: 250.6GiB
 Name: epyc-snp-03
 ID: OV7D:QUFV:73DV:3SGC:QSSI:MZYI:ZILE:MTXY:A5VM:UFDS:XH3O:PHHV
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Vincinator commented 2 years ago

Hi @morbitzer, thanks for the issue description.

I could not reproduce the issue on my test machine. I tested with commit caa5163b (currently the latest commit on the main branch). Could you also attach the full build log? A log file should be available in the <repo>/.build/ folder. I think make kvm-dev > debug.log 2>&1 should also be fine.
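To keep the complete output together with the exit status, the redirection suggested above can be wrapped in a small helper. This is only a sketch: run_logged is a hypothetical name, not part of the Garden Linux tooling, and debug.log is simply the file name used above.

```shell
#!/bin/sh
# Run a command, write its stdout and stderr to one log file, and
# return the command's exit status so a failure is still visible.
# run_logged is a hypothetical helper, not part of the repository.
run_logged() {
    log_file=$1
    shift
    "$@" > "$log_file" 2>&1
    status=$?
    # Record the status at the end of the log for later reference.
    echo "exit status: $status" >> "$log_file"
    return "$status"
}

# Usage from the repository root (not executed here):
#   run_logged debug.log make kvm-dev
```

The resulting debug.log then contains both output streams in order, plus the status that make reported.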

Thanks, Vincent

gyptazy commented 2 years ago

Hey @morbitzer,

thanks for creating an issue. Unfortunately, as @Vincinator already mentioned, I was unable to reproduce this issue either. Besides this, our GitHub pipeline doesn't encounter any issues here (based on the last merged commit). However, you could debug this by adding set -x to build.sh, bin/garden-build, bin/garden-chroot, bin/garden-init (and any other files that are invoked during the build and that you changed).

Unfortunately, I couldn't find any branch from you. If it's OK with you, you could push it so we can have a look. But it's weird that you still have issues on a fresh repo clone.

Edit: I just saw that you're using Docker instead of Podman. Unfortunately, I was also unable to reproduce this with Docker as the CRE. Maybe you could give it another try with Podman. When building with Docker, is export GARDENLINUX_BUILD_CRE=docker set? Is your git checkout located on a shared volume, or shared with other tools such as Vagrant? I think adding set -x is probably the fastest way to find the issue.

Regards, gyptazy

morbitzer commented 2 years ago

It turned out that I was accidentally not working on the latest commit: I was using commit 118d3b65666dd18e7d01a2a9f0b2d45d4de46862. This commit worked for me two weeks ago and no longer does. I suspect a system update is the reason, but I'm not sure. Commit 3f532ab2e984696b4b77cec184efc63d046a4b03 didn't fix the issue, so it seems to be a different issue that, like #1014, is related to system updates. However, the good news is that with the latest commit (f111b51fb243f9e41d8e1dcfe6b2f2b862135f33), the error is gone. Using git bisect, I was able to figure out that ca5ef159fbf27e02181f54e539452c45b8849e9e fixes the error. Yet, as this is quite a large commit and the error is resolved, I didn't dive into what exactly fixed it.
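For anyone who wants to repeat the search for a fixing commit, here is a minimal sketch of one way to do it with git bisect. It is not necessarily how the bisect above was run. find_fixing_commit is a hypothetical helper name, and using make kvm-dev as the check is an assumption (a real build may need cleanup between runs). The custom terms are there because the usual good/bad wording is inverted when the older commit is the broken one.

```shell
#!/bin/sh
# Find the first commit at which a check command starts to succeed.
# Usage: find_fixing_commit <broken-commit> <fixed-commit> <check command...>
# find_fixing_commit is a hypothetical helper, not part of the repository.
find_fixing_commit() {
    broken=$1
    fixed=$2
    shift 2
    # Custom terms: "broken" plays the role of old, "fixed" of new.
    git bisect start --term-old=broken --term-new=fixed >/dev/null || return 1
    git bisect broken "$broken" >/dev/null || return 1
    git bisect fixed "$fixed" >/dev/null || return 1
    # "git bisect run" reads exit status 0 as the old term (broken) and
    # 1-127 (except 125) as the new term (fixed), so the check is negated.
    if git bisect run sh -c '! "$@"' sh "$@" >/dev/null 2>&1; then
        # This ref points at the first commit marked "fixed".
        git rev-parse refs/bisect/fixed
        result=0
    else
        result=1
    fi
    git bisect reset >/dev/null 2>&1
    return "$result"
}

# Usage with the commits named above (not executed here):
#   find_fixing_commit 118d3b65666dd18e7d01a2a9f0b2d45d4de46862 \
#       f111b51fb243f9e41d8e1dcfe6b2f2b862135f33 make kvm-dev
```

The function prints the hash of the first commit for which the check succeeds and restores the original checkout afterwards.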

Sorry for the inconvenience. I thought I was working on an up-to-date version, but I wasn't.

Btw, @gyptazy, I learned that adding set -x to /bin/garden-chroot doesn't seem to be a good idea. The output of the script is used by other scripts such as bin/garden-slimify, so set -x will break these. So, for everyone who arrives here after searching for the error xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option: remove the set -x from /bin/garden-chroot.
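If tracing is still needed in a script whose output is consumed by another program, one possible workaround is to send the trace to its own file descriptor. The sketch below uses bash's BASH_XTRACEFD variable (bash 4.1 or newer). It has not been tried against the Garden Linux scripts; list_paths is a hypothetical stand-in for such a script, and the 2>&1 assumes that the consumer merges stderr into stdout, which is one way trace lines could end up in front of xargs.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a script whose output other tools parse.
list_paths() {
    printf '%s\n' "usr/share/doc" "usr/share/man"
}

# Write the trace to a separate file instead of stderr.
trace_file=$(mktemp)
exec 9> "$trace_file"
BASH_XTRACEFD=9

set -x
# Even with stderr merged into stdout, the captured output stays clean,
# because the trace lines go to file descriptor 9.
paths=$(list_paths 2>&1)
set +x

printf '%s\n' "$paths"
```

The trace can then be read from the file named in trace_file, while the consumer only sees the two path lines.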