intel / intel-device-plugins-for-kubernetes

Collection of Intel device plugins for Kubernetes
Apache License 2.0
48 stars, 205 forks

Fix buildah build errors in github #1780

Closed tkatila closed 4 months ago

tkatila commented 4 months ago

Some background: Buildah builds started to fail suddenly. I don't know exactly why, as nothing had changed in the source code. After some debugging and trials, I got the builds to pass again by tuning the umask and telling tar not to force the archived permissions or owners.
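For illustration, a minimal sketch of that combination. It assumes GNU tar and GNU stat, and the temporary paths and the `demo.txt` file name are made up for the example; it is not the exact change from this PR.

```shell
#!/bin/sh
# Sketch only: shows how umask plus tar's --no-same-* options affect extraction.
# Assumes GNU tar and GNU stat; all paths and names here are hypothetical.
set -eu

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
mkdir "$work/src" "$work/dst"

# Create an archive containing a file with wide-open permissions.
echo hello > "$work/src/demo.txt"
chmod 777 "$work/src/demo.txt"
tar -C "$work/src" -cf "$work/demo.tar" demo.txt

# Extract without restoring the archived owner or mode bits verbatim;
# with --no-same-permissions the umask is applied to the archived mode.
umask 022
tar -C "$work/dst" --no-same-owner --no-same-permissions -xf "$work/demo.tar"

mode="$(stat -c '%a' "$work/dst/demo.txt")"
echo "extracted mode: $mode"
```

When run as root, GNU tar defaults to restoring owners and permissions, which is why the options need to be given explicitly inside a build container.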

tkatila commented 4 months ago

The error is seen here: https://github.com/intel/intel-device-plugins-for-kubernetes/actions/runs/9968340803/job/27561694980

I checked various things:

As the build works fine on my laptop and with docker, I'm quite sure it's about the github environment. But without pinpointing a component, it's hard to file an issue.

eero-t commented 4 months ago

That's really odd. Package installation also sets file permissions and owners, and if that succeeds, I don't see how tar would fail at the same thing.

Hm. Unlike Docker, buildah also allows using volumes for building. Could the parts of the build container where packages are installed, and where tarballs are extracted, be mounted separately, e.g. with different mount options?

In theory it would also be possible for e.g. a seccomp filter to be applied only to specific programs, but security-wise I don't see the point of allowing mode changes for arbitrary dpkg commands while blocking them for tar...

Changing only umask didn't work. Nor did only having the extra tar options. Both were required.

Was there some other (later) error when using just tar options?

mythi commented 4 months ago

That's really odd.

I agree with @eero-t here that it'd be important to understand what is going on instead of just taking whatever makes the error go away. I'll look into it a bit.

mythi commented 4 months ago

We have used this in the past and seems to be working this time also: https://github.com/intel/intel-device-plugins-for-kubernetes/pull/1783/commits/edee81a73df6d88b4e93122555e9b83665ab4a63.

eero-t commented 4 months ago

We have used this in the past and seems to be working this time also: edee81a.

What's the default runtime, if it's not runc? crun?

Fedora has been using the latter for a long time, and it defaults to user namespaces. I still don't know why apt would succeed under it but tar would not, though.

mythi commented 4 months ago

We have used this in the past and seems to be working this time also: edee81a.

What's the default runtime, if it's not runc? crun?

The failing build uses crun.

Fedora has been using the latter for a long time, and it defaults to user namespaces. I still don't know why apt would succeed under it but tar would not, though.

Note that ubuntu-22.04 uses an ancient buildah from late 2021. I'm not able to reproduce the error locally, but I have two fix alternatives: we can move to ubuntu-24.04 runners, which give a much more recent version, or we can add BUILDAH_RUNTIME=runc just like the plugin images have.
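As a rough sketch of the second alternative, assuming a plain `buildah bud` invocation: the image tag and Dockerfile path below are hypothetical placeholders, not the ones used by this repository's build scripts.

```shell
#!/bin/sh
# Sketch only: IMAGE and DOCKERFILE are hypothetical placeholders.
IMAGE="example/demo:devel"
DOCKERFILE="Dockerfile"

# Ask buildah to use runc for RUN steps instead of its default runtime.
export BUILDAH_RUNTIME=runc
cmd="buildah bud -t $IMAGE -f $DOCKERFILE ."

if command -v buildah >/dev/null 2>&1 && [ -f "$DOCKERFILE" ]; then
    # Print the version first, since the runner's buildah age matters here.
    buildah --version
    $cmd
else
    # No buildah or no Dockerfile available: just show what would be run.
    echo "would run: BUILDAH_RUNTIME=$BUILDAH_RUNTIME $cmd"
fi
```

Setting the variable only for the build step keeps the rest of the job on the runner's defaults.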

mythi commented 4 months ago

I'm not able to reproduce the error locally, but I have two fix alternatives: we can move to ubuntu-24.04 runners, which give a much more recent version, or we can add BUILDAH_RUNTIME=runc just like the plugin images have.

Update: I can reproduce it with crun on vanilla Ubuntu 22.04, but I'm not sure it makes sense to debug much further.