Open MikeDombo opened 3 years ago
It seems that the issue is that openat2
is available after the update (openat2
was added in Linux 5.6). However this code definitely works on both old and new kernels, so it's a bit puzzling why it's failing for you. Does your code mess around with open files at all? With openat2, we cache a file descriptor to /sys/fs/cgroup
-- so if some of your code is dup-ing over that file descriptor with another file descriptor, you're going to end up with the cgroup2 code trying to open /some/random/file/cpuset/container-id/cpuset.cpus
which will either fail or cause security issues.
(As an aside @kolyshkin I think that the error message in the openat2 code should be changed to better describe the path being opened -- readlink
of the fd would be a good idea. I'll write a patch for that.)
Hi, it seems that you've opened an issue that is in relation to using github.com/opencontainers/runc/libcontainer
directly from your own Go code. While we may do our best to help you solve your issue, we do not support this. In addition, our Semantic Versioning version does not include the Go API in its scope -- meaning we are free to break the Go APIs in point releases.
The reason for this policy is that runc has fairly complicated behaviour that is shared between the runc
command-line code and the internal libcontainer
libraries, as well as a fairly complicated lifecycle that is very easy to not configure correctly. In addition, we have several security hardenings within runc that could be broken inadvertently by code using libcontainer
and not being aware of this.
libcontainer
being a public-accessible library is a historical accident and in the future we may put it entirely inside an internal
package to stop it from being imported entirely. If you can use runc
, or the go-runc
library (which wraps the runc
binary) please do so. If you cannot, please open an issue describing why you cannot use runc
and we will see if it's possible to expand runc
so that you can use runc
directly.
Hi @cyphar thank you very much for both of the replies.
So to test the hypothesis I changed prepareOpenat2
to always return an error so that we'd just use good ol' open and that actually solved the problem!
I don't believe we're messing around with anything, duping or otherwise, aside from running the container root in an overlayfs which is maybe related.
What happens if you apply this patch?
diff --git a/libcontainer/cgroups/fscommon/open.go b/libcontainer/cgroups/fscommon/open.go
index e95876a21769..2ef1e15cfae8 100644
--- a/libcontainer/cgroups/fscommon/open.go
+++ b/libcontainer/cgroups/fscommon/open.go
@@ -1,6 +1,7 @@
package fscommon
import (
+ "fmt"
"os"
"strings"
"sync"
@@ -86,7 +87,13 @@ func OpenFile(dir, file string, flags int) (*os.File, error) {
Mode: uint64(mode),
})
if err != nil {
- return nil, &os.PathError{Op: "openat2", Path: dir + "/" + file, Err: err}
+ procpath := fmt.Sprintf("/proc/self/fd/%d", cgroupFd)
+ realdir, _ := os.Readlink(procpath)
+ path := dir + "/" + file
+ if realdir != dir {
+ path = fmt.Sprintf("[%s=%d!=%s]/%s", procpath, realdir, dir, file)
+ }
+ return nil, &os.PathError{Op: "openat2", Path: path, Err: err}
}
return os.NewFile(uintptr(fd), cgroupfsPrefix+relname), nil
[/proc/self/fd/30=/!=/sys/fs/cgroup/cpuset/9ac5cec4-16ac-4fdf-4f7f-51aa982a9b7f]/cpuset.cpus: no such file or directory
So apparently what should be the cgroup path is actually /
Hmm, so it seems that fd 30 is being swapped with a handle to /
or something like that? You can try to add some similar readlink-based debugging code to prepareOpenat2
to see whether the original handle is correct.
Thank you so much Aleksa, I don't want to take up more of your valuable time with this. Your pointer to look at the openat2 call certainly saved hours of work, so thank you for that! I'll keep looking into it, but worst case we just use the fallback path without openat2 which seems like it should be fine.
I'd suspect the FD is getting messed up either in entering the mount namespace or mounting the overlayfs.
I'd suspect the FD is getting messed up either in entering the mount namespace or mounting the overlayfs.
Apparently we did not have such a use case in mind. runc works with cgroup files before entering any namespaces.
I guess we can check on error if cgroupFd
still resolves to /sys/fs/cgroup
, and reopen it if not, but that seems overly complicated. I actually wrote this but it looks borderline ugly.
@MikeDombo If the source code is available, I will be happy to help resolve this.
Thank you for looking Kir. Sadly our usage isn't open source at this time. Hopefully when we do open source it we'll have moved to using runc properly instead of going directly to libcontainer.
Hi, I'm using libcontainer to create my container and I found a regression when Ubuntu updated from kernel 5.4 to kernel 5.8 on Thursday/Friday. I'm currently seeing
process_linux.go:385: applying cgroup configuration for process caused: openat2 /sys/fs/cgroup/cpuset/<container id>/cpuset.cpus: no such file or directory
. On kernel 5.4 using the exact same code it work fine. I edited libcontainer to do a directory listing and I can verify that thecpuset.cpus
"file" does exist as it should, however it is unable to be opened by libcontainer. I am able to runcat
on the path without error from outside the container.Any pointers would be incredibly helpful, thank you very much.