Exploring a little more, it is unclear whether `podman` has any provision for not invoking `conmon` at all.
In `cri-o/internal/oci/oci.go`, we have code that selects a different implementation path when the runtime type is "vm":
func (r *Runtime) newRuntimeImpl(c *Container) (RuntimeImpl, error) {
    rh, err := r.getRuntimeHandler(c.runtimeHandler)
    if err != nil {
        return nil, err
    }

    if rh.RuntimeType == config.RuntimeTypeVM {
        return newRuntimeVM(rh.RuntimePath, rh.RuntimeRoot, rh.RuntimeConfigPath, r.config.RuntimeConfig.ContainerExitsDir), nil
    }

    if rh.RuntimeType == config.RuntimeTypePod {
        return newRuntimePod(r, rh, c)
    }

    // If the runtime type is different from "vm", then let's fallback
    // onto the OCI implementation by default.
    return newRuntimeOCI(r, rh), nil
}
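For context, cri-o's "vm" path avoids `conmon` entirely: `newRuntimeVM` talks to a containerd shim-v2 process over ttrpc. Below is a sketch of what such a connection looks like, using containerd's shim-v2 task client; the import paths and the socket path are illustrative and vary across containerd versions:

```go
package main

import (
	"context"
	"fmt"
	"net"

	task "github.com/containerd/containerd/runtime/v2/task"
	"github.com/containerd/ttrpc"
)

func main() {
	// The shim listens on a unix socket whose address it publishes
	// when it starts (the path here is illustrative).
	conn, err := net.Dial("unix", "/run/vc/shim.sock")
	if err != nil {
		panic(err)
	}
	client := ttrpc.NewClient(conn)
	defer client.Close()

	tasks := task.NewTaskClient(client)
	// Connect is the shim-v2 handshake; it returns the shim's pid.
	resp, err := tasks.Connect(context.Background(), &task.ConnectRequest{})
	if err != nil {
		panic(err)
	}
	fmt.Println("shim pid:", resp.ShimPid)
}
```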
In `containerd/pkg/cri/server/helpers_linux.go`, the approach is different: it does not rely on the "vm" string in the runtime type but on the `io.containerd.kata` name:
var vmbasedRuntimes = []string{
    "io.containerd.kata",
}

func isVMBasedRuntime(runtimeType string) bool {
    for _, rt := range vmbasedRuntimes {
        if strings.Contains(runtimeType, rt) {
            return true
        }
    }
    return false
}
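For illustration, because this uses `strings.Contains` rather than an exact match, any runtime type embedding the `io.containerd.kata` prefix qualifies, including the versioned shim names. A standalone sketch:

```go
package main

import (
	"fmt"
	"strings"
)

var vmbasedRuntimes = []string{
	"io.containerd.kata",
}

func isVMBasedRuntime(runtimeType string) bool {
	for _, rt := range vmbasedRuntimes {
		if strings.Contains(runtimeType, rt) {
			return true
		}
	}
	return false
}

func main() {
	// The versioned shim-v2 name still matches the substring check.
	fmt.Println(isVMBasedRuntime("io.containerd.kata.v2")) // true
	fmt.Println(isVMBasedRuntime("io.containerd.runc.v2")) // false
}
```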
I could not find any equivalent code in `podman`. The message for `podman` comes from `podman/libpod/oci_conmon_common.go`:
func (r *ConmonOCIRuntime) createOCIContainer(ctr *Container, restoreOptions *ContainerCheckpointOptions) (int64, error) {
    ...
    logrus.WithFields(logrus.Fields{
        "args": args,
    }).Debugf("DDD1: running conmon: %s", r.conmonPath)

    cmd := exec.Command(r.conmonPath, args...)
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Setpgid: true,
    }
    ...
}
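Note that `r.conmonPath` is just a configurable path: podman's global `--conmon` flag lets you point it at an arbitrary binary without rebuilding, which is one way to experiment with the second option below. A deliberately incomplete stand-in that merely records how it was invoked (the real `conmon` contract involves sync pipes, exit files, and the exit command, so this is only useful for observing arguments, not for actually running containers):

```go
// fake-conmon: record the arguments podman passes, then exit.
// Hypothetical experiment; build and use as, e.g.:
//   go build -o fake-conmon fake-conmon.go
//   podman --conmon $PWD/fake-conmon run -it fedora bash
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.OpenFile("/tmp/fake-conmon.log",
		os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		os.Exit(1)
	}
	defer f.Close()
	// One line per invocation, so repeated runs can be compared.
	fmt.Fprintln(f, strings.Join(os.Args, " "))
}
```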
There does not seem to be any provision to not run `conmon` without modifying `podman`. So there are a few options:

- modify `podman` to avoid running `conmon`, and trigger the error above;
- keep `podman` as is, but find a way to inject an alternate `conmon` that does not error out when called;
- fix the `conmon` error.

Likely call stack in `podman`:
// podman/libpod/oci_conmon_common.go:202
return r.createOCIContainer(ctr, restoreOptions)
This is called from line 183:
// CreateContainer creates a container.
func (r *ConmonOCIRuntime) CreateContainer(ctr *Container, restoreOptions *ContainerCheckpointOptions) (int64, error) {
    // always make the run dir accessible to the current user so that the PID files can be read without
    // being in the rootless user namespace.
    if err := makeAccessible(ctr.state.RunDir, 0, 0); err != nil {
        return 0, err
    }
    if !hasCurrentUserMapped(ctr) {
        for _, i := range []string{ctr.state.RunDir, ctr.runtime.config.Engine.TmpDir, ctr.config.StaticDir, ctr.state.Mountpoint, ctr.runtime.config.Engine.VolumePath} {
            if err := makeAccessible(i, ctr.RootUID(), ctr.RootGID()); err != nil {
                return 0, err
            }
        }
        ...
I see nothing in that path that could avoid `conmon` at all.
The implementation behind the interface seems to be `ConmonOCIRuntime`. So do we need to build a non-`conmon` OCI runtime? There is an `OCIRuntime` interface (`podman/libpod/oci.go`).
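If a second, shim-v2-backed implementation of that interface existed, selection could look much like cri-o's `newRuntimeImpl` dispatch above. A minimal, self-contained sketch of the idea; all names here are hypothetical stand-ins, not podman's actual API:

```go
package main

import "fmt"

// Hypothetical stand-in for libpod's Container type.
type Container struct{ ID string }

// OCIRuntime mirrors the idea of podman's OCIRuntime interface,
// reduced to a single method for the sketch.
type OCIRuntime interface {
	CreateContainer(ctr *Container) (int64, error)
}

// conmonRuntime models today's ConmonOCIRuntime: it execs conmon.
type conmonRuntime struct{ conmonPath string }

func (r *conmonRuntime) CreateContainer(ctr *Container) (int64, error) {
	fmt.Printf("would exec conmon at %s for %s\n", r.conmonPath, ctr.ID)
	return 0, nil
}

// shimV2Runtime models a VM-based runtime that would talk to a
// containerd shim-v2 directly, with no conmon in the picture.
type shimV2Runtime struct{ shimPath string }

func (r *shimV2Runtime) CreateContainer(ctr *Container) (int64, error) {
	fmt.Printf("would start shim %s for %s, no conmon\n", r.shimPath, ctr.ID)
	return 0, nil
}

// newRuntime dispatches on a runtime type, like cri-o's newRuntimeImpl.
func newRuntime(runtimeType string) OCIRuntime {
	if runtimeType == "vm" {
		return &shimV2Runtime{shimPath: "/usr/local/bin/containerd-shim-kata-v2"}
	}
	return &conmonRuntime{conmonPath: "/usr/bin/conmon"}
}

func main() {
	ctr := &Container{ID: "demo"}
	newRuntime("vm").CreateContainer(ctr)
	newRuntime("oci").CreateContainer(ctr)
}
```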
There seems to be a notion of a "non-legacy OCI runtime" (`podman/libpod/boltdb_state_internal.go`):
// Handle legacy containers which might use a literal path for
// their OCI runtime name.
runtimeName := ctr.config.OCIRuntime
ociRuntime, ok := s.runtime.ociRuntimes[runtimeName]
if !ok {
    runtimeSet := false
    ...
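The fallback that follows (not quoted here) presumably resolves a stored literal path to one of the configured runtimes. A self-contained sketch of that lookup idea, with hypothetical types:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Hypothetical stand-in for a configured OCI runtime entry.
type ociRuntime struct {
	name string
	path string
}

// lookupRuntime resolves a stored runtime name that may be either a
// configured name or, for legacy containers, a literal binary path.
func lookupRuntime(stored string, runtimes map[string]*ociRuntime) (*ociRuntime, bool) {
	if rt, ok := runtimes[stored]; ok {
		return rt, true
	}
	// Legacy case: the stored value is an absolute path; try its basename.
	if filepath.IsAbs(stored) {
		if rt, ok := runtimes[filepath.Base(stored)]; ok {
			return rt, true
		}
	}
	return nil, false
}

func main() {
	runtimes := map[string]*ociRuntime{
		"runc": {name: "runc", path: "/usr/bin/runc"},
	}
	if rt, ok := lookupRuntime("/usr/bin/runc", runtimes); ok {
		fmt.Println("resolved to", rt.name)
	}
}
```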
Curious why there are both a `sqlite_state_internal.go` and a `boltdb_state_internal.go` (presumably one per `--db-backend` value). They seem very similar, and they contain logic that does not seem related to databases at all. Notably, the creation of the runtime objects (and logic such as path lookup) lives in these files; see `getContainerStateDB` or `finalizeCtrSqlite` calling `newConmonOCIRuntime` (the other place being `libpod/runtime.go`, which seems a bit more logical). A bit weird.
Apparently, no OCI runtime is created other than through `newConmonOCIRuntime`, and the `conmon` path is always passed as an argument to that function, from `runtime.conmonPath`.
Exit command arguments, passed with `--exit-command` and `--exit-command-arg`:
/usr/local/bin/podman --root /var/lib/containers/storage --runroot /var/run/containers/storage --log-level debug --cgroup-manager systemd --tmpdir /var/run/libpod --network-config-dir --network-backend cni --volumepath /var/lib/containers/storage/volumes --db-backend boltdb --transient-store=false --runtime /home/ddd/Work/ociplex/ociplex/run-kata --storage-driver overlay --storage-opt overlay.mountopt=nodev,metacopy=on --events-backend journald --syslog container cleanup bfdf589efbf1ecd78142e44f7add27de199567b5d50a310b94535ae7ec23ffe8
So when the container dies, we call `podman container cleanup`. This also gives interesting insights about the database being used. But it looks like all of this should be passed directly to `run-kata`, not through `podman`.
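This also suggests that if something like `run-kata` took over `conmon`'s duties, it would have to invoke this exit command itself once the container terminates, since that is what `conmon` does today. A rough, self-contained sketch of that responsibility (the command line is abbreviated from the log above):

```go
package main

import (
	"log"
	"os/exec"
)

// runExitCommand mimics what conmon does when the container exits:
// it invokes the command assembled by podman from --exit-command and
// the repeated --exit-command-arg flags.
func runExitCommand(exitCommand []string) error {
	cmd := exec.Command(exitCommand[0], exitCommand[1:]...)
	return cmd.Run()
}

func main() {
	// Abbreviated from the cleanup command captured above.
	exitCommand := []string{
		"/usr/local/bin/podman",
		"container", "cleanup",
		"bfdf589efbf1ecd78142e44f7add27de199567b5d50a310b94535ae7ec23ffe8",
	}
	if err := runExitCommand(exitCommand); err != nil {
		log.Printf("exit command failed: %v", err)
	}
}
```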
Running `podman --log-level=debug --runtime $PWD/run-kata run -it fedora bash` and looking at the logs, it is unclear to me whether `conmon` should actually be running in a shim-v2 scenario. Also, what is the `--api-version 1` argument? What API is this referring to?