hashicorp / nomad-driver-podman

A nomad task driver plugin for sandboxing workloads in podman containers
https://developer.hashicorp.com/nomad/plugins/drivers/podman
Mozilla Public License 2.0

rootless podman fails with: {"cause":"operation not permitted","message":"lsetxattr /opt/nomad/data/alloc/5db64440-ad18-56f7-0cc6-608d2a2b3ccf/alloc: operation not permitted","response":500} #232

Closed: erikschul closed this issue 11 months ago

erikschul commented 1 year ago

My VM has the following setup:

When scheduling a basic demo job, it fails with the message:

client.alloc_runner.task_runner: Task event: alloc_id=5db64440-ad18-56f7-0cc6-608d2a2b3ccf task=bannertask type="Driver Failure"
rpc error: code = Unknown desc = failed to start task, could not start container: cannot start container, status code: 500: {"cause":"operation not permitted","message":"lsetxattr /opt/nomad/data/alloc/5db64440-ad18-56f7-0cc6-608d2a2b3ccf/alloc: operation not permitted","response":500}

When running ls -l /opt/nomad/data/alloc/, it shows that:

drwx------. 4 nomad nomad 37 Apr  4 23:04 12301492-2b45-7c39-5db6-66909cc72bfc
drwxr-xr-x. 4 nomad nomad 37 Apr  5 00:01 34f03670-e80d-26cc-f7c7-484c26eb5eb3
drwxr-xr-x. 4 root  root  37 Apr  5 00:36 5db64440-ad18-56f7-0cc6-608d2a2b3ccf
drwxr-xr-x. 4 nomad nomad 37 Apr  4 23:58 91cc6d6b-3569-b133-bc4b-1fd007d46ce3
drwx------. 4 nomad nomad 37 Apr  4 23:47 aa09251e-0650-1bf7-cb56-4af805e327ce

Perhaps the problem is that the Nomad client runs as root and creates the allocation folder under alloc as root:root, so the nomad user has no privileges on it?
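One way to check this hypothesis might be a small script that lists the allocation directories not owned by the expected user. This is only a sketch: the function names `owner_name` and `mismatched_alloc_dirs` are made up for illustration, and it assumes that rootless podman runs as a user called nomad and that the alloc root is /opt/nomad/data/alloc, as in the listing above.

```python
import os
import pwd
from pathlib import Path


def owner_name(uid: int) -> str:
    """Return the user name for uid, or the numeric uid if it has no passwd entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def mismatched_alloc_dirs(alloc_root: str, expected_uid: int) -> list[tuple[str, str]]:
    """Return (directory name, owner) pairs for allocation directories
    directly under alloc_root that are not owned by expected_uid."""
    mismatches = []
    for entry in sorted(Path(alloc_root).iterdir()):
        # Only real directories are of interest; skip files and symlinks.
        if entry.is_symlink() or not entry.is_dir():
            continue
        uid = entry.stat().st_uid
        if uid != expected_uid:
            mismatches.append((entry.name, owner_name(uid)))
    return mismatches
```

Calling `mismatched_alloc_dirs("/opt/nomad/data/alloc", pwd.getpwnam("nomad").pw_uid)` on the machine above should, if the listing is still accurate, report only the root-owned 5db64440-ad18-56f7-0cc6-608d2a2b3ccf directory. The script may need to run as root, since some of the directories are mode drwx------.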

I haven't explicitly configured fuse-overlayfs or crun or container_manage_cgroup. Could that be the cause?

Possibly related issues:

erikschul commented 1 year ago

If I remove selinuxlabel, I get this error instead:

rpc error: code = Unknown desc = failed to start task, could not start container: cannot start container, status code: 500: {"cause":"broken pipe","message":"write child: broken pipe","response":500}

erikschul commented 1 year ago

Is it possible that nomad-driver-podman creates the /opt/nomad/data/alloc/c130a67b-4ff5-4ef7-9317-d57ecb5d37f8 directory as root:root (with mode drwxr-xr-x) when it should be created as nomad:nomad, and that this prevents podman from running lsetxattr? (The user should obviously be configurable.)

I guess this isn't supported? https://github.com/hashicorp/nomad-driver-podman/issues/84

If that's the case (since 2021?), perhaps the README could make it clearer that rootless mode requires the nomad client to run as the same user as podman? (That in turn causes other problems with volume mounts and network configuration.)

erikschul commented 1 year ago

It works when the nomad client service runs as the nomad user. As expected, the folders in /opt/nomad/data/alloc/ then have nomad:nomad ownership.

But is the bug with nomad or nomad-driver-podman? I assume nomad is responsible for creating the folder in alloc?