containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Errors when running podman container at NFS storage #8521

Closed sampie closed 3 years ago

sampie commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

  1. I mounted NFS share at /storage

  2. I have following content at: /home/sami/.config/containers/storage.conf

[storage]
  graphroot = "/storage/sami/"
  [storage.options.overlay]
    ignore_chown_errors = "true"

  3. I am running: podman run -it ubuntu /bin/bash

Describe the results you received:

I got an error message:

Writing manifest to image destination
Storing signatures
Error processing tar file(exit status 1): lchown /etc/gshadow: operation not permitted

Describe the results you expected:

Entering container.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version 2.0.6

Output of podman info --debug:

(paste your output here)

Package info (e.g. output of rpm -q podman or apt list podman):

podman/groovy,now 2.0.6+dfsg1-1ubuntu1 amd64 [installed]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

No

Additional environment details (AWS, VirtualBox, physical, etc.):

Host is Ubuntu 20.10 running on KVM.

mheon commented 3 years ago

@vrothberg PTAL

vrothberg commented 3 years ago

Thanks for reaching out. The problem with NFS is that it has no idea about the user mapping (user namespaces) on our client machines. There is some promising work going on but it will take time to bubble up the stack.

Can you try running with --userns=keep-id? This will create a user namespace with your current UID and GID. That will work as the NFS server knows them. Using --userns=keep-id also means that we have only 1 UID/GID available. If the container image (or your workload) requires more than just 1, it would by default error out as there are not enough IDs. In that case, we need to tweak the storage.conf as you did and set ignore_chown_errors = "true".

Please let us know if --userns=keep-id worked. Until then, I leave the issue open.

sampie commented 3 years ago

I added the parameter; the run command is now:

$ podman run --userns=keep-id -it ubuntu /bin/bash

However, this also gives:

Storing signatures
Error processing tar file(exit status 1): lchown /etc/gshadow: operation not permitted

I also tried with the centos image:

$ podman run --userns=keep-id -it centos /bin/bash

This gave:

Storing signatures
Error processing tar file(exit status 1): open /root/.bash_logout: permission denied

vrothberg commented 3 years ago

@giuseppe, are we missing something for fuse-overlayfs in the storage.conf?

rhatdan commented 3 years ago

Could you put the ignore_chown_errors flag in /etc/containers/storage.conf? We might have a bug where we are ignoring the local rootless storage.conf.

Does podman info show the storage option?

sampie commented 3 years ago

I copied storage.conf from my home folder to /etc/containers/storage.conf. The error did not change when running podman.

Here is the output of "podman info":

host:
  arch: amd64
  buildahVersion: 1.15.2
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.20, commit: unknown'
  cpus: 12
  distribution:
    distribution: ubuntu
    version: "20.10"
  eventLogger: file
  hostname: ariana
  idMappings:
    gidmap:

I wonder how info is showing uidmap when

cat /etc/subuid

sami:100000:5000

and

cat /etc/subgid

sami:100000:5000

I also get uid maps when I clear the subgid and subuid files. Do I need to apply the change somehow? It appears that editing the files is not enough.

sampie commented 3 years ago

I tried rebooting the machine, and now podman info gives an error message:

ERRO[0000] [graphdriver] prior storage driver overlay failed: kernel does not support overlay fs: 'overlay' is not supported over nfs at "/storage/sami/overlay": backing file system is unsupported for this graph driver
Error: kernel does not support overlay fs: 'overlay' is not supported over nfs at "/storage/sami/overlay": backing file system is unsupported for this graph driver

rhatdan commented 3 years ago

Do you have fuse-overlayfs installed? Did you remove the storage.conf from your homedir?

sampie commented 3 years ago

I have the fuse-overlayfs 1.0.0-1 package installed. I had not removed storage.conf from my home directory earlier; I just copied it to /etc/containers.

But now I tried removing the file from home and then "podman info" started to work and gave:

host:
  arch: amd64
  buildahVersion: 1.15.2
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.20, commit: unknown'
  cpus: 12
  distribution:
    distribution: ubuntu
    version: "20.10"
  eventLogger: file
  hostname: ariana
  idMappings:
    gidmap:

Now "podman run --userns=keep-id -it ubuntu /bin/bash" is saying:

ERRO[0000] cannot find UID/GID for user sami: No subuid ranges found for user "sami" in /etc/subuid - check rootless mode in man pages.
Error: chown /run/user/1000/containers/overlay-containers/654dca9b9f58ce5b26de324cdc01ede01c0a85e081de7acb41812c8924c7673e/userdata: invalid argument
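
The "No subuid ranges found" message refers to entries in /etc/subuid of the form name:start:count, like the sami:100000:5000 line shown earlier. A minimal sketch of a check for such an entry follows; the helper name has_subid_range is hypothetical and not part of Podman, and it only inspects the file, not what Podman has already cached.

```shell
#!/usr/bin/env bash
# has_subid_range USER FILE  (hypothetical helper, not a Podman command)
# Succeeds if FILE, in /etc/subuid format (name:start:count), contains an
# entry for USER with a count greater than zero.
has_subid_range() {
    local user="$1" file="$2"
    awk -F: -v u="$user" \
        '$1 == u && $3 > 0 { found = 1 } END { exit !found }' "$file"
}
```

For example, `has_subid_range "$(id -un)" /etc/subuid || echo "no subuid range"` would report a missing entry for the current user; the same function can be pointed at /etc/subgid.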

rhatdan commented 3 years ago

This is what I am seeing with podman-2.2 on Fedora 33.

$ grep ignore_chown_errors /etc/containers/storage.conf
# ignore_chown_errors can be set to allow a non privileged user running with
ignore_chown_errors = "true"
$ podman info | grep -i ignor
    overlay.ignore_chown_errors: "true"
sampie commented 3 years ago

These are my outputs for those commands on Ubuntu 20.10 having podman version 2.0.6+dfsg1-1ubuntu1

$ grep ignore_chown_errors /etc/containers/storage.conf
    ignore_chown_errors = "true"
$ podman info | grep -i ignor
$
rhatdan commented 3 years ago

ls -l ~/.config/containers/storage.conf

sampie commented 3 years ago

$ ls -l ~/.config/containers/storage.conf
ls: cannot access '/home/sami/.config/containers/storage.conf': No such file or directory

vrothberg commented 3 years ago

Can you create and configure ~/.config/containers/storage.conf and try again? I recall a bug where rootless podman missed reading some options from the root storage.conf.

sampie commented 3 years ago

I copied the storage conf to home.

$ cat ~/.config/containers/storage.conf
[storage]
  graphroot = "/storage/sami/"
  [storage.options.overlay]
    ignore_chown_errors = "true"
$ podman info
ERRO[0000] 'overlay' is not supported over nfs at "/storage/sami/overlay" 
ERRO[0000] [graphdriver] prior storage driver overlay failed: kernel does not support overlay fs: 'overlay' is not supported over nfs at "/storage/sami/overlay": backing file system is unsupported for this graph driver 
Error: kernel does not support overlay fs: 'overlay' is not supported over nfs at "/storage/sami/overlay": backing file system is unsupported for this graph driver

I am beginning to wonder whether the content of my storage.conf is wrong. What should it be? The error message clearly says that overlay is not supported, but the config has a storage.options.overlay section.

rhatdan commented 3 years ago

I think this is our bug at this point. The issue is that we test overlay against NFS and complain, but I believe fuse-overlayfs should work here. Is this correct, @giuseppe? We should change containers-storage to not fail on NFS if the storage was set up with a mount_program.

rhatdan commented 3 years ago

This looks like you don't have the mount_program set.

Add mount_program = "/usr/bin/fuse-overlayfs" to your storage.conf.

sampie commented 3 years ago
$ cat .config/containers/storage.conf
[storage]
  graphroot = "/storage/sami/"
  [storage.options.overlay]
    ignore_chown_errors = "true"
    mount_program = "/usr/bin/fuse-overlayfs"

And I have copied the same file to /etc/containers/

$ ls /usr/bin/fuse-overlayfs
/usr/bin/fuse-overlayfs
$ podman info
ERRO[0000] [graphdriver] prior storage driver overlay failed: kernel does not support overlay fs: 'overlay' is not supported over nfs at "/storage/sami/overlay": backing file system is unsupported for this graph driver 
Error: kernel does not support overlay fs: 'overlay' is not supported over nfs at "/storage/sami/overlay": backing file system is unsupported for this graph driver
$ findmnt /storage
TARGET   SOURCE                    FSTYPE OPTIONS
/storage 192.168.1.16:/mnt/storage nfs4   rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.22,local_lock=none,addr=192.168.1.16

I am using rather recently installed ubuntu 20.10 on kvm as my test system, so I guess these problems might be easily reproducible with a fresh 20.10 that can be run conveniently as a virtual machine.
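
The findmnt output above reports the filesystem type as nfs4, which is what the overlay driver complains about. A minimal sketch for classifying a filesystem type string is shown here; the function name fstype_is_nfs is hypothetical, and treating only "nfs" and "nfs4" as NFS is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# fstype_is_nfs FSTYPE  (hypothetical helper)
# Succeeds if FSTYPE is one of the NFS type names reported by findmnt.
fstype_is_nfs() {
    case "$1" in
        nfs|nfs4) return 0 ;;
        *)        return 1 ;;
    esac
}
```

It could be fed from findmnt, for instance `fstype_is_nfs "$(findmnt -n -o FSTYPE --target /storage/sami)" && echo "graphroot is on NFS"`, where the path is the graphroot from the storage.conf above.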

rhatdan commented 3 years ago

OK, I have a reproducer:

$ cat ~/.config/containers/storage.conf 
[storage]
  [storage.options.overlay]
    ignore_chown_errors = "true"
    mount_program = "/usr/bin/fuse-overlayfs"
$ podman info
Error: kernel does not support overlay fs: 'overlay' is not supported over extfs at "/home/dwalsh/.local/share/containers/storage/overlay": backing file system is unsupported for this graph driver

No NFS involved. This looks like we ignore the mount_program if it is set in storage.conf in the homedir.

rhatdan commented 3 years ago

Found the problem, and opened a PR in containers/storage.

sampie commented 3 years ago

While waiting until the fix enters distributions, is there a workaround?

rhatdan commented 3 years ago

Could you try:

[storage]
  driver="overlay"
  graphroot = "/storage/sami/"
  [storage.options.overlay]
    ignore_chown_errors = "true"
    mount_program = "/usr/bin/fuse-overlayfs"
rhatdan commented 3 years ago

This seems to work for me. Without the driver option, podman gets confused and does not read the options.
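
A rough way to check whether a given storage.conf carries the driver line is sketched below. The function name is hypothetical, and the grep pattern is a simplification: it looks for any uncommented driver = "..." line and does not verify that the line sits in the [storage] table.

```shell
#!/usr/bin/env bash
# storage_conf_sets_driver FILE  (hypothetical helper)
# Succeeds if FILE contains an uncommented line of the form: driver = "..."
# Simplification: the enclosing TOML table is not checked.
storage_conf_sets_driver() {
    grep -Eq '^[[:space:]]*driver[[:space:]]*=[[:space:]]*"[^"]+"' "$1"
}
```

For example, `storage_conf_sets_driver ~/.config/containers/storage.conf || echo "driver is not set"`.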

sampie commented 3 years ago
$ podman run -it -v /storage/share/:/share ubuntu /bin/bash
root@8614be8c15fa:/#
# touch /share/test
# ls -lhatr /share/test
-rw-r--r-- 1 root root 0 Dec  5 09:17 /share/test

Now outside of the container ls shows:

$ ls -lhatr /storage/share/test 
-rw-r--r-- 1 sami sami 0 joulu   5 11:17 /storage/share/test

This looks very good. Now it is starting to get interesting. Next, I wanted to see whether I could build an image as a regular user with a simple Dockerfile:

FROM ubuntu:20.10
RUN apt update
RUN apt install emacs-nox

Trying to build:

$ podman build -t myimage .
STEP 1: FROM ubuntu:20.10
STEP 2: RUN apt update
WARN[0000] exit status 1                                
ERRO[0000] container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"devpts\\\" to rootfs \\\"/var/tmp/buildah576640740/mnt/rootfs\\\" at \\\"/var/tmp/buildah576640740/mnt/rootfs/dev/pts\\\" caused \\\"invalid argument\\\"\"" 
container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"devpts\\\" to rootfs \\\"/var/tmp/buildah576640740/mnt/rootfs\\\" at \\\"/var/tmp/buildah576640740/mnt/rootfs/dev/pts\\\" caused \\\"invalid argument\\\"\""
error running container: error creating container for [/bin/sh -c apt update]: : exit status 1
Error: error building at STEP "RUN apt update": error while running runtime: exit status 1

Unfortunately this failed. Are some additional configuration parameters needed for podman to build images?

Alternatively, I checked whether a user could install programs in a running container:

$ podman run -it ubuntu /bin/bash
# apt update
E: setgroups 65534 failed - setgroups (1: Operation not permitted)
E: setegid 65534 failed - setegid (22: Invalid argument)
E: seteuid 100 failed - seteuid (22: Invalid argument)
E: setgroups 0 failed - setgroups (1: Operation not permitted)
Reading package lists... Done
W: chown to _apt:root of directory /var/lib/apt/lists/partial failed - SetupAPTPartialDirectory (22: Invalid argument)
W: chown to _apt:root of directory /var/lib/apt/lists/auxfiles failed - SetupAPTPartialDirectory (22: Invalid argument)
E: setgroups 65534 failed - setgroups (1: Operation not permitted)
E: setegid 65534 failed - setegid (22: Invalid argument)
E: seteuid 100 failed - seteuid (22: Invalid argument)
E: setgroups 0 failed - setgroups (1: Operation not permitted)
E: Method gave invalid 400 URI Failure message: Failed to setgroups - setgroups (1: Operation not permitted)
E: Method gave invalid 400 URI Failure message: Failed to setgroups - setgroups (1: Operation not permitted)
E: Method http has died unexpectedly!
E: Sub-process http returned an error code (112)
E: Method http has died unexpectedly!
E: Sub-process http returned an error code (112)

Unfortunately this failed also.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 3 years ago

This got a little lost during the winter break.

@giuseppe PTAL

giuseppe commented 3 years ago

NFS is not supported yet. It requires some features in fuse-overlayfs that are not part of a release yet, and also an updated kernel (>= 5.9).

I'll cut a new fuse-overlayfs release so we can start playing with it.

giuseppe commented 3 years ago

my bad, actually all the pieces we need are in fuse-overlayfs 1.3.

You need to configure the storage as:

    [storage]
      driver = "overlay"
      graphroot = "/storage/sami/"
      [storage.options]
        mount_program = "/usr/bin/fuse-overlayfs"
        mountopt = "xattr_permissions=2"
      [storage.options.overlay]
        force_mask = "0755"
        ignore_chown_errors = "true"

Please make sure you are using fuse-overlayfs 1.3, Linux >= 5.9, and the latest Podman release.
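
The minimum versions above can be compared with a small helper. This is a sketch that assumes GNU sort with the -V (version sort) option is available; the function name version_ge is hypothetical.

```shell
#!/usr/bin/env bash
# version_ge A B  (hypothetical helper)
# Succeeds if version string A is greater than or equal to B,
# using GNU sort -V ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For the kernel requirement, one possible use is `version_ge "$(uname -r | cut -d- -f1)" 5.9 || echo "kernel older than 5.9"`. Cutting at the first "-" is an assumption about the local uname -r format.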

sampie commented 3 years ago

I updated to Ubuntu 21.04. It has fuse-overlayfs 1.3, and I am running mainline kernel 5.9. Podman is still 2.0.6. Is that version too old?

podman run -it -v /storage/share/:/share ubuntu /bin/bash

Gives: Error: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument\"": OCI runtime error

giuseppe commented 3 years ago

yes, 2.0.6 is too old for this feature.

I've tested it with Podman 2.2.1

rhatdan commented 3 years ago

Error: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused "write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument"": OCI runtime error

I believe this error is not related to NFS storage. To confirm, try:

CONTAINERS_CONF=/dev/null podman run -it -v /storage/share/:/share ubuntu /bin/bash

sampie commented 3 years ago

Looks like it is related to the configuration, but I still have podman version 2.0.6. I'll try to find out how to upgrade that on Ubuntu.

$ CONTAINERS_CONF=/dev/null podman run -it -v /storage/share/:/share ubuntu /bin/bash
ERRO[0000] cannot find UID/GID for user sami: No subuid ranges found for user "sami" in /etc/subuid - check rootless mode in man pages. 
root@7b89d020ef44:/#
sampie commented 3 years ago

I managed to upgrade to podman 2.2.1. Now I get this error:

$ podman run -it -v /storage/share/:/share ubuntu /bin/bash
Error: database storage graph root directory (graphroot) "/storage/sami/" does not match our storage graph root directory (graphroot) "/storage/sami": database configuration mismatch
rhatdan commented 3 years ago

Can you reset your storage or just remove the libpod database? Note that all of your containers should be removed before you do this.

sampie commented 3 years ago

Sure. How do I reset the storage? Or is the database a single file somewhere I need to remove?

rhatdan commented 3 years ago

podman system reset.

sampie commented 3 years ago

It seems to give the same error:

$ podman system reset
Error: database storage graph root directory (graphroot) "/storage/sami/" does not match our storage graph root directory (graphroot) "/storage/sami": database configuration mismatch
rhatdan commented 3 years ago

What does podman info say about the graphroot?

sampie commented 3 years ago

Info says pretty much the same.

$ podman info
Error: database storage graph root directory (graphroot) "/storage/sami/" does not match our storage graph root directory (graphroot) "/storage/sami": database configuration mismatch
rhatdan commented 3 years ago

cat ~/.config/containers/storage.conf

sampie commented 3 years ago
$ cat ~/.config/containers/storage.conf
[storage]
      driver = "overlay"
      graphroot = "/storage/sami/"
      [storage.options]
        mount_program = "/usr/bin/fuse-overlayfs"
        mountopt = "xattr_permissions=2"
       [storage.options.overlay]
         force_mask = "0755"
         ignore_chown_errors = "true"
rhatdan commented 3 years ago

Any chance you could try this with podman-3.0?

sampie commented 3 years ago

Sure, where can I find the deb packages?

rhatdan commented 3 years ago

Not released to Debian yet, should be in the next couple of weeks.

Podman 3.0 does a better job of using ~/.config/containers/storage.conf and I believe this bug is fixed there. Reopen if I am mistaken.

sampie commented 3 years ago

It seems Podman 3.0 has been released. I still get errors, but now there is a new one complaining about the Podman REST API service.

It looks like GitHub does not allow me to reopen this issue, as it was not me who closed it.

sami@ariana:~$ podman --version
podman version 3.0.0
sami@ariana:~$ podman run -it -v /storage/share/:/share ubuntu /bin/bash
Error: Cannot connect to the Podman socket, make sure there is a Podman REST API service running.: database storage graph root directory (graphroot) "/storage/sami/" does not match our storage graph root directory (graphroot) "/storage/sami": database configuration mismatch
sami@ariana:~$ podman system reset
Error: Cannot connect to the Podman socket, make sure there is a Podman REST API service running.: database storage graph root directory (graphroot) "/storage/sami/" does not match our storage graph root directory (graphroot) "/storage/sami": database configuration mismatch
mheon commented 3 years ago

I remember @giuseppe fixing an almost-identical problem, but I'm certain that fix was already in 3.0 - I guess this must be a slightly different case?

Regardless, as a workaround, you should be able to adjust graphroot in your storage.conf to not have a trailing /. Fix on our end is to clean the paths to ensure they are identically formatted.
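
The mismatch in the error is between "/storage/sami/" and "/storage/sami". As an illustration of the kind of path cleaning meant here, and not the actual fix in Podman, a shell sketch that strips trailing slashes could look like this; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# strip_trailing_slashes PATH  (hypothetical helper)
# Prints PATH with any trailing "/" characters removed, leaving a bare "/"
# untouched.
strip_trailing_slashes() {
    local p="$1"
    while [ "${#p}" -gt 1 ] && [ "${p%/}" != "$p" ]; do
        p="${p%/}"
    done
    printf '%s\n' "$p"
}
```

With this, `strip_trailing_slashes "/storage/sami/"` prints /storage/sami, which is the form to put in graphroot for the workaround.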

sampie commented 3 years ago

I cleaned everything out of the existing graphroot path and removed the trailing /, which seemed to help a bit.

Now I get the following errors with the current config.

$ podman run -it -v /storage/share/:/share ubuntu /bin/bash
Resolved "ubuntu" as an alias (/home/sami/.config/.cache/containers/short-name-aliases.conf)
Trying to pull docker.io/library/ubuntu:latest...
Getting image source signatures
Copying blob 83ee3a23efb7 done  
Copying blob f611acd52c6c done  
Copying blob db98fc6f11f0 done  
Copying config f63181f19b done  
Writing manifest to image destination
Storing signatures
  Error processing tar file(exit status 1): lsetxattr /: operation not supported
Error: Error committing the finished image: error adding layer with blob "sha256:83ee3a23efb7c75849515a6d46551c608b255d8402a4d3753752b88e0dc188fa": Error processing tar file(exit status 1): lsetxattr /: operation not supported
$ podman build -t myimage .
STEP 1: FROM ubuntu:20.10
Resolved "ubuntu" as an alias (/home/sami/.config/.cache/containers/short-name-aliases.conf)
Getting image source signatures
Copying blob 00c1d05f510d done  
Copying blob fdcab926b54c done  
Copying blob a09400eba642 done  
Copying config d6d4bee71a done  
Writing manifest to image destination
Storing signatures
Error: error creating build container: Error committing the finished image: error adding layer with blob "sha256:a09400eba642b8443b0196d54f94bd75bb9e0ca23fae24e945ce8ef12237f09e": Error processing tar file(exit status 1): lsetxattr /: operation not supported
$ cat .config/containers/storage.conf
[storage]
      driver = "overlay"
      graphroot = "/storage/sami"
      [storage.options]
        mount_program = "/usr/bin/fuse-overlayfs"
        mountopt = "xattr_permissions=2"
       [storage.options.overlay]
         force_mask = "0755"
         ignore_chown_errors = "true"