Closed abitrolly closed 4 years ago
I assume the -u postgres in here means that your app in the container isn't running as root?
As such, it's running as a non-0 user in the container, which is mapped to a user on the host through /etc/subuid (root in a rootless container is the user that started the container; all higher UIDs and GIDs are mapped to a block on the host given by /etc/subuid and /etc/subgid). The volume looks like it's somewhere that's owned by the user starting the container - but you're running the app in the container as a different user, which means you run into permissions errors.
You may want to just remove -u postgres and run as root if you need to access volumes owned by your user. Running as root in a rootless container is already very secure (the container has no added privileges that your user does not), so the only security benefit to swapping to another user in the container is preventing the container from accessing files owned by your user - which, in this case, you need (to talk to the volume).
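To make the mapping described above concrete, here is a rough sketch of the arithmetic, assuming the user has a single range in /etc/subuid and the default rootless mapping (no --userns or --uidmap options). The function name host_uid_for and its argument order are made up for illustration; they are not part of podman.

```shell
# Hypothetical helper: host UID that a container UID maps to, based on a
# subuid-format file (lines of the form "user:start:count").
#   $1 = UID inside the container
#   $2 = host user name, $3 = that user's host UID
#   $4 = subuid-format file (defaults to /etc/subuid)
# Assumes the first matching entry for the user is the one in use.
host_uid_for() {
    if [ "$1" -eq 0 ]; then
        echo "$3"    # container root is the user who started the container
        return
    fi
    # Container UID n (n >= 1) lands on start + n - 1, if the range is big enough.
    awk -F: -v user="$2" -v cuid="$1" '
        $1 == user && cuid <= $3 { print $2 + cuid - 1; found = 1; exit }
        END { exit !found }' "${4:-/etc/subuid}"
}

# Example call (user name and UID are placeholders):
#   host_uid_for 1001 alice 1000
```

With an entry such as alice:100000:65536, container UID 1001 would come out as host UID 101000, which is a different owner than alice (UID 1000) from the kernel's point of view.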
@mheon -u is a parameter for schemaspy, the container itself defines USER java, and podman is run unprivileged without sudo. I'd expect podman to handle mount volumes transparently without leaking low level details about uid and filesystem mappings. Otherwise all scripts will need to contain -u root, which doesn't look very secure.
We really can't handle this ourselves - these are separate users from the perspective of the kernel, and normal filesystem permissions apply.
Is it possible to configure the filesystem layer to ignore host permissions? If a container is already isolated through filesystem path, why impose additional uid restrictions?
Maybe it is possible to implement two layer writes? The first layer enforces permissions, so that container won't escape the defined path, but final writes to disk are ignoring the permissions. If container needs a separate user with volume mapping, maybe podman could switch to the double layer concept automatically.
You can change the ownership on the volume with the podman unshare chown UID PATH
@rhatdan PATH is path in my directory, not /var/..., right? How do I know UID? What will happen to filesystem permissions after I quit this modified user namespace?
I also checked man podman-unshare and the description sounds too low level. Maybe it is possible to modify it for people who are not familiar with cgroups yet.
podman-unshare - Run a command inside of a modified user namespace.
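One way to approach the "How do I know UID?" question: the mapping in effect inside the modified user namespace can be read from /proc/self/uid_map, whose columns are the first ID inside the namespace, the first ID on the host, and the length of the range. The sketch below translates an in-namespace UID to a host UID from such a file. The name map_uid is hypothetical, and the file is passed as an argument so the function can be tried on a saved copy.

```shell
# Hypothetical helper: translate a UID inside the user namespace to the
# host UID, using a uid_map-format file
# (columns: inside-start outside-start length).
#   $1 = UID inside the namespace, $2 = path to a uid_map-format file
map_uid() {
    awk -v id="$1" '
        id >= $1 && id < $1 + $3 { print $2 + (id - $1); found = 1; exit }
        END { exit !found }' "$2"
}

# Intended use (requires rootless podman; the file name is a placeholder):
#   podman unshare cat /proc/self/uid_map > uid_map.txt
#   map_uid 1001 uid_map.txt
```

Since chown changes ownership on disk, the change should persist after leaving the namespace; outside it, the files simply show up under the mapped host UID.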
I've got a thought overnight.
(root in a rootless container is the user that started the container, all higher UIDs and GIDs are mapped to a block on the host given by /etc/subuid and /etc/subgid)
When I run container with custom USER as non-privileged, then there is no root inside anymore - is that right? If it is so, then why not map that custom USER, instead of root, to my UIDs and GIDs?
By default podman as non root, runs as root within the container. This means the processes in the container have full Namespaced Capabilities. This also means that if the container process escaped the container, it would have full access to files in your homedir (Based on UID, SELinux would still block it, but I have heard that some people disable SELinux). If you run the processes within the container as a different non root UID, then those processes will run as that UID and if they escape they would only have world access to content in your homedir.
@abitrolly I am writing a blog based on these issues. Send me your email and I can expose an early copy to you. dwalsh@redhat.com
Just needs to be reviewed and then I can get it published.
@rhatdan people disable SELinux, because not all build scripts add the :Z suffix to volume mounts, without which volumes on SELinux do not work, and podman doesn't add this suffix automatically.
My email is anatoli@rainforce.org. I sent email titled "Early copy" from this address.
Given that even with podman's unprivileged model an escaped container can steal my private SSH keys, I don't think that podman is more secure than docker anymore. Private keys are more valuable than root level access to the OS (whose main risk is, again, stealing private keys from more boxes). Now I think that double level filesystem access control is a must have feature for any non-privileged process containers.
People running with containers not separated by SELinux are taking a big risk, since it is the main tool to protect their file system from containers.
An escape from Docker allows access to all keys; an escape from rootless podman only to those of the user's UID. Running rootless containers in a different User Namespace would give you more protection.
@abitrolly But bottom line, I tell people to always run their containers as non-root, even in the rootless container. One thing we could consider would be to add a :U to volumes which would chown the directory to match the primary user of the container. Might be something to consider for podman.
Not sure if this is a "true" solution or more of a workaround, but would not --userns handle at least some of the situations desired to mount with non-root user permissions?
For example:
podman run --rm --userns=keep-id -v /home/hostUserName/tmp:/home/containerUserName/tmp:Z -it image_name /bin/bash
This mounts tmp inside the container at /home/containerUserName/tmp with the same UID:GID inside the container as it possesses on the host.
Perhaps --userns=ns:my_namespace could be used to mount a volume with the UID:GID corresponding to the user named my_namespace?
Note: you cannot use --user myUserName and --userns=... in the same podman run .... command, as I understand it.
This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.
One thing we could consider would be to add a :U to volumes which would chown the directory to match the primary user of the container.
@rhatdan, what's your take on this issue? Shall we pursue your proposal above?
I don't think so, I am hesitant to make this more complicated. I think it is up to the user to set up the permissions correctly on the volume.
I still don't understand what unshare does. How is that different from su <user>? How does unshare know the UID to run inside?
What is the proposed solution? Is the following correct?
podman unshare chown -R RLUID /host/path
podman run -v /host/path:/guest/path - /guest/path is now writable
chown -R UID to get permissions back
Is that right?
Wow, this stuff is way too complicated. I've the same issue as @abitrolly (running podman as non-root, having a user inside the container that is not "root" and I cannot write to the mounted directory). I've read every comment here and I still don't have an idea how to make this work.
So, it seems like I can make it "work" by "chown"ing (on the host) the shared directory to the user-id that the non-root-container-user (in my case called "jenkins", because I'm using the jenkins:jenkins image from Docker hub) is mapped to on the host system. In my case, this jenkins-user from inside the container has the UID 559751 on the host system. (Btw, what is the easiest way to find this out?). So, doing sudo chown 559751 builds on the host makes the directory writable to the user inside the container. But this has two big issues:
I need to share this image with my coworkers (who don't have root privileges), so both of these points are unacceptable.
I don't think so, I am hesitant to make this more complicated. I think it is up to the user to set up the permissions correctly on the volume.
Okay, but how?
Root should not be necessary - podman unshare sticks you into the same user namespace that the rootless container uses, which gives you access to every UID/GID that the container does. Within a podman unshare shell you should be able to chown folders/files owned by your user to the UID/GID used by Jenkins. You will need to know what IDs are in use inside the container, because podman unshare is a shell on the host (though you can mount the container with podman mount and inspect its /etc/passwd to get those). This can also potentially allow you to identify the user we're mapped to on the host (su to the right UID in the podman unshare shell and touch a file in your /home - the UID there should be the one in use).
For the second issue... That is definitely a concern, and one I don't think we have an easy solution to as yet. There is talk of adding UID/GID mappings to LDAP for use across multiple systems, but they will still be unique to the user running the container for security reasons, so not portable between users.
Running rootless containers as non-root and mounting in volumes is proving to be quite complicated. I think a review of how things are right now and a discussion of how we can improve (maybe a blog?) is definitely warranted here.
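The "mount the container and inspect its /etc/passwd" step mentioned above could be sketched roughly as follows. passwd_ids is a hypothetical helper name and my_container is a placeholder; the function only parses a passwd-format file, so it can be tried on any copy of such a file.

```shell
# Hypothetical helper: print "UID:GID" for a user name found in a
# passwd-format file.
#   $1 = user name, $2 = path to a passwd-format file
passwd_ids() {
    awk -F: -v user="$1" '
        $1 == user { print $3 ":" $4; found = 1; exit }
        END { exit !found }' "$2"
}

# Intended use, run from inside a `podman unshare` shell
# ("my_container" is a placeholder name):
#   mnt=$(podman mount my_container)
#   passwd_ids jenkins "$mnt/etc/passwd"
#   podman umount my_container
```

The printed pair is what would then be passed to chown inside the podman unshare shell.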
@mheon Thank you very much for your detailed comment!
Root should not be necessary
I definitely needed "sudo" to execute sudo chown 559751 builds on the host. This may be because the user accounts are centrally managed and there may be something wrong with my /etc/subuid..? I remember that it was necessary for me to create this file manually some months ago. But this may be because my workstation is old and my Fedora installation has been upgraded many times over the last six years or so.
Within a podman unshare shell you should be able to chown folders/files owned by your user to the UID/GID used by Jenkins.
Well, this seems portable in the sense that I should be able to write a simple shellscript to automate this process for my coworkers every time they want to use my image.
unshare seems to be a strange name for this subcommand, but I probably just do not understand the deeper meaning of this. When browsing the podman-subcommands in an attempt to fix my issue I would've disregarded this subcommand immediately just because of its name. ("How does unsharing something help me with those permission issues?")
I will tinker around with this some more tomorrow.
I agree that unshare is a terrible name; we named it after an existing utility that enters user namespaces (doing something very similar to what we do, but not doing many of the things we do to make sure that it matches what other podman commands are doing).
@ChristianCiach have you been able to come up with tutorial for your colleagues?
@abitrolly Sorry for the late reply. Yes, creating a simple wrapper that calls "podman unshare" before calling "podman run" works as expected. This is good enough for my use case.
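A wrapper along the lines described above might look something like this. It is only a sketch: run_with_volume and the PODMAN override variable are made up for illustration (setting PODMAN=echo gives a dry run), and on SELinux hosts the volume argument would likely also need a :Z or :z suffix.

```shell
# Hypothetical wrapper: hand a host directory to a container user, then
# run the container with that directory mounted.
#   $1 = UID:GID used inside the container   $2 = host directory
#   $3 = mount point inside the container    remaining args = image and command
# PODMAN is an assumption of this sketch, not a podman feature.
run_with_volume() {
    ids=$1 host_dir=$2 guest_dir=$3
    shift 3
    # chown inside the rootless user namespace, so no sudo is involved
    "${PODMAN:-podman}" unshare chown -R "$ids" "$host_dir" &&
        "${PODMAN:-podman}" run --rm -v "$host_dir:$guest_dir" "$@"
}

# Example call (IDs, paths and image name are placeholders):
#   run_with_volume 1000:1000 "$PWD/builds" /builds some-image
```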
Now how do you chown the directory back to the host user though?
$ mkdir tmp
$ podman unshare chown 1001:1001 tmp
$ ls -la tmp
total 0
drwxrwxr-x. 2 101000 101000 40 Jul 30 17:20 ./
drwxrwxrwt. 54 root root 2000 Jul 30 17:20 ../
/tmp
$ chown $(id -u):$(id -g) tmp
chown: changing ownership of 'tmp': Operation not permitted
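The plain chown fails because, outside the namespace, the directory now belongs to a different UID and an unprivileged user cannot take it back. Inside the podman unshare namespace, however, UID and GID 0 are your own host user and group, so chowning to 0:0 there should return the directory to you. A minimal sketch, where reclaim_dir and the PODMAN override are hypothetical names (PODMAN=echo prints the command instead of running it):

```shell
# Hypothetical helper: give a directory back to the host user by chowning it
# to 0:0 inside the rootless user namespace (0 there is the host user).
#   $1 = directory to reclaim
reclaim_dir() {
    "${PODMAN:-podman}" unshare chown -R 0:0 "$1"
}

# Example call:
#   reclaim_dir tmp
```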
It might not work in all use cases but another workaround is to run the command in the container with the host's user ID and GID by using --userns=keep-id --user=$(id -ur):$(id -gr), e.g.:
$ mkdir project
$ podman run -it --rm -v $PWD/project:/project:z --userns=keep-id --user=$(id -ur):$(id -gr) --entrypoint=/bin/bash quay.io/quarkus/ubi-quarkus-mandrel:20.1.0.1.Alpha2-java11 -c 'id; touch /project/lala'
uid=1000(1000) gid=1000 groups=1000
while without it it fails:
$ mkdir project
$ podman run -it --rm -v $PWD/project:/project:z --entrypoint=/bin/bash quay.io/quarkus/ubi-quarkus-mandrel:20.1.0.1.Alpha2-java11 -c 'id; touch /project/lala'
uid=1001(quarkus) gid=1001(quarkus) groups=1001(quarkus)
touch: cannot touch '/project/lala': Permission denied
I wonder if giving the container the host user ID and GID makes the container unprivileged?
Containers by default are unprivileged (depending on your definition of unprivileged). Running with --userns=keep-id just changes the way the User Namespace is set up; it does not change the security controls on the container. The only difference is that instead of the user's UID being root inside of the container, the user's UID is the same UID inside of the container, and the first UID listed for the user in the /etc/subuid mappings is UID=0 inside of the container.
i'm battling this same thing. i am using a bitnami image of postgresql from docker hub. it has a baked in user id of 1001. on my arch linux system my uid is 1000. I would like to make a directory in my home directory for postgres to persist its data to, and be able to poke around in that directory without having to chown it all the time when i want to. @rhatdan what is your suggestion for people using vendor provided images that already have a uid baked in?
Well for now you can do
$ podman unshare chown 1001:1001 PATHTODIR
We could add something to the volume command to do this, but I am not sure how ugly the syntax would be.
Well for now you can do
$ podman unshare chown 1001:1001 PATHTODIR
We could add something to the volume command to do this, but I am not sure how ugly the syntax would be.
I would love to have this feature, even if it looks ugly :)
Thanks for writing this up, it really helped me to understand what is going on here.
Unfortunately the container I am using and want to deploy and regularly update has 43 different users, and their associated group relationships. So if I understand the situation, I would need to parse out all 43 entries from /etc/passwd using podman mount, then create a wrapper script that calls podman unshare with each of those. Then when the container gets updated to add a new user, I'm broken and need to go update the script.
I know it would be complex from an implementation perspective, but it would be great if podman could inspect /etc/passwd itself from within the image and pull out the appropriate non-root users all within the --volume command option without a need for further user options.
Why not mount /home into the container then?
I would appreciate if podman could just use network for mounting local volumes without all the complexity of multilevel filesystem->OS->SELinux->container->OS->filesystem permissions.
How hard it could be to write a nice blog post series for the simple users which follows the Podman mantra of not running root-full containers, but wants just to mount some volumes. For the most common use-cases. Like: 1) I, as a simple user want to run rootless PHP container and to keep working on my PHP code 2) I, as a simple user want to share volume between separate Nginx and PHP containers and want to keep working on the codebase. 3) I, as a simple user want to spin up some MariaDB instance with database being persisted on filesystem. 4) Etc, etc...
IMO there are not so many different flavors of the setups which are used by the mainstream users.
Not everyone works on Go/Rust single binaries which most often do not require volumes. Not everybody is working in the CI/CD environment which constantly does full baking. Not everyone wants to run podman build/run on every single line added to the code.
This volumes question is like the Top 1 asked question and yet, on the website of Podman (which "coined" the root-less idioms), no article for "simple users" can be found. Every article I saw is like - "Guys, in order to run your quick container idea in rootless, please first learn whole SELinux labeling and type enforcement, then learn Linux namespacing and then feel free to read this article and to run your container in rootless."
I understand that this is fairly complex topic. And the tooling is tailored to match pretty low level requirements and thus it is kinda flexible and complex at the same time. So... there is no need for high level user friendly API implementations which could take ages to implement. All it takes, just to write canonical blog series about the most common setups. I am not expert, to write 100% accurate articles, but there are people who are. And those people are wasting their valuable time in answering the individual issues in the GitHub instead of writing single canonical user reference which could be updated as API changes.
Thanks @rhatdan for the excellent article series where you explain things like podman unshare (https://www.redhat.com/sysadmin/rootless-podman-makes-sense).
This works fine for a single container trying to access a host folder, but how about two containers which potentially use two different users both trying to access the same host folder (e.g. one writing, the other reading files)?
Well they should be in the same group or one have root ownership and the other group read access. Using --group-add keep-groups.
I am having a similar problem that is described here, but the solutions proposed do not work for my use case, because I am running two rootless containers in a pod. I hope I am not doing something silly, but this is the minimum example to reproduce the problem I see:
Note:
Both containers are run by non-root user tom on the host (UID=1005).
Container 1 (influxdb) runs its process in the container by default as root (UID=0).
Container 2 (telegraf) runs its process in the container by default as non-root user telegraf (UID=999).
1) Create a named volume:
podman volume create influxdb_volume
2) Create a new pod:
podman pod create --name monitoring_pod --publish 8086:8086
3) Create a container in the pod and mount in the named volume:
podman run -d --rm \
--name influxdb_container \
--pod monitoring_pod \
--mount type=volume,source=influxdb_volume,destination=/var/lib/influxdb \
influxdb:1.8
4) On the host, I can see that a directory in the volume is owned by the non-root host user:
ls -ltr /home/tom/.local/share/containers/storage/volumes/influxdb_volume/_data/data/
drwx------ 4 tom tom 36 Sep 13 12:14 _internal
5) Create the second container, and bind mount in one of the volume's directories:
podman run -d --rm \
--name telegraf_container \
--pod monitoring_pod \
--mount type=bind,src=/home/tom/telegraf.conf,dst=/etc/telegraf/telegraf.conf \
--mount type=bind,src=/home/tom/.local/share/containers/storage/volumes/influxdb_volume/_data/data,dst=/home/influxdb_data \
telegraf
6) I can now see that the bind-mounted directory inside the container is owned by root (because the owner non-root host user is mapped to root inside the container):
podman exec --user telegraf telegraf_container ls -ltr /home/influxdb_data
drwx------ 4 root root 36 Sep 13 11:14 _internal
7) Trying to access the directory from inside the container fails, because container runs as non-root user (telegraf, UID=999), and directory owned by container root:
podman exec --user telegraf telegraf_container du -s /home/influxdb_data
du: cannot read directory '/home/influxdb_data/_internal': Permission denied
What would be a good way to solve this?
I cannot use unshare to change the owner on the host, because this would mess up the influxdb container (which performs some actions as a different user influxdb sometimes, not always root).
I also cannot use --userns=keep-id at the container level (it gives an error "--userns and --pod cannot be set together"), and setting it at the pod level messes up the behaviour of the first container.
Does --userns=keep-id when creating the pod work?
podman pod create --userns=keep-id --name monitoring_pod --publish 8086:8086
@rhatdan No unfortunately not. If I pass the keep-id flag into the pod creation at step 2 like you suggested, then in step 3 the container fails to initialize, giving the following error:
run: open server: open tsdb store: mkdir /var/lib/influxdb/data/telegraf/_series: permission denied
Like I said before, setting it at the pod level messes up the behaviour of the first container.
Please open a new issue for this and we can get others to comment, we are trying to prevent people from discussing issues on older issues.
/kind bug
Description
When running a container with explicitly set user, such as https://github.com/schemaspy/schemaspy/blob/a28c9fc932cc6f85c7780050a678b3a3d7f595e9/Dockerfile#L44 the volume mounted by podman is not writeable.
Steps to reproduce the issue:
Get some PostgreSQL host (192.168.4.1) and port (5432)
Create dir to mount as a volume
Run podman
Describe the results you received:
Container is unable to write to /output, most likely because it is running with java user.
Describe the results you expected:
Volumes work rw regardless of user settings inside of container.
Output of podman version: