Closed abitrolly closed 4 years ago
Not really sure what is going on. I attempted to run your container on Fedora 31, but I am not sure what goes into /src/snapcrafting/amend
Inside the container, SELinux should appear disabled, so that tools running in the container do not try to perform SELinux operations. I think the shutil.py package is grabbing all XAttrs that it sees and attempting to apply them. And the SELinux label of the container is not allowed.
shutil.py should become more aware of what it is doing or you need to disable SELinux separation.
podman run -ti --security-opt label=disable
It looks like shutil._copyxattr now has support for ignoring these errors, at least on Fedora 31?
/usr/lib64/python3.7/shutil.py
    def _copyxattr(src, dst, *, follow_symlinks=True):
        """Copy extended filesystem attributes from `src` to `dst`.
        Overwrite existing attributes.
        If `follow_symlinks` is false, symlinks won't be followed.
        """
        try:
            names = os.listxattr(src, follow_symlinks=follow_symlinks)
        except OSError as e:
            # Listing failed: ignore "not supported"-style errors, re-raise the rest.
            if e.errno not in (errno.ENOTSUP, errno.ENODATA, errno.EINVAL):
                raise
            return
        for name in names:
            try:
                value = os.getxattr(src, name, follow_symlinks=follow_symlinks)
                os.setxattr(dst, name, value, follow_symlinks=follow_symlinks)
            except OSError as e:
                # Per-attribute failures in this list are silently skipped.
                if e.errno not in (errno.EPERM, errno.ENOTSUP, errno.ENODATA,
                                   errno.EINVAL):
                    raise
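For readers stuck on an older interpreter, the error handling quoted above can be approximated in user code. This is only a sketch, not what shutil itself does: the function name `copy_xattrs_tolerant`, the `skip_prefixes` parameter, and the inclusion of `errno.EACCES` in the ignored set are assumptions made for illustration. It skips `security.*` attributes (the SELinux label is stored in `security.selinux`) and tolerates the same class of errors for everything else. Linux only.

```python
import errno
import os

# Errors treated as "this attribute cannot be copied here, carry on".
# EACCES is an assumption of this sketch; the stdlib excerpt does not list it.
IGNORABLE = frozenset({errno.EPERM, errno.EACCES, errno.ENOTSUP,
                       errno.ENODATA, errno.EINVAL})


def copy_xattrs_tolerant(src, dst, skip_prefixes=("security.",),
                         ignorable=IGNORABLE):
    """Copy xattrs from src to dst and return the names actually copied.

    Attributes whose name starts with one of skip_prefixes are never
    touched; OSErrors whose errno is in `ignorable` are swallowed.
    """
    copied = []
    if not hasattr(os, "listxattr"):  # platform without xattr support
        return copied
    try:
        names = os.listxattr(src)
    except OSError as e:
        if e.errno not in ignorable:
            raise
        return copied
    for name in names:
        if name.startswith(skip_prefixes):
            continue  # e.g. security.selinux: leave labelling to the host
        try:
            os.setxattr(dst, name, os.getxattr(src, name))
        except OSError as e:
            if e.errno not in ignorable:
                raise
        else:
            copied.append(name)
    return copied
```

Skipping by name prefix avoids even attempting the write that SELinux would deny, so the result does not depend on which errno the kernel happens to return.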
I think the shutil.py package is grabbing all XAttrs that it sees and attempting to apply them. And the SELinux label of the container is not allowed.
I also think so. Why is shutil.py able to read those host-level XAttrs in the first place? Placing requirements on what software should and should not call is what I call a leaky abstraction of containerization for users.
Inside the container, the Python used is /snap/snapcraft/current/usr/lib/python3.5, and that's not going to change soon, because it is the Python of an LTS release.
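If the interpreter cannot be upgraded, one way to sidestep the failure in code you control is to avoid `shutil.copystat` (and `shutil.copy2`, which calls it) and copy only the data and permission bits. A minimal sketch, assuming that mode bits and timestamps are all the build needs to preserve; the helper name `copy_without_xattrs` is hypothetical, and this does not help when a third-party tool makes the `copystat` call itself.

```python
import os
import shutil
import stat


def copy_without_xattrs(src, dst):
    """Copy file data, permission bits and timestamps, but no xattrs."""
    shutil.copyfile(src, dst)   # contents only
    shutil.copymode(src, dst)   # chmod bits only, e.g. keeps 0o755
    st = os.stat(src)
    # Preserve access/modification times with nanosecond precision.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
    return dst


if __name__ == "__main__":
    import tempfile

    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "tool")
    with open(source, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(source, 0o755)
    target = copy_without_xattrs(source, os.path.join(workdir, "tool.copy"))
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

The copy stays executable, and the SELinux label of the destination is whatever the target directory assigns to new files rather than something carried over from the source.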
--security-opt label=disable
The help says "Turn off label separation for the container", but what exactly does it do?
It basically turns off SELinux labels and runs the container with an unconfined label. Does the shutil.py inside the container have the same code that I attached?
No. It was modified 7 months ago and it is not the same in Python 3.5 branch.
https://github.com/python/cpython/blame/master/Lib/shutil.py
Well, there is not much we can do other than have you disable SELinux separation until this is fixed, if this even works with SELinux separation turned off.
@rhatdan does --security-opt label=disable mean that the container will be able to read my SSH keys if I give it my home by mistake?
Yes. If it broke out to the file system, there would be nothing preventing the reading of these files if the process was running as your UID or as root.
@rhatdan what gives on this one?
This needs to be fixed in the python3 package, I don't see anything for us to do here.
podman still needs a filesystem isolation layer to be adopted by the general public. I wouldn't bother writing such a long post if it were not so critical. The ability to easily share a work directory without reading a ton of info about UID/GID, SELinux, :x and :X prefixes, and ACLs, like I did, is critical for mass adoption. Otherwise it turns from a public project into a corporate policy.
@abitrolly Not sure what you mean by this. The same issue would happen in Docker with SELinux turned on. And both it and Podman have been adopted by the "general public".
Part of the reason I started to use podman is to avoid problems with docker on SELinux-enabled systems. The former can run unprivileged containers and is engineered from scratch to be awesome, but it doesn't fix the problem with sharing volumes r/w. Maybe that's not the use case for containers at all, and I had better find the resources to complete my plan9 to be independent of problems with filesystems and SELinux.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/containers/libpod/issues/4794?email_source=notifications&email_token=ACC72MZTPIT3WEFU7HEO3EDRDRGL7A5CNFSM4KDDAPMKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEMFC2FI#issuecomment-587869461, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACC72M2FGZE77EA44FDGL73RDRGL7ANCNFSM4KDDAPMA .
Well, we can always push python3 to fix the issue with shutil, which they have a partial fix for. The bottom line is that the kernel prevents setting SELinux XAttrs on file systems that were mounted with the context= mount option, or on file systems that do not support SELinux labels. Since we are using fuse-overlayfs for rootless containers, we end up with this restriction. You could remove fuse-overlayfs and it would be allowed with the "vfs" driver, but that comes with a lot of headaches. At some point in the future, rootless users might be allowed to use overlayfs, which could also alleviate this problem. Until then, Podman has to live with the limitations of what rootless accounts provide.
I am probably missing the point. Speaking in user stories: as a user, I don't want to know how SELinux works with XAttrs, how vfs and fuse-overlayfs operate, or why overlayfs is worse or better, just to use podman for writing the results of running a container over my project dir (which is a Git checkout).
That's why I think it is easier to instruct the container to mount the volume over the network, and to provide the volume through a 9p share (9pfs server). Then SELinux will prevent the 9pfs server from reading my .ssh/ files, but will not meddle with access to my project files.
-- Anatoli Babenia
The filesystem isolation layer with 9pfs could also fix this top-voted Docker problem: https://github.com/moby/moby/issues/2259
Looks like I managed to screw up my system with the podman :z flag - https://github.com/teemtee/tmt/issues/1179. I am certain that it is because of podman, because it is the only thing I use that interacts with SELinux labels. Any hints on how to fix that without disabling SELinux completely?
restorecon -v -R -F PATHTOBADDIR.
Why doesn't it work without -F?
➜ ~ restorecon -v ~
/home/anatoli not reset as customized by admin to system_u:object_r:container_file_t:s0:c385,c528
How does podman manage to customize labels as admin if it is running rootless? Why doesn't it pay attention to the fact that mounted dirs already have labels that are not supposed to be overwritten?
Shoot, I thought that overrode customizable types.
You can hack this by running two commands:
chcon -t user_home_t -R PATHTOBADDIR
restorecon -v -R -F PATHTOBADDIR
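To see which paths still carry a container label before or after relabelling, the `security.selinux` attribute can be read directly. A rough sketch, assuming Linux and the usual `user:role:type:level` label layout shown in the restorecon output; the helper names `parse_context`, `read_label` and `paths_with_type` are hypothetical, and paths whose label cannot be read are simply skipped.

```python
import os
from collections import namedtuple

Context = namedtuple("Context", "user role type level")


def parse_context(label):
    """Split 'user:role:type[:level]'; the level may itself contain colons."""
    parts = label.split(":", 3)
    if len(parts) < 3:
        return None
    level = parts[3] if len(parts) > 3 else ""
    return Context(parts[0], parts[1], parts[2], level)


def read_label(path):
    """Return the SELinux label of path as a string, or None if unreadable."""
    if not hasattr(os, "getxattr"):  # platform without xattr support
        return None
    try:
        raw = os.getxattr(path, "security.selinux", follow_symlinks=False)
    except OSError:
        # No label, no SELinux, or no permission: treated alike in this sketch.
        return None
    return raw.rstrip(b"\0").decode()


def paths_with_type(root, selinux_type):
    """Yield directories and files under root whose label has the given type."""
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            label = read_label(path)
            ctx = parse_context(label) if label else None
            if ctx is not None and ctx.type == selinux_type:
                yield path


if __name__ == "__main__":
    for hit in paths_with_type(os.path.expanduser("~"), "container_file_t"):
        print(hit)
```

This only reports; it changes nothing, so it is safe to run before deciding whether the chcon/restorecon pair is needed for a given directory.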
/kind feature
Description
podman + SELinux = Leaky Abstraction. In my story, containers provide an isolated environment to help users concentrate on getting their app logic right and not think about the low-level details and permissions required to keep their systems stable and secure.

That worked well with Docker + Ubuntu, except for the part that the docker process itself is run as root, and that made users like me feel uneasy. The appearance of podman solved exactly this problem for me.

What makes me think that podman on Fedora is a leaky abstraction is that after one year of trying to adopt podman into my workflow I know a lot of information that I don't need to know, and yet I am still not there.

Yesterday I was able to solve the problem I started with a year ago, thanks to the knowledge that I acquired about SELinux labelling, the :z and :Z volume options, USER and uid/gid mappings (these are not directly related to this issue, but learning them was necessary to remove unfitting pieces of a different puzzle from my head). The problem is that the Python function shutil.copystat copies extended attributes, and because the container uses the same filesystem as the host, SELinux denies this operation when you, for example, copystat files from /bin to keep them executable or read-only as before. There was a mistake in my volume mount command, which resulted in copying files from /bin instead of from the mounted /src/bin. But I could figure it out only a year later, when I made the same mistake.

I thought that the problem was solved. I copy files for my build inside the container only from /src/bin. But today I realized that the problem is not solved, because the build system copies system-installed libs to the build subtree, and the SELinux vs copystat problem popped up again. I don't see how I can fix this, and I don't think that going down this rabbit hole is the right way anyway.

What I really want from podman is a filesystem isolation level where the filesystem in the container is completely isolated from the host, and volumes are no different from any other isolated container dir. No filesystem operation from inside the container should trigger SELinux or other additional filesystem or kernel drivers on the host, and hence no SELinux properties (or any other kernel drivers) should be visible in the container. If a volume on the host contains those labels, the modifications to these files and labels should be done as if those files were created and modified by any user-level program, such as vim.

Steps to reproduce the issue:
[stage-packages]
Describe the results you received:
Describe the results you expected:
Additional information you deem important (e.g. issue happens only occasionally):
To keep the user story short - as a user of container I don't want to think about SELinux attributes on my host if my unprivileged container with a volume deep inside my home tries to play with some files.
The solution I tested for LXD on Ubuntu is to mount the filesystem as 9p over the network through FUSE: https://github.com/yakshaveinc/linux/issues/32. I don't have the money to keep focus and make a proper solution out of it (add encryption and integrate with LXD), but as a proof of concept it works.
Output of podman version: