Open myw opened 3 years ago
Just made an even simpler reproducible test case entirely in bash:
bash-3.2$ docker run --rm --volume="$(pwd):/tmp" python:3 bash -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat --format='%a' testfile.tmp
[ -x testfile.tmp ] && echo 'access'"
644
access
Further testing:
Some base images do not exhibit this behavior: alpine, busybox, and cirros, when running the equivalent test in sh, exhibit correct behavior:
docker run --rm --volume="$(pwd):/tmp" cirros sh -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat -c '%a' testfile.tmp
[ -x testfile.tmp ] && echo 'access'"
644
Interestingly, python:alpine allows us to test both python and sh on the same OS. When doing that, the [ -x ] test in sh works correctly, but the os.access test in python fails.
Likely, the version of sh on the alpine, cirros, and busybox distros works as expected with gRPC-FUSE, but Python and/or bash/sh on other systems do not.
More testing: opensuse/leap with sh: fails. opensuse/tumbleweed with sh: passes.
I see exactly the same problem when I try to compile netty :/
It seems that the distributions that do not fail this test mostly use busybox, whose shell's access test function specifically mentions not "mak[ing] the mistake of telling root that any file is executable."
This makes me think that gRPC-FUSE is doing something where access to the file is being tested as root, which triggers a common edge-case behavior in the standard POSIX access system call. This logic does seem to have been resolved in osxfs, so there's probably a workable fix.
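As a rough illustration of the idea in that busybox comment, here is a minimal Python sketch that does not rely on the access call alone and instead also requires an execute bit in the file mode. The helper names has_exec_bit and is_executable are hypothetical, and this is not busybox's actual implementation: it ignores which permission class (user, group, other) applies to the caller and ignores ACLs. It relies only on the observation in this thread that stat still reports the correct mode on the mount.

```python
import os
import stat


def has_exec_bit(path):
    """Return True only if at least one execute bit is set in the file mode."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def is_executable(path):
    """Hypothetical helper: trust os.access() only when the mode bits agree."""
    return os.access(path, os.X_OK) and has_exec_bit(path)
```

With a check like this, a 0644 file would be reported as non-executable even if the access call itself claims success.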
I saw this failing on CentOS...
Still exists as of 3.3.1.
Note that this does not depend on the user inside the docker container being root.
docker run -u nobody --rm --volume="$(pwd):/tmp" debian sh -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat -c '%a' testfile.tmp
[ -x testfile.tmp ] && echo 'access'"
644
access
/cc @djs55
Issues go stale after 90 days of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30 days of inactivity.
Prevent issues from auto-closing with an /lifecycle frozen comment.
If this issue is safe to close now please do so.
Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale
/remove-lifecycle stale
Noticed this as well on macOS Big Sur (not using the new virt. framework), Docker Desktop Version 4.0.0 (4.0.0.12) with CentOS 6.10 image.
I think this is the same as #5029 and has the same root cause as https://github.com/docker/for-mac/issues/5944#issuecomment-912450810 . Quoting from there:
We use Linux FUSE to mount the host filesystem. There are 2 permission models: https://elixir.bootlin.com/linux/latest/source/fs/fuse/dir.c#L1197 . We delegate the permission checks to the server (by not setting default_permissions) because we want to avoid the situation where a user has a writable file on the host but can't get Linux to write to it because Linux believes the file owner/group is different. In this mode access(path, W_OK) will invoke the fuse_access API https://elixir.bootlin.com/linux/latest/source/fs/fuse/dir.c#L1160 .
However we also care about performance. Many filesystem operations performed by Linux will be prefixed by a fuse_access call, which doubles the number of RPCs to the host in these workloads. These access results are really just hints, as the real access check has to be done within the write / open / rm call, since the permissions can change between the calls. Therefore we also disable FUSE_ACCESS on the server side by returning ENOSYS so no_access is set here https://elixir.bootlin.com/linux/latest/source/fs/fuse/dir.c#L1168 . The result is that Linux assumes access returns success, then (hopefully) tries the real operation to see whether it succeeds or not. The host does the access control against the real file permissions.
So it's the combination of not setting default_permissions and disabling FUSE_ACCESS that results in the inaccuracy of the access call. Fixing this is possible, but it would reduce performance.
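The mechanism described in the quote can be sketched as a toy model in Python. This is not kernel or Docker code: the class ToyFuseConnection and the function enosys_server are hypothetical stand-ins, and the only behavior modeled is what the quote states, namely that once the server answers ENOSYS the no_access flag is set and every later access check reports success without contacting the server.

```python
import errno
import os


class ToyFuseConnection:
    """Toy model of the kernel-side state; not real kernel or Docker code."""

    def __init__(self, server_access):
        # server_access(path, mask) returns 0 on success or a negative errno.
        self.server_access = server_access
        self.no_access = False
        self.rpc_count = 0

    def access(self, path, mask):
        if self.no_access:
            return 0  # server opted out earlier: report success, no RPC
        self.rpc_count += 1
        err = self.server_access(path, mask)
        if err == -errno.ENOSYS:
            self.no_access = True  # remember that FUSE_ACCESS is unimplemented
            err = 0
        return err


def enosys_server(path, mask):
    """Hypothetical server that declines FUSE_ACCESS, as the quote describes."""
    return -errno.ENOSYS


conn = ToyFuseConnection(enosys_server)
first = conn.access("notx.md", os.X_OK)
second = conn.access("notx.md", os.X_OK)
```

In this model both calls report success regardless of the file's mode, and only the first one costs an RPC, which is the performance trade-off the quote refers to.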
@djs55 Fascinating and incredibly helpful context. Thank you!
I am not familiar with the filesystem management on that level, but your proposal for the root cause makes sense to me—I haven't observed any behavior that would contradict it.
I do think that this behavior is far enough outside the bounds of expectation that it should be possible to disable it without disabling all of gRPC-FUSE (maybe with a config-file-only setting?), which would provide most of the existing benefits and presumably still offer some performance benefit, even with the extra access calls. But whether or not it's worth it to work on a fix like that would depend on the performance impact tradeoff.
Conversely, would it somehow be possible to disable the fuse_access call only when we know it's coming as an access hint from the Linux filesystem that's about to be immediately followed by a write/open/rm call? That is, if we know it's something like python code making the call explicitly from userspace, rather than the filesystem itself checking, could we let the call go through? I doubt the answer is yes, but I think it would effectively resolve the issue with less performance impact.
Finally, for the sake of any others following this post, I do also want to share the two workarounds you mentioned in the rest of that comment that do not involve turning off gRPC-FUSE, which might be helpful in some use-cases:
… if you would like 100% native Linux access control checks, you can store your data in a "named volume" which resides inside the Linux filesystem. For example:
docker volume create my-code
docker run -v my-code:/mnt alpine ls /mnt
Another possibility is to use "dev environments" https://docs.docker.com/desktop/dev-environments/ which store the code in Linux (so 100% native filesystem semantics) while also allowing you to seamlessly access everything from your IDE (as well as push/pull the environment to share it with colleagues etc)
In addition to these workarounds, I'm wondering if there's any chance that using the new Big Sur virtualization framework could either resolve the issue, or otherwise improve performance to mitigate the impact of fixing it?
Thanks again for looking into this.
@myw thanks for the quick reply! For what it's worth, I'm not satisfied with the current state either. There are some improvements coming in macOS Monterey in the virtualization.framework which may help speed things up and improve the semantics of access: we're investigating those. We'll let you know if/when we have something interesting to try.
I came here after finding #5007.
This impacts the database initialisation scripts I want to use with postgres (& mysql) images.
I have some scripts that I want sourced by the initialisation instead of executed.
However, because of this issue, the entrypoint script tries to execute the scripts (because the -x file test succeeds), which fails (with permission denied) because the execute bit is not actually set.
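Since the execution attempt itself does fail with permission denied, one possible approach is to treat that attempt as the authoritative check and fall back when it is refused. The following is only a sketch in Python, not the logic of the postgres or mysql entrypoint scripts (which are shell scripts); the function name run_if_truly_executable is hypothetical.

```python
import subprocess


def run_if_truly_executable(path):
    """Try to execute path; return None if execution is refused."""
    try:
        return subprocess.run([path], check=False).returncode
    except PermissionError:
        # The exec attempt is the real permission check; the caller can
        # fall back to sourcing or skipping the file instead.
        return None
```

A caller would source the file (or skip it) whenever the function returns None, instead of trusting the result of the -x test.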
/remove-lifecycle stale
/lifecycle frozen
This still exists in 4.34.3 (170107) Engine 27.2.0 on Mac OS X 15.0.1 (24A348)
Surprised this bug has been in existence for so long: [ -x plainFile ] on a plain, non-executable file on a volume mounted inside a container returns true (exit 0).
The difference between the alpine and other containers is that alpine containers actually copy the volume so it is not shared and therefore behaves correctly. You can detect this by simply adding a file inside the container and seeing if the local copy changes. However, an ubuntu container can replicate this bug easily. Simple case, October 2024:
mkdir test
touch test/notx.md
docker run -v "$(pwd)/test:/root/test" -it ubuntu:latest
root@5948a905ab89:/# cd /root/test
root@5948a905ab89:~/test# ls -la
total 4
drwxr-xr-x 3 root root 96 Oct 23 14:10 .
drwx------ 1 root root 4096 Oct 23 14:11 ..
-rw-r--r-- 1 root root 0 Oct 23 14:10 notx.md
root@5948a905ab89:~/test# if [ -x notx.md ]; then echo "is executable"; else echo "works correctly"; fi
is executable
root@5948a905ab89:~/test#
Is there any reason this can't be fixed? Seems like a pretty major issue.
You can work around this by not using mounted volumes and copying your files into the target container instead, but this means you lose development speed.
61A7AFC6-411E-4184-B8AE-A79CA0239084/20210326003855
Summary
gRPC-FUSE volumes seem to be incorrectly reporting some permissions. Namely, python2.7 seems to think non-executable files are executable when mounted via gRPC-FUSE volumes. I present a minimal test case below.
Expected behavior
A host directory is mounted inside my container with a bind mount. When I test the access of a non-executable file in that directory with Python, I expect it to tell me that it is non-executable, i.e. if stat tells me file foo has mode 0644, os.access('foo', os.X_OK) should return False.
When I try this with gRPC-FUSE turned off, this is what happens.
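The expectation can be expressed as a small probe. This is a minimal sketch, not the exact commands from the original report; the function name probe is hypothetical. It returns the mode reported by stat together with the answer from os.access, so the two can be compared directly.

```python
import os
import stat


def probe(path):
    """Return (octal mode string, os.access X_OK result) for path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return format(mode, "o"), os.access(path, os.X_OK)
```

For a file with mode 0644 on a native Linux filesystem, the expected result is ("644", False). The behavior reported in this issue corresponds to ("644", True) when the file lives on a gRPC-FUSE bind mount.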
Actual behavior
When gRPC-FUSE is enabled, os.access('foo', os.X_OK) returns True, even though the file has mode 0644.
Information
This is quite reproducible.
A minimal test case to highlight the issue is described below. For the sake of brevity, I am not posting further detailed examples, but I have also verified that the erroneous behavior happens with a non-root user in the container. I have not tested with python3 or with other python2 base images.
Steps to reproduce the behavior
0. Control: Show that in a container without a bind mount, python correctly identifies a file with mode 0644 as non-executable.
Because the result of the Python expression is False, python correctly identifies that it does not have execute permissions on the file. This is the expected behavior and is true regardless of whether or not Use gRPC FUSE for file sharing is enabled, because the file is not on a bind mount.
1. Expected Behavior: With Use gRPC FUSE for file sharing DISABLED, run the same code as above, but have the file be on a bind mount. Note that the addition of --volume="$(pwd):/tmp" is the only change to the command.
The result is the same as the control: the expected behavior.
2. Actual Behavior: Now, ENABLE Use gRPC FUSE for file sharing, and run the exact same code as in 1. above:
Now, even though Python correctly sees the mode of the file, os.access incorrectly returns True. One consequence of this behavior is that nosetests ignores all files by default because it thinks they are executable.
Happy to provide additional information to help debug.
Thanks!