Open chrisvanrun opened 4 days ago
Tried again after a fresh reinstall of my system. I'm on Ubuntu 24.04.1 LTS. Steps I took:
Initial run works. Second run stops after:
=+= Cleaning up any earlier output
Removed the -f argument from chmod. Now we see:
chmod: changing permissions of '.../test/output': Operation not permitted
Tried adding my user to the docker group to run docker as non-root and see if that fixes it
sudo usermod -aG docker $USER
Did not fix it…
Also tried taking ownership using chown, but get the same permission issue.
Ok. So it fails on this part somewhere:
if [ -d "$OUTPUT_DIR" ]; then
  # Ensure permissions are setup correctly
  # This allows for the Docker user to write to this location
  rm -rf "${OUTPUT_DIR}"/*
  chmod -f o+rwx "$OUTPUT_DIR"
else
  mkdir --mode=o+rwx "$OUTPUT_DIR"
fi
What are the permissions/ownership of the output directory right when you clone the repo, and what are they after the first run? ls -alh
IIRC
Oh and while we are at it, what do id -u and id -g return at your command line?
both return 1000
output of id command:
uid=1000(thomas) gid=1000(thomas) groups=1000(thomas),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),100(users),114(lpadmin)
After clone I'm not sure, will check
after mkdir
I'm the owner (obviously, but I checked regardless)
drwxrwxrwx 2 thomas thomas 4096 Oct 18 14:37 output
after a successful run, the owner is 100999:
drwxrwxrwx 2 100999 100999 4.0K Oct 18 14:38 output
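The ownership change reported here can also be checked numerically rather than by reading ls output. The following is a minimal sketch, assuming GNU stat (as shipped with Ubuntu); check_owner is a hypothetical helper name and is not part of test_run.sh.

```shell
# Print the numeric owner of a path and whether it matches the current user.
# Assumes GNU stat; BSD/macOS stat uses different flags.
check_owner() {
  target=$1
  owner_uid=$(stat -c '%u' "$target")
  owner_gid=$(stat -c '%g' "$target")
  if [ "$owner_uid" = "$(id -u)" ]; then
    echo "$target: owned by current user ($owner_uid:$owner_gid)"
  else
    echo "$target: owned by $owner_uid:$owner_gid, current user is $(id -u):$(id -g)"
  fi
}
```

Running check_owner test/output before and after the first run would show whether the directory has moved away from the invoking user.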
Thanks for testing. Huh, 100999 is odd! Given your id is 1000, I would expect that this (at the end of the test_run.sh) fixes things:
docker run --rm \
    --quiet \
    --env HOST_UID="$(id -u)" \
    --env HOST_GID="$(id -g)" \
    --volume "$OUTPUT_DIR":/output \
    alpine:latest \
    /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output'
Well... When I remove that section, it actually works just fine? If I don't run that section, the owner of the files in output/ is 100998 btw. But the owner of the output folder itself is still me.
~~I think it has something to do with this; but I don't fully understand it yet https://docs.docker.com/engine/security/userns-remap/~~
https://docs.docker.com/desktop/faqs/linuxfaqs/#how-do-i-enable-file-sharing
In this scenario if a shared file is chowned inside a Docker Desktop container owned by a user with a UID of 1000, it shows up on the host as owned by a user with a UID of 100999. This has the unfortunate side effect of preventing easy access to such a file on the host. The problem is resolved by creating a group with the new GID and adding our user to it, or by setting a recursive ACL (see setfacl(1)) for folders shared with the Docker Desktop VM.
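The numbers in this thread are consistent with the mapping the FAQ describes. The sketch below is one reading of that FAQ, not a confirmed description of Docker Desktop internals: container UID 0 maps to the host user, and every other container UID maps into the user's subordinate range from /etc/subuid. The range start of 100000 is an assumption (it is not shown in this thread), and map_container_uid is a hypothetical helper.

```shell
# Map a container UID to the UID seen on the host, under the assumptions above.
#   $1 host_uid      UID of the user running Docker Desktop (1000 in this thread)
#   $2 subuid_start  first subordinate UID for that user (ASSUMED to be 100000)
#   $3 container_uid UID used inside the container
map_container_uid() {
  host_uid=$1
  subuid_start=$2
  container_uid=$3
  if [ "$container_uid" -eq 0 ]; then
    # Container root is assumed to appear as the host user itself.
    echo "$host_uid"
  else
    # Container UID 1 lands on the first subordinate UID, and so on.
    echo $(( subuid_start + container_uid - 1 ))
  fi
}

map_container_uid 1000 100000 1000   # prints 100999
map_container_uid 1000 100000 999    # prints 100998
```

Under these assumptions, a chown to 1000 inside the container would produce the 100999 owner seen on the host, and 100998 would correspond to a container user with UID 999.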
Current suspect: with set -e enabled, if a docker run has an error, the permission setter is not run. Should catch it in an exit so it always runs.
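A minimal sketch of that idea, separate from test_run.sh: with set -e a failing command ends the script, but a handler registered with trap ... EXIT still runs. The script text and the demo_script and demo_output names are illustrative only, and bash is assumed to be available.

```shell
# A tiny script that fails halfway, run in its own bash process.
demo_script='
set -e
trap "echo cleanup ran" EXIT
echo working
false
echo never reached
'

# "|| true" only keeps the calling shell alive; the child still stops at "false".
demo_output=$(bash -c "$demo_script" || true)
echo "$demo_output"
```

The output contains "working" and "cleanup ran" but not "never reached", which is the behaviour the permission setter needs.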
Disabling set -e does not change anything on my side. The permission issue persists.
@koopmant could you try a test_run.sh with this:
#!/usr/bin/env bash
# Stop at first error
set -e
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
DOCKER_TAG="example-algorithm"
DOCKER_NOOP_VOLUME="${DOCKER_TAG}-volume"
INPUT_DIR="${SCRIPT_DIR}/test/input"
OUTPUT_DIR="${SCRIPT_DIR}/test/output"
cleanup() {
    echo "=+= Cleaning up ..."
    # Ensure permissions are set correctly on the output
    # This allows the host user (e.g. you) to access and handle these files
    docker run --rm \
        --quiet \
        --env HOST_UID="$(id -u)" \
        --env HOST_GID="$(id -g)" \
        --volume "$OUTPUT_DIR":/output \
        alpine:latest \
        /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output'
}
trap cleanup EXIT
if [ -d "$OUTPUT_DIR" ]; then
    # Ensure permissions are setup correctly
    # This allows for the Docker user to write to this location
    cleanup
    rm -rf "${OUTPUT_DIR}"/*
    chmod -f o+rwx "$OUTPUT_DIR"
else
    mkdir -m o+rwx "$OUTPUT_DIR"
fi
echo "=+= (Re)build the container"
docker build "$SCRIPT_DIR" \
    --platform=linux/amd64 \
    --tag "$DOCKER_TAG" 2>&1
echo "=+= Doing a forward pass"
## Note the extra arguments that are passed here:
# '--network none'
# entails there is no internet connection
# '--volume <NAME>:/tmp'
# is added because on Grand Challenge this directory cannot be used to store permanent files
docker volume create "$DOCKER_NOOP_VOLUME" > /dev/null
docker run --rm \
    --platform=linux/amd64 \
    --network none \
    --volume "$INPUT_DIR":/input:ro \
    --volume "$OUTPUT_DIR":/output \
    --volume "$DOCKER_NOOP_VOLUME":/tmp \
    "$DOCKER_TAG"
docker volume rm "$DOCKER_NOOP_VOLUME" > /dev/null
echo "=+= Wrote results to ${OUTPUT_DIR}"
echo "=+= Save this image for uploading via ./save.sh"
output on second run:
=+= Cleaning up ...
=+= Cleaning up ...
Was that all the output? Not sure if it worked from your comment. Sorry.
Sorry, that was not very clear. Yes, that's all the output; so it's not working. It runs cleanup first, then runs into the permission error again during chmod (leading to immediate exit) and then runs cleanup again on exit.
No problem! @amickan also ran into a similar-looking problem. However, the above solution worked for her. Now I think it's possible that hers had a different origin (the set -e origin). Yours likely stems from a slightly different ACL setup on Ubuntu 24.
Albeit if you temp fix it via sudo, forcing ownership to your own user, does it return running the test_run.sh twice?
I am hesitant to add calls like this, because they really kick the interoperability of the script down a few notches.
# Set the ACL recursively (-R) for the specified user and group
setfacl -R -m u:$USER:rwx -m g:$GROUP:rwx $DIRECTORY
# Set default ACL for future files and directories (so new files inherit the ACL)
setfacl -R -d -m u:$USER:rwx -m g:$GROUP:rwx $DIRECTORY
Sorry, I don't quite understand what you mean.
FWIW, if you change this line:
/bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output'
to
/bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output/*'
the issue of running the script again is fixed. On my system, the ownership is then still on another user id, so I cannot open the file without providing root authentication; but this should still work on other systems.
the line
chmod o+rwx "$OUTPUT_DIR"
is then unnecessary, because the directory then never changes ownership. (But it doesn't raise an issue.)
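The difference between /output and /output/* comes down to what the glob selects: it never names the directory itself, and by default it also skips dotfiles. The following sketch illustrates this on a throwaway directory; the file names are made up for the demonstration, and find with -mindepth is assumed to be available (it is in GNU findutils).

```shell
# Compare what a shell glob and find select inside a directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/result.json" "$demo_dir/.hidden"

# The glob skips dotfiles (default shell behaviour) and never names the directory itself.
glob_matches=$(cd "$demo_dir" && printf '%s\n' * | LC_ALL=C sort)

# find with -mindepth 1 also leaves out the directory itself but includes dotfiles.
find_matches=$(cd "$demo_dir" && find . -mindepth 1 | sed 's|^\./||' | LC_ALL=C sort)

echo "glob: $glob_matches"
echo "find: $find_matches"
rm -rf "$demo_dir"
```

So with /output/* the directory keeps its owner, and any hidden files written by the container would keep theirs as well. Whether that matters depends on what the algorithm writes.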
Sorry, let me clarify: if you fix ownership once, does the proposed test_run.sh then keep working, even if run multiple times?
The initial chmod is meant to ensure that the container's internal user can actually write to the output directory in the first place. The later chown is to ensure that the host user has full access to the files and is allowed to change any and all permissions on them.
Yes, that sort of works. The chown attempted in the last docker command then fails with a permission error. So I remain the owner of the directory and the rest of the script is successful. Still, that means the current user does not become owner of the files in the output directory. But at least the script runs and creates output.
During the workshop run, @koopmant ran into the problem that running the test script twice resulted in permission errors.
The test_run.sh has a secondary docker run, introduced specifically to fix these permissions (root user vs. the user created during the build). Hence, it is unclear why this occurred. It's being investigated.