actions / runner-container-hooks

Runner Container Hooks for GitHub Actions

Permission denied when trying to create folder /__t on child container #73

Closed SirensOfTitan closed 9 months ago

SirensOfTitan commented 1 year ago

When running something that relies on hostedtoolcache as a non-root user, like setup-python, I run into the following issue:

Run actions/setup-python@v4
Run '/runner/k8s/index.js'
Version 3.10 was not found in the local cache
Version 3.10 is available for downloading
Download from "https://github.com/actions/python-versions/releases/download/3.10.10-4126486420/python-3.10.10-linux-22.04-x64.tar.gz"
Extract downloaded archive
/usr/bin/tar xz --warning=no-unknown-keyword -C /__w/_temp/aabf1bba-fd5b-4bed-91fe-d0b148666e00 -f /__w/_temp/28d56371-4e7d-4f0e-8259-7cf690c0932b
Execute installation script
Check if Python hostedtoolcache folder exist...
Creating Python hostedtoolcache folder...
Error: mkdir: cannot create directory '/__t': Permission denied
Error: The process '/usr/bin/bash' failed with exit code 1
Error: Error: failed to run script step: command terminated with non-zero exit code: Error executing in Docker Container: 1
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator.

The work directory is present at __w and looks fine. There is also a __e directory, which seems pertinent.

nikola-jokic commented 1 year ago

Hey @SirensOfTitan,

Can you please describe your setup? Based on the log, I assume you are running the runner in a k8s cluster inside the Docker image and using the docker hook. If that is the case, can you check whether the user inside the job container has the same UID as the runner user? This permission error can happen when the runner user runs as UID 1001 and the user inside the job container has a different UID. Since /__t is mounted from the runner, it keeps the permissions it has on the host. So either your job container user needs to be root, or its UID needs to match the UID of the runner user.
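The UID check described above can be sketched in Python. This is only an illustration, assuming it is run inside the job container; the helper name `ownership_report` is hypothetical and is not part of the runner or the container hook.

```python
import os
import tempfile


def ownership_report(path: str) -> dict:
    """Compare the owner of `path` with the user running this process.

    Hypothetical helper for illustration; not part of the runner or the hook.
    """
    st = os.stat(path)
    uid = os.getuid()
    return {
        "path": path,
        "owner_uid": st.st_uid,
        "current_uid": uid,
        "uid_matches": st.st_uid == uid,
        # os.access checks against the real UID/GID of this process.
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Demo on a throwaway directory; inside a job container you would pass
    # a mounted path such as /__w instead.
    demo_dir = tempfile.mkdtemp()
    print(ownership_report(demo_dir))
```

If `uid_matches` is false and `writable` is false for a mounted path, the job container user most likely differs from the runner user that owns the mount.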

SirensOfTitan commented 1 year ago

Hi @nikola-jokic : Thank you for the speedy response!

Can you please describe your setup? Based on the log, I assume you are running the runner in a k8s cluster inside the Docker image and using the docker hook. If that is the case, can you check whether the user inside the job container has the same UID as the runner user?

Yes, that's correct. I'm using an image based on ghcr.io/actions/actions-runner-controller/actions-runner:ubuntu-22.04 (source: https://github.com/actions/actions-runner-controller/blob/master/runner/actions-runner.ubuntu-22.04.dockerfile) with the runner container hooks updated to 0.3.1 (the original image sets them to 0.2.0 by default) as the container for the child pod in question and the default runner container image.

... as such, the runner has UID 1001 in both cases. In my case, the __t directory doesn't seem to exist at all in the child pod:

cd /__t
bash: cd: /__t: No such file or directory

nikola-jokic commented 1 year ago

I'll check the volume mounts and get back to you as soon as I can :relaxed:

nikola-jokic commented 1 year ago

Hey @SirensOfTitan,

I think I found the issue. Can you please share how you are using setup-python (the step and its with parameters)? I compared the hook with the runner, suspecting that there was a mount the runner did not send. Since the container hook respects the mounts sent by the runner, this can't be an issue in the container hook itself.

Anyway, the check for whether the hostedtoolcache folder exists tells me that you are probably using caching. This mount is added by default on hosted runners (see setup-python docs), but the __t mount does not exist on the self-hosted runner. The execution then probably tries to create /__t, but since it is not running as root, it does not have permission to create directories under /, which causes the permission error. :smile: Let me know if that is actually the issue. I also compared against Docker execution on a self-hosted runner without the hook, and it does not come with the /__t mount either.
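The failure mode described above can be approximated with a short Python sketch: find the nearest existing ancestor of the target folder and check whether the current user may write to it. The function names are hypothetical, and reading the target from a `RUNNER_TOOL_CACHE` environment variable with a `/__t` fallback is an assumption made for illustration, not a confirmed detail of the installer.

```python
import os
import tempfile


def nearest_existing_ancestor(path: str) -> str:
    """Walk up from `path` until something that exists is found."""
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def can_create(path: str) -> bool:
    """Roughly predict whether `mkdir -p path` could succeed for this user."""
    if os.path.isdir(path):
        return True  # already present, mkdir -p is a no-op
    ancestor = nearest_existing_ancestor(path)
    if not os.path.isdir(ancestor):
        return False  # a regular file is in the way
    # Creating an entry needs write and search permission on the parent.
    return os.access(ancestor, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Assumption: the tool cache location comes from RUNNER_TOOL_CACHE.
    tool_cache = os.environ.get("RUNNER_TOOL_CACHE", "/__t")
    target = os.path.join(tool_cache, "Python")
    print(target, "can be created:", can_create(target))
```

When /__t is not mounted, the nearest existing ancestor is /, which a non-root user normally cannot write to, so the prediction matches the `Permission denied` in the log.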

nikola-jokic commented 1 year ago

Here is a similar issue in the runner: https://github.com/actions/runner/issues/2522 The same problem occurs with /opt/hostedtoolcache regardless of whether the hook is being used :relaxed:

SirensOfTitan commented 1 year ago

@nikola-jokic: Thank you!

Here's the setup-python step:

      - uses: actions/setup-python@v4
        with:
          cache: "poetry"
          cache-dependency-path: "./poetry.lock"

We are indeed using caching! If it helps: we're on Enterprise Server 3.8 as well.

This mount is added by default on hosted runners (see setup-python docs), but the __t mount does not exist on the self-hosted runner. The execution then probably tries to create /__t, but since it is not running as root, it does not have permission to create directories under /, which causes the permission error.

This is correct! The /__t mount doesn't exist, unlike e.g. the work folder at /__w, so the execution tries to create it and fails since it isn't running as root.

nikola-jokic commented 1 year ago

I'll leave this issue open here until the fix is ready in the runner. :relaxed: Thank you for reporting it!

nikola-jokic commented 9 months ago

Let's close this issue here since it should be fixed on the runner. Thank you again for reporting it!