Open · tschaffter opened this issue 10 months ago
This workflow step takes slightly less than 2 minutes based on the runtime of past runs.
The Dev Container CLI performs a series of operations driven by the `devcontainer.json` definition file.
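In the workflow, this step presumably boils down to something like the following sketch (step names and the `workspace-install` invocation are assumptions based on the rest of this thread, not the actual workflow):

```yaml
# Sketch of the dev container start-up step (assumed, not the actual workflow).
- name: Start the dev container
  run: devcontainer up --workspace-folder ../sage-monorepo
# Dependencies are then installed inside the running container.
- name: Install workspace dependencies
  run: devcontainer exec --workspace-folder ../sage-monorepo bash -c "workspace-install"
```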
Option to reduce the startup time:
This step takes 4 seconds, which is not much. It is also probably safer to shut down the container properly.
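If the shutdown step were ever skipped to save time, the container could still be stopped in a dedicated post step. A minimal sketch, assuming the Dev Container CLI labels its containers with `devcontainer.local_folder` (the CLI has no dedicated `down` command, so plain Docker is used here):

```yaml
# Hedged sketch: stop the container started by `devcontainer up`.
# The label filter is an assumption about how the CLI tags containers.
- name: Stop the dev container
  if: always()
  run: docker ps -q --filter "label=devcontainer.local_folder" | xargs -r docker stop
```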
```
Run df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        84G   62G   22G  75% /
tmpfs           3.4G  172K  3.4G   1% /dev/shm
tmpfs           1.4G  1.1M  1.4G   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/sda15      105M  6.1M   99M   6% /boot/efi
/dev/sdb1        14G  4.1G  9.0G  31% /mnt
tmpfs           694M   12K  694M   1% /run/user/1001
```
The CPU and RAM information matches the following GH runner, though we seem to have access to more storage.
Source: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners
Information about large runners is available here.
Observation:
Questions:
I thought that the Dev Container CLI was installing the VS Code extensions specified in `devcontainer.json` when running `devcontainer up`. However, nothing in the logs shows that extensions are installed. Removing the extensions from the definition file also does not seem to speed up `devcontainer up`.
The *Cache Docker layers* step says:

```
Run actions/cache@v3
with:
  path: /tmp/.buildx-cache
  key: Linux-single-buildx-82425368edcaf7acea638514ddd0cdc3809229bd
  restore-keys: Linux-single-buildx
  enableCrossOsArchive: false
  fail-on-cache-miss: false
  lookup-only: false
env:
  NX_BRANCH: 1976
  NX_RUN_GROUP: 5881219943
  NX_CLOUD_AUTH_TOKEN:
  NX_CLOUD_ENCRYPTION_KEY:
  NX_CLOUD_ENV_NAME: linux
  NX_BASE: df9fdd5b6b76b1a367103020ac738406a403c7e3
  NX_HEAD: d6b57f4dbbc84b5e77489f011e9527fb717a329e
Cache not found for input keys: Linux-single-buildx-82425368edcaf7acea638514ddd0cdc3809229bd, Linux-single-buildx
```
The *Post Cache Docker layers* step says:

```
Warning: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
```
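The warning indicates that nothing ever wrote layers to `/tmp/.buildx-cache`: restoring the path is pointless unless the build also exports to it. With plain buildx, that pairing would look roughly like the following sketch (using buildx's `type=local` cache backend; this is not the current devcontainer-based workflow):

```yaml
# Sketch: pair a cache reader (--cache-from) with a cache writer
# (--cache-to) so the path restored by actions/cache gets populated.
- name: Build with a local layer cache
  run: |
    docker buildx build \
      --cache-from type=local,src=/tmp/.buildx-cache \
      --cache-to type=local,dest=/tmp/.buildx-cache,mode=max \
      --tag sage-devcontainer:ci \
      .
```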
```
Run docker images
REPOSITORY                                                                                       TAG     IMAGE ID      CREATED         SIZE
vsc-sage-monorepo-2fdb6546816a84e4081081a3ee23e99475210d4401dc22f9e4d7344ae6bcd399-features-uid  latest  384c241022f4  9 seconds ago   4.78GB
vsc-sage-monorepo-2fdb6546816a84e4081081a3ee23e99475210d4401dc22f9e4d7344ae6bcd399-features      latest  1414fd24cbc7  38 seconds ago  4.06GB
```
The post caching step says:

```
Warning: EACCES: permission denied, scandir '/var/lib/docker'
```
Based on #1750, we cannot cache Docker images as long as we have projects that build images based on local images.
`devcontainer up --cache-from /tmp/.buildx-cache --workspace-folder ../sage-monorepo` does not save data to `/tmp/.buildx-cache`, even if the folder is created beforehand.
Trying to cache `/var/lib/docker/buildkit` instead, the *Post docker cache* step says:

```
Warning: EACCES: permission denied, lstat '/var/lib/docker/buildkit'
```
We may need the new `--cache-to` option recently added: https://github.com/devcontainers/cli/pull/570. I will resume working on caching the dev container image(s) when Dev Container CLI v0.50.3 is released, as it should include the `--cache-to` option.
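Once that release is out, the two flags could presumably be combined so that the build both reads from and writes to the cached folder. A hypothetical sketch (the exact flag syntax in the CLI release may differ from buildx's `type=local` form assumed here; verify against the release notes):

```yaml
- name: Start the dev container with layer caching
  run: |
    # Assumed flag syntax, modeled on buildx's local cache backend.
    devcontainer up \
      --cache-from type=local,src=/tmp/.buildx-cache \
      --cache-to type=local,dest=/tmp/.buildx-cache \
      --workspace-folder ../sage-monorepo
```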
This was an attempt to make the GH runner run the entire job in the container with:

```yaml
runs-on: ubuntu-latest
container:
  image: ghcr.io/sage-bionetworks/sage-devcontainer:55645b0
  options: --user root
```

~~By default, GH runners are run as root.~~
The default user used by the runner is:

```
Run id
uid=1001(runner) gid=123(docker) groups=123(docker),4(adm),101(systemd-journal)
```
On the other hand, the Sage Monorepo environment has been designed to be executed by a non-root user (`vscode`). Trying to run the container as non-root (`--user vscode`) results in the checkout step failing:
```
node:internal/fs/utils:345
    throw err;
    ^

Error: EACCES: permission denied, open '/__w/_temp/_runner_file_commands/save_state_feaa9342-be15-443b-927e-e3115f27f843'
    at Object.openSync (node:fs:585:3)
    at Object.writeFileSync (node:fs:2170:35)
    at Object.appendFileSync (node:fs:2232:6)
    at Object.issueFileCommand (/__w/_actions/actions/checkout/v3/dist/index.js:2945:8)
    at Object.saveState (/__w/_actions/actions/checkout/v3/dist/index.js:2862:31)
    at Object.8647 (/__w/_actions/actions/checkout/v3/dist/index.js:2321:10)
    at __nccwpck_require__ (/__w/_actions/actions/checkout/v3/dist/index.js:18251:43)
    at Object.2565 (/__w/_actions/actions/checkout/v3/dist/index.js:146:34)
    at __nccwpck_require__ (/__w/_actions/actions/checkout/v3/dist/index.js:18251:43)
    at Object.9210 (/__w/_actions/actions/checkout/v3/dist/index.js:1141:36) {
  errno: -13,
  syscall: 'open',
  code: 'EACCES',
  path: '/__w/_temp/_runner_file_commands/save_state_feaa9342-be15-443b-927e-e3115f27f843'
}
```
See these threads:
The mismatch between the ID of the user and the permissions of the folder mounted by the runner causes the checkout step to fail. This issue is summarized here.
```yaml
steps:
  - name: Check id
    run: |
      id
      sudo ls -al /__w/_temp/
```

```
Run id
uid=1000(vscode) gid=1001(vscode) groups=1001(vscode),27(sudo),1000(docker)
total 24
drwxr-xr-x 5 1001 123  4096 Aug 17 03:55 .
drwxr-xr-x 6 1001 root 4096 Aug 17 03:54 ..
-rw-r--r-- 1 1001 123    27 Aug 17 03:55 27042907-d87a-4624-9b13-597a25316578.sh
drwxr-xr-x 2 1001 123  4096 Aug 17 03:55 _github_home
drwxr-xr-x 2 1001 123  4096 Aug 17 03:55 _github_workflow
drwxr-xr-x 2 1001 123  4096 Aug 17 03:55 _runner_file_commands
```
About the default runner user:
> Checking out a repository using actions/checkout@v2 works for me, but only if I switch to a user with sufficient privileges for the default directory, for example root or 1001 (the user used by GitHub Actions).

> I found that setting my containers to run as the same UID/GID as my GHA runner user on the host solved the issue.
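Applied to the setup above, that suggestion would mean running the container as the runner's UID/GID (1001:123, per the `id` output earlier in this thread). An untested sketch:

```yaml
runs-on: ubuntu-latest
container:
  image: ghcr.io/sage-bionetworks/sage-devcontainer:55645b0
  # Match the host runner user (uid 1001, gid 123 = docker) so that
  # the folders mounted under /__w are writable by the container user.
  options: --user 1001:123
```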
Returning to the current implementation, where we use the Dev Container CLI: the goal is to set up the Yarn cache folder and share it with the dev container. The tricky part is permissions, because the GH runner user that owns the cache folder and the user that runs in the dev container are different.
This is how the cache is usually set up:

```yaml
- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  run: echo "dir=$(yarn config get cacheFolder)" >> $GITHUB_OUTPUT
- uses: actions/cache@v3
  id: yarn-cache # use this to check for `cache-hit` (`steps.yarn-cache.outputs.cache-hit != 'true'`)
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
    restore-keys: |
      ${{ runner.os }}-yarn-
```
We can't perform the first step because Yarn is not installed on the OS used by the GH runner; instead, Yarn is available inside the dev container. Running the command in the dev container returns:
```
vscode@34f14f659357:/workspaces/sage-monorepo$ yarn config get cacheFolder
/workspaces/sage-monorepo/.yarn/cache
```
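One way to feed this into `actions/cache` would be to run the query through `devcontainer exec` and translate the container path back to the host workspace. A sketch, assuming the workspace is mounted at `/workspaces/sage-monorepo` (this step is illustrative, not part of the current workflow):

```yaml
- name: Get yarn cache directory path (inside the dev container)
  id: yarn-cache-dir-path
  run: |
    # Query Yarn inside the container, then map the container path
    # to the equivalent path on the runner's filesystem.
    dir="$(devcontainer exec --workspace-folder ../sage-monorepo yarn config get cacheFolder)"
    echo "dir=${dir/\/workspaces\/sage-monorepo/$GITHUB_WORKSPACE}" >> "$GITHUB_OUTPUT"
```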
The content of the cache folder shows that the files are owned by `runner:docker`.
```
Run ls -al /home/runner/work/sage-monorepo/sage-monorepo/.yarn/cache
total 374676
drwxr-xr-x 2 runner docker 311296 Aug  4 23:27 .
drwxr-xr-x 5 runner docker   4096 Aug 17 16:09 ..
-rw-r--r-- 1 runner docker     26 Jul  5 17:18 .gitignore
-rw-r--r-- 1 runner docker   4355 Jul  5 17:18 2-thenable-npm-1.0.0-3c202a902b-567cda6fb2.zip
-rw-r--r-- 1 runner docker   5659 Jul 20 00:43 @aashutoshrathi-word-wrap-npm-1.2.6-5b1d95e487-ada901b9e7.zip
-rw-r--r-- 1 runner docker  18024 Jul  5 17:18 @actions-exec-npm-1.1.1-90973d2f96-d976e66dd5.zip
-rw-r--r-- 1 runner docker  12407 Jul  7 22:04 @actions-github-npm-5.1.1-61d3d8cdac-2210bd7f8e.zip
...
```
Looking into the cache folder from within the container:

```yaml
- name: ls yarn cache folder inside the dev container
  run: |
    devcontainer exec --workspace-folder ../sage-monorepo bash -c ". ./dev-env.sh \
      && ls -al /workspaces/sage-monorepo/.yarn/cache"
```
Output:

```
total 374676
drwxr-xr-x 2 vscode vscode 311296 Aug  4 23:27 .
drwxr-xr-x 5 vscode vscode   4096 Aug 17 16:20 ..
-rw-r--r-- 1 vscode vscode   4355 Jul  5 17:18 2-thenable-npm-1.0.0-3c202a902b-567cda6fb2.zip
-rw-r--r-- 1 vscode vscode   5659 Jul 20 00:43 @aashutoshrathi-word-wrap-npm-1.2.6-5b1d95e487-ada901b9e7.zip
-rw-r--r-- 1 vscode vscode   6283 Jul  5 17:20 abab-npm-2.0.6-2662fba7f0-6ffc1af4ff.zip
-rw-r--r-- 1 vscode vscode   2938 Jul  5 17:20 abbrev-npm-1.1.1-3659247eab-a4a97ec07d.zip
-rw-r--r-- 1 vscode vscode  25232 Jul  5 17:20 abort-controller-npm-3.0.0-2f3a9a2bcb-170bdba9b4.zip
-rw-r--r-- 1 vscode vscode   6503 Jul  5 17:20 accepts-npm-1.3.8-9a812371c9-50c43d32e7.zip
```
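If the differing ownership ever prevented the container user from writing to a restored cache, ownership could be normalized on the host after the cache is restored. A hypothetical fix-up step (uid 1000 / gid 1001 is the `vscode` user per the `id` output earlier; this is not part of the current workflow):

```yaml
# Hypothetical step: hand the restored cache to the dev container
# user (uid 1000, gid 1001 = vscode) before dependencies are installed.
- name: Fix yarn cache ownership for the dev container user
  run: sudo chown -R 1000:1001 .yarn/cache
```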
main

The caches created by `actions/cache@v3` when running the CI workflow in a PR are not available to `main`. This makes sense, since doing otherwise would enable a third party to affect the workflows running on `main` (e.g. deliberate cache poisoning, or degrading performance by filling up the cache (max 10 GB)).

I observed this when the workflow running for `main` had the same Poetry cache key as the one used when running the workflow for a PR, yet the cache could not be found for `main`. I reran the workflow on `main` and that time the cache was available: it had been created during the first run on `main`.
The remaining tasks are to review the Nx Cloud config (though I think it is working as expected) and to adopt `pnpm`, which should be more thoroughly tested.
Added to Sprint 23.10
What projects is this feature for?
No response
Description
Background
The CI workflow has recently been updated to use the Dev Container CLI to run tasks in the dev container. This ensures that the environment used by the CI workflow is the same as the development environment used by developers. This container also provides all the tools needed by the CI workflow and eliminates the need to maintain different versions of the tools in the CI workflow.
One drawback of the current implementation is that the dev container does not cache dependencies for Python, Java, Node.js, etc. This means that dependencies need to be downloaded again from remote servers each time the CI workflow runs.
Initializing the dev container in the CI workflow is composed of two steps: 1) start the dev container and 2) run the command `workspace-install`. Together, these two steps can take up to 6 minutes.

Goal
The goal of this ticket is to explore ways to minimize the dev container initialization time. The tasks are:
Anything else?
No response
Code of Conduct