nodejs / build

Better build and test infra for Node.

Out of disk space on dockerized shared hosts #2494

Closed richardlau closed 1 year ago

richardlau commented 3 years ago

Reported via Slack:

From @Trott :

Example

10:51:23  > git config remote.origin.url git@github.com:nodejs/node.git # timeout=10
10:51:23 ERROR: Error fetching remote repo 'origin'
10:51:23 hudson.plugins.git.GitException: Failed to fetch from git@github.com:nodejs/node.git
10:51:23  at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:996)
10:51:23  at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1237)

Different example

11:01:46 collect2: fatal error: ld terminated with signal 9 [Killed]
11:01:46 compilation terminated.
11:01:46 cctest.target.mk:231: recipe for target '/home/iojs/build/workspace/node-test-commit-linux-containered/out/Debug/cctest' failed
11:01:46 make[2]: *** [/home/iojs/build/workspace/node-test-commit-linux-containered/out/Debug/cctest] Error 1
11:01:46 make[2]: *** Waiting for unfinished jobs....
11:02:58 rm fa309b4689a758e9e8a16895fbbf2b4922a45c96.intermediate
11:02:58 Makefile:104: recipe for target 'node_g' failed
11:02:58 make[1]: *** [node_g] Error 2
11:02:58 Makefile:530: recipe for target 'build-ci' failed
11:02:58 make: *** [build-ci] Error 2

From @danielleadams :

I’m running into some issues with the pull-request jobs today (releasing 15.4.0). Different tests seem to be raising the same error with git, and they are inconsistently failing:

https://ci.nodejs.org/job/node-test-commit-linux-containered/nodes=ubuntu1804_sharedlibs_debug_x64/23997/console
https://ci.nodejs.org/job/node-test-commit-linux-containered/nodes=ubuntu1804_sharedlibs_withoutssl_x64/23997/console
https://ci.nodejs.org/job/node-test-commit-linux-containered/nodes=ubuntu1804_sharedlibs_openssl110_x64/23999/console
https://ci.nodejs.org/job/node-test-commit-linux-containered/nodes=ubuntu1804_sharedlibs_openssl111_x64/24008/
https://ci.nodejs.org/job/node-test-commit-linux-containered/nodes=ubuntu1804_sharedlibs_withoutssl_x64/24008/console
https://ci.nodejs.org/job/node-test-commit-linux/nodes=alpine-latest-x64/38744/console

Does anyone know how to address this?

richardlau commented 3 years ago

It looks like the disks are full. e.g. from https://ci.nodejs.org/job/node-test-commit-linux-containered/24011/nodes=ubuntu1804_sharedlibs_withoutssl_x64/console the relevant lines are:

18:51:23 Caused by: hudson.plugins.git.GitException: Command "git config remote.origin.url git@github.com:nodejs/node.git" returned status code 4:
18:51:23 stdout: 
18:51:23 stderr: error: failed to write new configuration file /home/iojs/build/workspace/node-test-commit-linux-containered/.git/config.lock
18:51:23 
...
18:51:23 FATAL: Unable to produce a script file
18:51:23 java.io.IOException: No space left on device
18:51:23    at java.io.UnixFileSystem.createFileExclusively(Native Method)
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   94G     0 100% /
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# du -hs test-softlayer-*
4.0G    test-softlayer-alpine310_container-x64-1
2.3G    test-softlayer-alpine311_container-x64-1
2.3G    test-softlayer-alpine312_container-x64-1
3.7G    test-softlayer-alpine39_container-x64-1
2.2G    test-softlayer-ubi81_container-x64-1
2.5G    test-softlayer-ubuntu1604_arm_cross_container-x64-1
6.6G    test-softlayer-ubuntu1804_arm_cross_container-x64-1
208M    test-softlayer-ubuntu1804_container-x64-1
2.1G    test-softlayer-ubuntu1804_sharedlibs_container-x64-1
16G     test-softlayer-ubuntu1804_sharedlibs_container-x64-2
14G     test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.3G    test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.5G    test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs#

I'll try removing the 14G/16G workspaces.
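The `df -h .` check above could be scripted so that a nearly full disk is noticed before builds start failing. This is only a minimal sketch: the function names and the 90% threshold are hypothetical, and it relies on the POSIX `df -P` output format, where the fifth column of the second line is the capacity percentage.

```shell
# Hypothetical helper: print the Use% of the filesystem holding a path as a bare integer.
# df -P guarantees one line per filesystem, so field 5 of line 2 is the capacity.
use_pct() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Succeeds (exit 0) when usage is at or above the threshold (default 90, an arbitrary choice).
disk_nearly_full() {
  local path="$1"
  local threshold="${2:-90}"
  [ "$(use_pct "$path")" -ge "$threshold" ]
}

# Run from the directory of interest, as with `df -h .` in the session above.
if disk_nearly_full . 90; then
  echo "WARNING: filesystem for $(pwd) is at $(use_pct .)% usage"
fi
```

Something like this could run from cron or as a pre-build step; where to hook it in is left open here.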

richardlau commented 3 years ago

I've cleaned up the workspaces for test-softlayer-ubuntu1804_sharedlibs_container-x64-2 and test-softlayer-ubuntu1804_sharedlibs_container-x64-3 to free up 30G of space.

rvagg commented 3 years ago

feel free to clean out all of the workspaces, it's not a huge slowdown to recreate them here (like it is on the Pi hosts)

richardlau commented 3 years ago

Out of space again this morning:

root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# df .
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/xvda2     102821812 97779460    582800 100% /
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# du -hs test-softlayer-*
4.0G    test-softlayer-alpine310_container-x64-1
1.6G    test-softlayer-alpine311_container-x64-1
2.3G    test-softlayer-alpine312_container-x64-1
3.7G    test-softlayer-alpine39_container-x64-1
2.2G    test-softlayer-ubi81_container-x64-1
2.5G    test-softlayer-ubuntu1604_arm_cross_container-x64-1
6.6G    test-softlayer-ubuntu1804_arm_cross_container-x64-1
208M    test-softlayer-ubuntu1804_container-x64-1
1.8G    test-softlayer-ubuntu1804_sharedlibs_container-x64-1
15G     test-softlayer-ubuntu1804_sharedlibs_container-x64-2
16G     test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.3G    test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.4G    test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs#

So the 15G workspaces are back and look to be Debug builds. I don't know what the typical expected workspace size is -- whether the sizes have been creeping up slowly over time or whether a recent change has caused a jump.
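One way to tell slow growth from a sudden jump would be to record the per-container sizes periodically and compare them later. The sketch below is an assumption about how that might be done, not something set up in this thread; `snapshot_sizes` and the log path in the usage comment are hypothetical.

```shell
# Hypothetical helper: append one "epoch-seconds<TAB>KiB<TAB>path" line per
# immediate subdirectory of a root directory to a log file.
# du -k is used (rather than -h) so the sizes stay numerically comparable.
snapshot_sizes() {
  local root="$1"
  local log="$2"
  local now
  now=$(date +%s)
  du -sk "$root"/*/ 2>/dev/null | while read -r size path; do
    printf '%s\t%s\t%s\n' "$now" "$size" "$path"
  done >> "$log"
}

# Example (paths are illustrative only):
#   snapshot_sizes /home/iojs /var/tmp/workspace-sizes.tsv
```

Run daily, the resulting file can be sorted or graphed per path to see when a workspace started growing.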

FWIW

root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace# du -hs *
16G     node-test-commit-linux-containered
4.0K    node-test-commit-linux-containered@tmp
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace# cd node-test-commit-linux-containered
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered# du -hs *
120K    AUTHORS
4.0K    BSDmakefile
32K     BUILDING.md
56K     CHANGELOG.md
4.0K    CODE_OF_CONDUCT.md
4.0K    CONTRIBUTING.md
8.0K    GOVERNANCE.md
84K     LICENSE
48K     Makefile
32K     README.md
4.0K    SECURITY.md
4.0K    android-configure
1.5M    benchmark
4.0K    codecov.yml
20K     common.gypi
4.0K    config.gypi
4.0K    config.mk
4.0K    config.status
4.0K    configure
72K     configure.py
56K     configure.pyc
386M    deps
9.9M    doc
4.0K    env.properties
4.0K    glossary.md
72K     icu_config.gypi
2.9M    lib
0       node
52K     node.gyp
12K     node.gypi
0       node_g
16K     onboarding.md
15G     out
4.9M    src
46M     test
248K    test.tap
35M     tools
32K     vcbuild.bat
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered# cd out/
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered/out# du -hs *
14G     Debug
36K     Makefile
854M    Release
12K     cctest.target.mk
156K    deps
12K     embedtest.target.mk
8.0K    fuzz_env.target.mk
8.0K    fuzz_url.target.mk
240K    junit
40K     libnode.target.mk
12K     mkcodecache.target.mk
16K     node.target.mk
4.0K    node_dtrace_header.target.mk
4.0K    node_dtrace_provider.target.mk
4.0K    node_dtrace_ustack.target.mk
4.0K    node_etw.target.mk
12K     node_mksnapshot.target.mk
4.0K    node_text_start.target.mk
4.0K    overlapped-checker.target.mk
4.0K    specialize_node_d.target.mk
508K    tools
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered/out# cd Debug/
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered/out/Debug# du -hs *
1.7M    bytecode_builtins_list_generator
1.1G    cctest
1.1G    embedtest
34M     gen-regexp-special-case
15M     genccode
16M     icupkg
1.1G    mkcodecache
1.4G    mksnapshot
1.1G    node
1.1G    node_mksnapshot
154M    obj
62M     obj.host
6.5G    obj.target
9.9M    openssl-cli
16K     overlapped-checker
39M     torque
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered/out/Debug# ls -alh
total 6.8G
drwxr-xr-x  6 iojs iojs 4.0K Dec  8 19:53 .
drwxr-xr-x  7 iojs iojs 4.0K Dec  8 20:03 ..
drwxr-xr-x  3 iojs iojs 4.0K Dec  8 19:50 .deps
-rwxr-xr-x  1 iojs iojs 1.7M Dec  8 19:38 bytecode_builtins_list_generator
-rwxr-xr-x  1 iojs iojs 1.1G Dec  8 19:53 cctest
-rwxr-xr-x  1 iojs iojs 1.1G Dec  8 19:53 embedtest
-rwxr-xr-x  1 iojs iojs  34M Dec  8 19:40 gen-regexp-special-case
-rwxr-xr-x  1 iojs iojs  15M Dec  8 19:38 genccode
-rwxr-xr-x  1 iojs iojs  16M Dec  8 19:38 icupkg
-rwxr-xr-x  1 iojs iojs 1.1G Dec  8 19:53 mkcodecache
-rwxr-xr-x  1 iojs iojs 1.4G Dec  8 19:50 mksnapshot
-rwxr-xr-x  1 iojs iojs 1.1G Dec  8 19:54 node
-rwxr-xr-x  1 iojs iojs 1.1G Dec  8 19:53 node_mksnapshot
drwxr-xr-x  3 iojs iojs 4.0K Dec  8 19:37 obj
drwxr-xr-x  6 iojs iojs 4.0K Dec  8 19:38 obj.host
drwxr-xr-x 39 iojs iojs 4.0K Dec  8 19:53 obj.target
-rwxr-xr-x  1 iojs iojs 9.9M Dec  8 19:38 openssl-cli
-rwxr-xr-x  1 iojs iojs  16K Dec  8 19:37 overlapped-checker
-rwxr-xr-x  1 iojs iojs  39M Dec  8 19:38 torque
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered/out/Debug#
richardlau commented 3 years ago

I was waiting for the inflight https://ci.nodejs.org/job/node-test-commit-linux-containered/ jobs to complete before manually removing things but it looks like the two most recent builds passed. The current disk space usage with nothing running looks like this:

root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   82G   13G  87% /
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# du -hs *
4.0G    test-softlayer-alpine310_container-x64-1
1.6G    test-softlayer-alpine311_container-x64-1
2.3G    test-softlayer-alpine312_container-x64-1
3.7G    test-softlayer-alpine39_container-x64-1
2.2G    test-softlayer-ubi81_container-x64-1
2.5G    test-softlayer-ubuntu1604_arm_cross_container-x64-1
6.6G    test-softlayer-ubuntu1804_arm_cross_container-x64-1
208M    test-softlayer-ubuntu1804_container-x64-1
1.8G    test-softlayer-ubuntu1804_sharedlibs_container-x64-1
2.3G    test-softlayer-ubuntu1804_sharedlibs_container-x64-2
16G     test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.3G    test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.4G    test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs#

i.e. we only have one 16G (i.e. Debug build) workspace. We're probably seeing flaky behaviour when more than one https://ci.nodejs.org/job/node-test-commit-linux-containered/ job is running and we end up with two inflight Debug builds, which together fill up the disk.

richardlau commented 3 years ago

Disk full again, with 15G workspaces on two hosts (suggesting we had concurrent debug builds again):

root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   94G  298M 100% /
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# du -hs *
4.0G    test-softlayer-alpine310_container-x64-1
1.6G    test-softlayer-alpine311_container-x64-1
2.3G    test-softlayer-alpine312_container-x64-1
3.7G    test-softlayer-alpine39_container-x64-1
2.2G    test-softlayer-ubi81_container-x64-1
2.5G    test-softlayer-ubuntu1604_arm_cross_container-x64-1
6.6G    test-softlayer-ubuntu1804_arm_cross_container-x64-1
208M    test-softlayer-ubuntu1804_container-x64-1
1.8G    test-softlayer-ubuntu1804_sharedlibs_container-x64-1
15G     test-softlayer-ubuntu1804_sharedlibs_container-x64-2
16G     test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.3G    test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.4G    test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:

feel free to clean out all of the workspaces, it's not a huge slowdown to recreate them here (like it is on the Pi hosts)

I've gone and wiped all the workspaces.

root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# ls -1 | xargs -i bash -c "rm -rf {}/build/workspace/*"
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# du -hs *
162M    test-softlayer-alpine310_container-x64-1
199M    test-softlayer-alpine311_container-x64-1
191M    test-softlayer-alpine312_container-x64-1
146M    test-softlayer-alpine39_container-x64-1
218M    test-softlayer-ubi81_container-x64-1
173M    test-softlayer-ubuntu1604_arm_cross_container-x64-1
173M    test-softlayer-ubuntu1804_arm_cross_container-x64-1
208M    test-softlayer-ubuntu1804_container-x64-1
271M    test-softlayer-ubuntu1804_sharedlibs_container-x64-1
277M    test-softlayer-ubuntu1804_sharedlibs_container-x64-2
269M    test-softlayer-ubuntu1804_sharedlibs_container-x64-3
270M    test-softlayer-ubuntu1804_sharedlibs_container-x64-4
265M    test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   38G   57G  40% /
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs#
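
The one-liner above pipes `ls` into `xargs` and deletes immediately. A slightly more defensive variant, sketched here under the assumption that every container directory keeps its workspaces in `build/workspace`, loops over the glob directly and offers a dry run. `wipe_workspaces` and its `--dry-run` argument are hypothetical names.

```shell
# Hypothetical helper: empty build/workspace under every container directory in a root.
# Pass "--dry-run" as the second argument to only print what would be cleared.
wipe_workspaces() {
  local root="$1"
  local mode="${2:-}"
  local ws
  for ws in "$root"/*/build/workspace; do
    [ -d "$ws" ] || continue              # skip when the glob matched nothing
    if [ "$mode" = "--dry-run" ]; then
      echo "would clear: $ws"
    else
      # -mindepth 1 keeps the workspace directory itself; unlike a bare "*" glob,
      # find also removes dotfiles directly inside it.
      find "$ws" -mindepth 1 -delete
    fi
  done
}

# Example (destructive without --dry-run):
#   wipe_workspaces /home/iojs --dry-run
#   wipe_workspaces /home/iojs
```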
richardlau commented 3 years ago

Cleaned up test-softlayer-ubuntu1804-docker-x64-1 again. Before:

root@test-softlayer-ubuntu1804-docker-x64-1:~# df -h /home/iojs/
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   94G   97M 100% /
root@test-softlayer-ubuntu1804-docker-x64-1:~# du -hs /home/iojs/*
162M    /home/iojs/test-softlayer-alpine310_container-x64-1
1.8G    /home/iojs/test-softlayer-alpine311_container-x64-1
2.1G    /home/iojs/test-softlayer-alpine312_container-x64-1
146M    /home/iojs/test-softlayer-alpine39_container-x64-1
2.2G    /home/iojs/test-softlayer-ubi81_container-x64-1
2.0G    /home/iojs/test-softlayer-ubuntu1604_arm_cross_container-x64-1
11G     /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1
246M    /home/iojs/test-softlayer-ubuntu1804_container-x64-1
2.4G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-1
17G     /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-2
16G     /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.5G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.3G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:~#
Trott commented 3 years ago

Any chance something similar is going on with https://github.com/nodejs/build/issues/2588?

richardlau commented 3 years ago

Possibly? Maybe bring test-softlayer-ubuntu1804_sharedlibs_container-x64-3 back online and see if builds still fail on it?

Trott commented 3 years ago

Hmmm, looks like it's already back online.

rvagg commented 3 years ago

probably related to an ld failure, likely all goes back to #2573

mhdawson commented 3 years ago

@richardlau is the cleanup that originated this issue something that we might enable build helpers to be able to do?

richardlau commented 3 years ago

@richardlau is the cleanup that originated this issue something that we might enable build helpers to be able to do?

Yes, it would be a good candidate.

mhdawson commented 3 years ago

We should add it to a list somewhere, @AshCripps do you have anything like that created?

richardlau commented 3 years ago

FWIW I've kept this issue open as the underlying issue is that we run multiple containers (5 at the current time) on each docker host and typically we run into problems when two of the containers on the same host are running debug builds. I think we've only seen it happen on the softlayer host, although I don't know whether that is due to how Jenkins schedules across all the containers or because the softlayer host has less available disk space than the two digitalocean hosts (I'll check the disk sizes tomorrow).

richardlau commented 3 years ago

FWIW re. available disk space:

$ ansible -m shell -a "df -h /home/iojs" "*_docker-*x64*"
test-digitalocean-ubuntu1804_docker-x64-1 | CHANGED | rc=0 >>
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       194G  137G   58G  71% /
test-digitalocean-ubuntu1804_docker-x64-2 | CHANGED | rc=0 >>
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       194G  142G   53G  73% /
test-softlayer-ubuntu1804_docker-x64-1 | CHANGED | rc=0 >>
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   80G   15G  85% /

So it looks like the SoftLayer host has half the storage compared to the two Digital Ocean hosts.

richardlau commented 3 years ago

image

(Joyent hosts are expected to be offline.)

iojs@test-digitalocean-ubuntu1804-docker-x64-1:~$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       194G  194G     0 100% /
iojs@test-digitalocean-ubuntu1804-docker-x64-1:~$ du -hs /home/iojs/test-digitalocean*
6.0G    /home/iojs/test-digitalocean-alpine310_container-x64-1
2.4G    /home/iojs/test-digitalocean-alpine311_container-x64-1
2.4G    /home/iojs/test-digitalocean-alpine312_container-x64-1
3.9G    /home/iojs/test-digitalocean-alpine39_container-x64-1
2.7G    /home/iojs/test-digitalocean-ubi81_container-x64-1
3.4G    /home/iojs/test-digitalocean-ubuntu1604_arm_cross_container-x64-1
264M    /home/iojs/test-digitalocean-ubuntu1604_container-x64-1
7.0G    /home/iojs/test-digitalocean-ubuntu1804_arm_cross_container-x64-1
264M    /home/iojs/test-digitalocean-ubuntu1804_container-x64-1
du: cannot read directory '/home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-1/node-tmp/.tmp.2022qxobQR/middle': Permission denied
15G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-1
19G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-3
du: cannot read directory '/home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-5/node-tmp/.tmp.20220cfyNU/middle': Permission denied
15G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-5
du: cannot read directory '/home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-7/node-tmp/.tmp.2022wqQuzr/middle': Permission denied
15G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-7
4.5G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-9
iojs@test-digitalocean-ubuntu1804-docker-x64-1:~$

i.e. four debug builds

iojs@test-softlayer-ubuntu1804-docker-x64-1:~$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   76G   19G  81% /
iojs@test-softlayer-ubuntu1804-docker-x64-1:~$ du -hs /home/iojs/test-softlayer*
2.6G    /home/iojs/test-softlayer-alpine311_container-x64-1
2.6G    /home/iojs/test-softlayer-alpine312_container-x64-1
2.5G    /home/iojs/test-softlayer-ubi81_container-x64-1
2.4G    /home/iojs/test-softlayer-ubuntu1604_arm_cross_container-x64-1
13G     /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1
264M    /home/iojs/test-softlayer-ubuntu1804_container-x64-1
2.4G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-1
4.3G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-2
327M    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3
340M    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-4
du: cannot read directory '/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5/tmp/.tmp.2183JNkJXS/middle': Permission denied
12G     /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5
iojs@test-softlayer-ubuntu1804-docker-x64-1:~$

Maybe Jenkins hasn't picked up on the space freed in https://github.com/nodejs/build/issues/2611?

And for completeness

iojs@test-digitalocean-ubuntu1804-docker-x64-2:~$ df -h /home/iojs/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       194G  142G   53G  74% /
iojs@test-digitalocean-ubuntu1804-docker-x64-2:~$ du -hs /home/iojs/test-digitalocean-*
4.1G    /home/iojs/test-digitalocean-alpine310_container-x64-2
4.4G    /home/iojs/test-digitalocean-alpine311_container-x64-2
4.6G    /home/iojs/test-digitalocean-alpine312_container-x64-2
154M    /home/iojs/test-digitalocean-alpine39_container-x64-2
du: cannot read directory '/home/iojs/test-digitalocean-ubi81_container-x64-2/node-tmp/.tmp.2022GSfSJR/middle': Permission denied
4.4G    /home/iojs/test-digitalocean-ubi81_container-x64-2
4.2G    /home/iojs/test-digitalocean-ubuntu1604_arm_cross_container-x64-2
264M    /home/iojs/test-digitalocean-ubuntu1604_container-x64-2
3.1G    /home/iojs/test-digitalocean-ubuntu1804_arm_cross_container-x64-2
264M    /home/iojs/test-digitalocean-ubuntu1804_container-x64-2
du: cannot read directory '/home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-10/node-tmp/.tmp.2022n8C2fY/middle': Permission denied
1.9G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-10
2.0G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-2
4.1G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-4
2.1G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-6
3.7G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-8
iojs@test-digitalocean-ubuntu1804-docker-x64-2:~$
richardlau commented 3 years ago

Cleared the workspaces from test-digitalocean-ubuntu1804-docker-x64-1. Eventually Jenkins re-enabled the containers on that host and the softlayer one.

richardlau commented 3 years ago

The debug builds are so much bigger than the release builds that we might want to rethink our container strategy... perhaps we could get away with reducing the number of sharedlibs containers on each docker host from five to four and having a dedicated container for debug builds? That would prevent having multiple debug builds running at the same time on any single docker host. It would cut the number of available executors for the debug builds from fifteen down to three but we do not have the disk capacity to run five debug builds (absolute worst case scenario in the current setup) on any of our docker hosts.
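
The capacity argument can be put into rough numbers. This is a back-of-the-envelope sketch with a hypothetical helper, `max_debug_builds`. For the softlayer host, 95 is Used plus Avail from the `df` output after the full wipe (38G + 57G), 38 is the usage with every workspace wiped, and 16 is the size of one debug workspace. The 30 for the remaining workspaces is only an approximate sum of the non-debug entries in the `du` listings, so treat it as an assumption.

```shell
# Hypothetical helper: how many debug workspaces fit on a host at once?
# All arguments are whole GiB: usable capacity, usage with every workspace wiped,
# space taken by the non-debug workspaces, and the size of one debug workspace.
max_debug_builds() {
  local usable="$1"
  local baseline="$2"
  local other="$3"
  local debug="$4"
  local free=$(( usable - baseline - other ))
  if [ "$free" -lt 0 ]; then free=0; fi
  echo $(( free / debug ))
}

# Softlayer host, figures rounded from the listings in this thread: (95 - 38 - 30) / 16
max_debug_builds 95 38 30 16   # prints 1
```

With those inputs only one debug workspace fits, which matches the observation that two concurrent debug builds fill the disk.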

rvagg commented 3 years ago

yeah, that's not a bad idea. We could also do post-build cleanup, I don't think we have that enabled for these builds.

richardlau commented 3 years ago

Post build cleanup is an option but wouldn't prevent disk space issues for multiple debug builds in progress on the same docker host at the same time (but would at least recover the space for the next builds).
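
As an illustration of what a post-build cleanup step might look like, here is a minimal sketch. It assumes the bulk of the space sits in `out/Debug`, as in the `du` breakdown earlier in this issue. `clean_debug_output` is a hypothetical name, and this is not the job's actual configuration.

```shell
# Hypothetical post-build step: drop the Debug output tree from a workspace,
# leaving the checkout and any Release output in place.
clean_debug_output() {
  local workspace="$1"
  local debug_dir="$workspace/out/Debug"
  [ -d "$debug_dir" ] || return 0         # nothing to clean, e.g. a Release-only build
  rm -rf -- "$debug_dir"
}

# In a Jenkins shell step, WORKSPACE is set by Jenkins to the job's workspace directory:
#   clean_debug_output "$WORKSPACE"
```

As noted above, this only recovers space after a build finishes; it does nothing for two debug builds that are in progress at the same time.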

I'll add separating out the debug builds into its own container to my list of things to do.

richardlau commented 3 years ago

All the softlayer containers were automatically marked offline in Jenkins due to low disk space. FTR:

root@test-softlayer-ubuntu1804-docker-x64-1:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  1.6M  3.2G   1% /run
/dev/xvda2       99G   94G   96M 100% /
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/xvda1      240M  104M  124M  46% /boot
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/b5f1237849a699cab8de71d79c8f93057f40a3f4c2a5e471839461180f54a566/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/6b3a9c2755a5e07741d33a049cccdc5379257b8bc14fd1b61cac388e45e943af/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/8ee005980aa672b8043512e5a725fc028030c6c8d1b9c136198640c623936538/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/ab4133c96644f6a0a829e4b0cb76ffe33f8d838cb0458843fc6dc4f85e981f5d/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/6e122db71282eebedf790d6f13111c63480e1bb8388ba89fbf1b7619609d8a3b/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/1a5b621e6e2d415a4a20ef0595ca1f66a42c20dcab15e7535a663e7b489e7916/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/0647366017fc8137ba120a6d7db85e9737ae090a96fdcbe946d75d89ff1098fd/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/0b96411bd91465cdce13d8a47758995f0c066a341e4f77c1c7b80366cb653ecb/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/c542054ec9703c9d5e7d42c23c0dbf6a3469a487370b83ccd0aceb238192a003/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/32e66c944d9d650f4913ff7001647ce82ded462d6417db6e1cbcb166ff484199/merged
overlay          99G   94G   96M 100% /var/lib/docker/overlay2/c7f8c6a9c3790ace1b8ebcbc3f3e5409d02ce730f950e261db1a58f88c69b382/merged
tmpfs           3.2G     0  3.2G   0% /run/user/0
root@test-softlayer-ubuntu1804-docker-x64-1:~# cd /home/iojs
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs# du -hs test-softlayer-*/build/workspace/*
2.3G    test-softlayer-alpine311_container-x64-1/build/workspace/node-test-commit-linux
1.2G    test-softlayer-alpine311_container-x64-1/build/workspace/node-test-commit-linux-richardlau
2.3G    test-softlayer-alpine312_container-x64-1/build/workspace/node-test-commit-linux
2.2G    test-softlayer-ubi81_container-x64-1/build/workspace/node-test-commit-linux-containered
4.0K    test-softlayer-ubi81_container-x64-1/build/workspace/node-test-commit-linux-containered@tmp
2.2G    test-softlayer-ubuntu1604_arm_cross_container-x64-1/build/workspace/node-cross-compile
20K     test-softlayer-ubuntu1604_arm_cross_container-x64-1/build/workspace/node-cross-compile@tmp
7.7G    test-softlayer-ubuntu1804_arm_cross_container-x64-1/build/workspace/node-cross-compile
32K     test-softlayer-ubuntu1804_arm_cross_container-x64-1/build/workspace/node-cross-compile@tmp
9.5G    test-softlayer-ubuntu1804_sharedlibs_container-x64-1/build/workspace/node-test-commit-linux-containered
4.0K    test-softlayer-ubuntu1804_sharedlibs_container-x64-1/build/workspace/node-test-commit-linux-containered@tmp
1.7G    test-softlayer-ubuntu1804_sharedlibs_container-x64-2/build/workspace/node-test-commit-linux-containered
4.0K    test-softlayer-ubuntu1804_sharedlibs_container-x64-2/build/workspace/node-test-commit-linux-containered@tmp
4.0K    test-softlayer-ubuntu1804_sharedlibs_container-x64-3/build/workspace/node-test-commit-linux-containered@tmp
18G     test-softlayer-ubuntu1804_sharedlibs_container-x64-4/build/workspace/node-test-commit-linux-containered
4.0K    test-softlayer-ubuntu1804_sharedlibs_container-x64-4/build/workspace/node-test-commit-linux-containered@tmp
2.3G    test-softlayer-ubuntu1804_sharedlibs_container-x64-5/build/workspace/node-test-commit-linux-containered
4.0K    test-softlayer-ubuntu1804_sharedlibs_container-x64-5/build/workspace/node-test-commit-linux-containered@tmp
root@test-softlayer-ubuntu1804-docker-x64-1:/home/iojs#

Cleared test-softlayer-ubuntu1804_sharedlibs_container-x64-1/build/workspace/node-test-commit-linux-containered and test-softlayer-ubuntu1804_sharedlibs_container-x64-4/build/workspace/node-test-commit-linux-containered and the hosts (eventually) enabled themselves again.

I've also gone and removed the ubuntu1804_sharedlibs_debug_x64 label in Jenkins from four of the containers at softlayer (-2 to -5) meaning that only test-softlayer-ubuntu1804_sharedlibs_container-x64-1 will schedule debug builds. This seems a quick way to prevent multiple debug builds on the softlayer host -- will need to keep a look out to see if that causes issues for the digital ocean containers. If it does I'll look at the suggestion I made in https://github.com/nodejs/build/issues/2494#issuecomment-814803868.

richardlau commented 3 years ago

Softlayer docker host is out of space again, reported in https://github.com/nodejs/build/issues/2664 and https://github.com/nodejs/build/issues/2665.

root@test-softlayer-ubuntu1804-docker-x64-1:~# du -hs /home/iojs/*
0       /home/iojs/
2.5G    /home/iojs/test-softlayer-alpine311_container-x64-1
2.5G    /home/iojs/test-softlayer-alpine312_container-x64-1
2.5G    /home/iojs/test-softlayer-ubi81_container-x64-1
269M    /home/iojs/test-softlayer-ubuntu1604_arm_cross_container-x64-1
33G     /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1
290M    /home/iojs/test-softlayer-ubuntu1804_container-x64-1
2.3G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-1
2.4G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-2
2.5G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.6G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.5G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:~#

33G for /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1 being the outlier here.

richardlau commented 3 years ago

I've removed the /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1/build/workspace/node-cross-compile directory.

root@test-softlayer-ubuntu1804-docker-x64-1:~# du -hs /home/iojs/*
0       /home/iojs/
2.5G    /home/iojs/test-softlayer-alpine311_container-x64-1
2.5G    /home/iojs/test-softlayer-alpine312_container-x64-1
2.5G    /home/iojs/test-softlayer-ubi81_container-x64-1
269M    /home/iojs/test-softlayer-ubuntu1604_arm_cross_container-x64-1
269M    /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1
290M    /home/iojs/test-softlayer-ubuntu1804_container-x64-1
2.3G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-1
2.4G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-2
2.5G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3
2.6G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-4
2.5G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5
root@test-softlayer-ubuntu1804-docker-x64-1:~#
richardlau commented 3 years ago

test-softlayer-ubuntu1804_docker-x64-1 was out of space. The test-softlayer-ubuntu1804_arm_cross_container-x64-1 workspace on it was 27G -- I've deleted it.

mhdawson commented 3 years ago

@richardlau is this something that occurs often and would it be a good candidate to add to the AWX jobs build helpers can run?

richardlau commented 2 years ago

The softlayer containers were offline at the beginning of this month (https://github.com/nodejs/build/issues/2803) and that was due to the .git folder in the arm cross-compile workspaces not being cleaned/pruned.
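
For the unpruned `.git` case, one possible mitigation is to measure the `.git` directory and garbage-collect it once it passes a limit. This is a sketch only: the thread does not say how #2803 was actually resolved, `prune_git_dir` and the 5 GiB default are hypothetical, and it is an assumption that `git gc --prune=now` is the appropriate remedy here.

```shell
# Hypothetical helper: print the size of a workspace's .git directory in KiB and
# garbage-collect it when it exceeds a limit (default 5 GiB, an arbitrary choice).
prune_git_dir() {
  local workspace="$1"
  local limit_kb="${2:-5242880}"
  local git_dir="$workspace/.git"
  local size_kb
  [ -d "$git_dir" ] || return 0           # not a git checkout, nothing to do
  size_kb=$(du -sk "$git_dir" | cut -f1)
  if [ "$size_kb" -gt "$limit_kb" ]; then
    # Repack the repository and prune unreachable loose objects regardless of age.
    git -C "$workspace" gc --quiet --prune=now >/dev/null
  fi
  echo "$size_kb"
}

# Example (path taken from the listings above):
#   prune_git_dir /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1/build/workspace/node-cross-compile
```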

I had to wipe out the workspaces on the softlayer containers on Tuesday as we were out of space again. This didn't seem related to https://github.com/nodejs/build/issues/2803 (i.e. the cross-compile workspaces seemed to be a reasonable size).

Today the containers on test-digitalocean-ubuntu1804_docker-x64-2 are offline for space reasons: image

I'll investigate later this afternoon (I have a medical appointment to attend first).

richardlau commented 2 years ago

Current test-digitalocean-ubuntu1804-docker-x64-2 space usage:

root@test-digitalocean-ubuntu1804-docker-x64-2:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  324M  2.9G  11% /run
/dev/vda1       194G  194G     0 100% /
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/vda15      105M  6.7M   98M   7% /boot/efi
overlay         194G  194G     0 100% /var/lib/docker/overlay2/9bd8b27d2bb7e293a55a1b91cccf15093602dc839ce49a67cb4f7ca221028b66/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/b906535907d92563295f4345c2ffffc3c762fc67848849d1e869aa2e40ed2889/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/c008e6cf0135cd79f85f14fd4aa46c0f4e1f67b4439ee007b9161c4b9f4d4265/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/9cc63648a6ba2d48ca6753210dc0959b6e45aea6277cb010251b158f61a2b584/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/5cac407aba5a15f229addd9904605df7fbad3a241144e23b57e45965cd51c390/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/cb19d48ebf158b985f06438c60e1e39a89c8c6aa51ab8ed7407173c8b9cef984/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/508bb2ca92eaa09f49f102d4215f3f37f3cbfa2d018d4fc9257d2a2e9c79088e/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/2de854f1186395900422662e192607148094b383462ba8d7b56149fe84dd3b28/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/2052de449009e2d9f9f17c7cfc8b43bfd981a2eeb956f816f09ccb859a216032/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/37e866d149ac586cbf5f57000fc8f8263833336b66f791b6a77378907aa8a1df/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/36320a9f453d1da6b0e498d13b03853e794b1e21ed39da41c2ee6054df3fabdd/merged
overlay         194G  194G     0 100% /var/lib/docker/overlay2/026ab3aa9261e923ae35edb2ae5ac315cc444da187ef24eb23c77e49cbd82224/merged
tmpfs           3.2G     0  3.2G   0% /run/user/0
root@test-digitalocean-ubuntu1804-docker-x64-2:~# du -hs /home/iojs/.ccache /home/iojs/*
39G     /home/iojs/.ccache
17M     /home/iojs/jenkins_diagnostics.txt
197M    /home/iojs/remoting
852K    /home/iojs/slave.jar
2.9G    /home/iojs/test-digitalocean-alpine311_container-x64-2
2.7G    /home/iojs/test-digitalocean-alpine312_container-x64-2
2.5G    /home/iojs/test-digitalocean-ubi81_container-x64-2
2.8G    /home/iojs/test-digitalocean-ubuntu1604_arm_cross_container-x64-2
340M    /home/iojs/test-digitalocean-ubuntu1604_container-x64-2
3.1G    /home/iojs/test-digitalocean-ubuntu1804_arm_cross_container-x64-2
340M    /home/iojs/test-digitalocean-ubuntu1804_container-x64-2
23G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-10
23G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-2
23G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-4
23G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-6
23G     /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-8
11M     /home/iojs/tools
4.0K    /home/iojs/workspace
root@test-digitalocean-ubuntu1804-docker-x64-2:~#

richardlau commented 2 years ago

The obvious outlier is that it looks like all 5 sharedlibs_containers were trying to build debug builds at the same time. I think this has been exacerbated by the increase in build times (being addressed by https://github.com/nodejs/node/pull/40934) which is leading to builds being queued up and keeping all of the containers busy.

I've run git clean -fdX in all the workspaces to reclaim space. Removing the workspace directories entirely would also reclaim space, but that tends to make the first jobs that run afterwards fail to resolve refs/remotes/origin/_jenkins_local_branch:

root@test-digitalocean-ubuntu1804-docker-x64-2:~# find /home/iojs/*/build/workspace/* -type d -prune -not -name *@tmp -exec sh -c "cd {} && git clean -fdX" \;
...
root@test-digitalocean-ubuntu1804-docker-x64-2:~# du -hs /home/iojs/.ccache /home/iojs/*
39G     /home/iojs/.ccache
17M     /home/iojs/jenkins_diagnostics.txt
197M    /home/iojs/remoting
852K    /home/iojs/slave.jar
1.8G    /home/iojs/test-digitalocean-alpine311_container-x64-2
1.7G    /home/iojs/test-digitalocean-alpine312_container-x64-2
1.8G    /home/iojs/test-digitalocean-ubi81_container-x64-2
2.3G    /home/iojs/test-digitalocean-ubuntu1604_arm_cross_container-x64-2
340M    /home/iojs/test-digitalocean-ubuntu1604_container-x64-2
2.3G    /home/iojs/test-digitalocean-ubuntu1804_arm_cross_container-x64-2
340M    /home/iojs/test-digitalocean-ubuntu1804_container-x64-2
1.9G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-10
1.8G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-2
1.9G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-4
2.0G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-6
1.9G    /home/iojs/test-digitalocean-ubuntu1804_sharedlibs_container-x64-8
11M     /home/iojs/tools
4.0K    /home/iojs/workspace
root@test-digitalocean-ubuntu1804-docker-x64-2:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  1.6M  3.2G   1% /run
/dev/vda1       194G   88G  106G  46% /
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/vda15      105M  6.7M   98M   7% /boot/efi
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/9bd8b27d2bb7e293a55a1b91cccf15093602dc839ce49a67cb4f7ca221028b66/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/b906535907d92563295f4345c2ffffc3c762fc67848849d1e869aa2e40ed2889/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/c008e6cf0135cd79f85f14fd4aa46c0f4e1f67b4439ee007b9161c4b9f4d4265/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/9cc63648a6ba2d48ca6753210dc0959b6e45aea6277cb010251b158f61a2b584/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/5cac407aba5a15f229addd9904605df7fbad3a241144e23b57e45965cd51c390/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/cb19d48ebf158b985f06438c60e1e39a89c8c6aa51ab8ed7407173c8b9cef984/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/508bb2ca92eaa09f49f102d4215f3f37f3cbfa2d018d4fc9257d2a2e9c79088e/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/2de854f1186395900422662e192607148094b383462ba8d7b56149fe84dd3b28/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/2052de449009e2d9f9f17c7cfc8b43bfd981a2eeb956f816f09ccb859a216032/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/37e866d149ac586cbf5f57000fc8f8263833336b66f791b6a77378907aa8a1df/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/36320a9f453d1da6b0e498d13b03853e794b1e21ed39da41c2ee6054df3fabdd/merged
overlay         194G   88G  106G  46% /var/lib/docker/overlay2/026ab3aa9261e923ae35edb2ae5ac315cc444da187ef24eb23c77e49cbd82224/merged
tmpfs           3.2G     0  3.2G   0% /run/user/0
root@test-digitalocean-ubuntu1804-docker-x64-2:~#
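For reference, a slightly more defensive sketch of the same cleanup. It assumes the /home/iojs/*/build/workspace/* layout shown above; the function name clean_workspaces and the DRY_RUN variable are made up for illustration and are not part of the build repo. It skips the Jenkins @tmp directories without relying on an unquoted glob, and only touches directories that are git checkouts.

```shell
#!/usr/bin/env bash
# Sketch: remove ignored build output from each Jenkins workspace.
# clean_workspaces and DRY_RUN are hypothetical names used for illustration.
clean_workspaces() {
  local root="${1:-/home/iojs}"
  local ws
  for ws in "$root"/*/build/workspace/*; do
    [ -d "$ws" ] || continue               # skip unmatched globs and plain files
    case "$ws" in *@tmp) continue ;; esac  # skip Jenkins @tmp directories
    [ -d "$ws/.git" ] || continue          # only touch git checkouts
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would clean: $ws"              # default: report only
    else
      git -C "$ws" clean -fdX              # -X removes only files ignored by git
    fi
  done
}
```

By default this only prints what it would clean; calling it with DRY_RUN=0 runs git clean for real.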

I'll try and find some time to implement the separation of the debug builds (ref: https://github.com/libuv/libuv/issues/3349#issuecomment-957994866).

richardlau commented 2 years ago

@mhdawson turned off the x64 debug builds for master (Node.js 18 onwards) (https://github.com/nodejs/build/issues/2837#issuecomment-999824238) which has alleviated the disk space pressure somewhat.

There remains a discrepancy in available disk space between the SoftLayer (IBM) host and the two DigitalOcean ones (https://github.com/nodejs/build/issues/2494#issuecomment-811858589). e.g.

$ ssh test-digitalocean-ubuntu1804_docker-x64-1 df -h /home/iojs
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       194G  106G   89G  55% /
$ ssh test-digitalocean-ubuntu1804_docker-x64-2 df -h /home/iojs
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       194G   98G   97G  51% /
$ ssh test-softlayer-ubuntu1804_docker-x64-1 df -h /home/iojs
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       99G   79G   16G  84% /
$

I think it makes sense to bump the storage on the SoftLayer machine to 200GB (I think @mhdawson and I may have had this conversation some time ago). I believe on IBM Cloud this is done by adding "Portable storage". We currently have: [image]. The recommendation is that the portable storage be in the same location as the server it's being attached to -- our SoftLayer docker host is in Dallas 13. I propose we remove the unattached portable storage and then resize test-softlayer-ubuntu1804-docker-x64-1 with an extra 200GB portable storage.

(n.b. I have a vague recollection that the unattached "jenkins-release-new" is from when we had to rebuild the release CI and had issues attaching the storage and had to get IBM support involved. The release CI server is currently using the shown attached "jenkins-release" portable storage.)
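The per-host df comparison in this comment could be turned into a simple threshold check, roughly as sketched here. The helper names usage_pct and check_usage and the 80% default are assumptions for illustration only.

```shell
# Sketch: flag a filesystem whose usage is at or above a threshold.
# usage_pct and check_usage are hypothetical helper names.

# Read `df -P <path>` (or `df -h <path>`) output on stdin and print the
# Use% column of the first data row as a bare number.
usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_usage <label> <pct> [threshold]
# Prints a warning and returns 1 when pct >= threshold (default 80).
check_usage() {
  local label="$1" pct="$2" threshold="${3:-80}"
  if [ "$pct" -ge "$threshold" ]; then
    echo "WARN $label at ${pct}% (threshold ${threshold}%)"
    return 1
  fi
  echo "ok $label at ${pct}%"
}

# Example (needs ssh access to the hosts, so it is left commented out):
#   for h in test-digitalocean-ubuntu1804_docker-x64-1 test-softlayer-ubuntu1804_docker-x64-1; do
#     check_usage "$h" "$(ssh "$h" df -P /home/iojs | usage_pct)"
#   done
```

With the numbers above, the SoftLayer host (84%) would be flagged while both DigitalOcean hosts (55% and 51%) would pass.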

mhdawson commented 2 years ago

I have the same vague memory as you. +1 on your suggestions.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

richardlau commented 1 year ago

Noticed we've run out of space again on the softlayer host. e.g. https://ci.nodejs.org/job/node-cross-compile/42407/nodes=cross-compiler-rhel8-armv7-gcc-8-glibc-2.28/console

21:14:03 + git gc
21:15:01 fatal: sha1 file '.git/objects/pack/tmp_idx_Dojb52' write error: No space left on device
21:15:01 fatal: failed to run repack

On the host

iojs@test-softlayer-ubuntu1804-docker-x64-1:~$ du -hs /home/iojs/*
0       /home/iojs/
du: cannot access '/home/iojs/test-softlayer-alpine311_container-x64-1/node-tmp/.tmp.2047IpbCEh/middle/leaf': Permission denied
3.4G    /home/iojs/test-softlayer-alpine311_container-x64-1
3.4G    /home/iojs/test-softlayer-alpine312_container-x64-1
6.1G    /home/iojs/test-softlayer-rhel8_arm_cross_container-x64-1
du: cannot access '/home/iojs/test-softlayer-ubi81_container-x64-1/node-tmp/.tmp.20478LZJ1N/middle/leaf': Permission denied
3.3G    /home/iojs/test-softlayer-ubi81_container-x64-1
372M    /home/iojs/test-softlayer-ubuntu1604_arm_cross_container-x64-1
2.8G    /home/iojs/test-softlayer-ubuntu1804_arm_cross_container-x64-1
471M    /home/iojs/test-softlayer-ubuntu1804_container-x64-1
du: cannot access '/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-1/node-tmp/.tmp.2047eB1isj/middle/leaf': Permission denied
3.0G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-1
3.2G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-2
du: cannot access '/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3/node-tmp/.tmp.2047YdGl5b/middle/leaf': Permission denied
2.9G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-3
du: cannot access '/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-4/tmp/.tmp.2130N0UDsX/middle/leaf': Permission denied
3.5G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-4
du: cannot read directory '/home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5/tmp/.tmp.2183JNkJXS/middle': Permission denied
2.9G    /home/iojs/test-softlayer-ubuntu1804_sharedlibs_container-x64-5
iojs@test-softlayer-ubuntu1804-docker-x64-1:~$

I've removed the 6G cross-compile workspace (rm -rf /home/iojs/test-softlayer-rhel8_arm_cross_container-x64-1/build/workspace/node-cross-compile*).
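A small sketch for spotting the largest directories quickly, along the lines of the du run above. largest_dirs is a hypothetical helper name, and the /home/iojs default is taken from this host's layout.

```shell
# Sketch: list the N largest entries directly under a directory, biggest first.
# largest_dirs is a hypothetical helper name.
largest_dirs() {
  local root="${1:-/home/iojs}" n="${2:-5}"
  # -s: one total per argument; -k: sizes in KiB so sort -n compares them correctly.
  # Errors such as "Permission denied" are discarded, so totals can be
  # slightly low unless this is run as root.
  du -sk "$root"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

For example, largest_dirs /home/iojs 3 prints the three biggest container directories with their sizes in KiB.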

richardlau commented 1 year ago

Softlayer host was out of space again today.

root@test-softlayer-ubuntu1804-docker-x64-1:~# df
Filesystem     1K-blocks     Used Available Use% Mounted on
udev            16443300        0  16443300   0% /dev
tmpfs            3291328     1608   3289720   1% /run
/dev/xvda2     102821812 98460812         0 100% /
tmpfs           16456636        0  16456636   0% /dev/shm
tmpfs               5120        0      5120   0% /run/lock
tmpfs           16456636        0  16456636   0% /sys/fs/cgroup
/dev/xvda1        245679   107857    124715  47% /boot
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/47e5cf38674966721061004a34503ea379b6aaa73d962908775e22154f9240cf/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/383b99d8ab9bb1e7039652ddb718a7d4593df7efd463885b531176abf08de51e/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/c10c26015d495c80165a1ac0dbaa70f62c0202147775e0bb6c9b1c84b41e3f68/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/2a9afb5a6cbfeff74aa5894e96e2de94c5837ca8d684592f402f7bb59e794346/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/b62ec2f63d4baaa3d9d716a2453c35166936c7c8e7e3289db8d98e95cea6de82/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/8ea5a7396d62e9e8a60e08933a35443cb0072384edc5d8994e58a56bb968e951/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/5b8cd61827b6ca9a9ba3e3d99c194347b9fb9db76327ce074d128dac024bed86/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/1154c46c536260bac80f4d674f43737e7a1cadf925c86b93840430eddaab3c9a/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/b8e764fed9ed28540f025173d864fc81aa9ba63a207f3d52e839d22d5b7ec880/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/9a132e34cf47af1b23ea2efdd725736a8c6ed0ff112b0c653cafd16fb472773a/merged
overlay        102821812 98460812         0 100% /var/lib/docker/overlay2/61dbd42d751bf2683c8d684214cf0c87e9bcfbfeab52d251952363c1d556d50d/merged
tmpfs            3291324        0   3291324   0% /run/user/0
root@test-softlayer-ubuntu1804-docker-x64-1:~#

I've run docker image prune and that has reclaimed 1.9 G of space.

root@test-softlayer-ubuntu1804-docker-x64-1:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  1.6M  3.2G   1% /run
/dev/xvda2       99G   92G  1.9G  98% /
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/xvda1      240M  106M  122M  47% /boot
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/47e5cf38674966721061004a34503ea379b6aaa73d962908775e22154f9240cf/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/383b99d8ab9bb1e7039652ddb718a7d4593df7efd463885b531176abf08de51e/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/c10c26015d495c80165a1ac0dbaa70f62c0202147775e0bb6c9b1c84b41e3f68/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/2a9afb5a6cbfeff74aa5894e96e2de94c5837ca8d684592f402f7bb59e794346/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/b62ec2f63d4baaa3d9d716a2453c35166936c7c8e7e3289db8d98e95cea6de82/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/8ea5a7396d62e9e8a60e08933a35443cb0072384edc5d8994e58a56bb968e951/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/5b8cd61827b6ca9a9ba3e3d99c194347b9fb9db76327ce074d128dac024bed86/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/1154c46c536260bac80f4d674f43737e7a1cadf925c86b93840430eddaab3c9a/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/b8e764fed9ed28540f025173d864fc81aa9ba63a207f3d52e839d22d5b7ec880/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/9a132e34cf47af1b23ea2efdd725736a8c6ed0ff112b0c653cafd16fb472773a/merged
overlay          99G   92G  1.9G  98% /var/lib/docker/overlay2/61dbd42d751bf2683c8d684214cf0c87e9bcfbfeab52d251952363c1d556d50d/merged
tmpfs           3.2G     0  3.2G   0% /run/user/0
root@test-softlayer-ubuntu1804-docker-x64-1:~#

targos commented 1 year ago

I've just run an additional docker system prune -a: Total reclaimed space: 22.61GB

root@test-softlayer-ubuntu1804-docker-x64-1:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  1.6M  3.2G   1% /run
/dev/xvda2       99G   81G   13G  87% /
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/xvda1      240M  106M  122M  47% /boot
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/47e5cf38674966721061004a34503ea379b6aaa73d962908775e22154f9240cf/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/383b99d8ab9bb1e7039652ddb718a7d4593df7efd463885b531176abf08de51e/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/c10c26015d495c80165a1ac0dbaa70f62c0202147775e0bb6c9b1c84b41e3f68/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/2a9afb5a6cbfeff74aa5894e96e2de94c5837ca8d684592f402f7bb59e794346/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/b62ec2f63d4baaa3d9d716a2453c35166936c7c8e7e3289db8d98e95cea6de82/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/8ea5a7396d62e9e8a60e08933a35443cb0072384edc5d8994e58a56bb968e951/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/5b8cd61827b6ca9a9ba3e3d99c194347b9fb9db76327ce074d128dac024bed86/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/1154c46c536260bac80f4d674f43737e7a1cadf925c86b93840430eddaab3c9a/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/b8e764fed9ed28540f025173d864fc81aa9ba63a207f3d52e839d22d5b7ec880/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/9a132e34cf47af1b23ea2efdd725736a8c6ed0ff112b0c653cafd16fb472773a/merged
overlay          99G   81G   13G  87% /var/lib/docker/overlay2/61dbd42d751bf2683c8d684214cf0c87e9bcfbfeab52d251952363c1d556d50d/merged
tmpfs           3.2G     0  3.2G   0% /run/user/0
richardlau commented 1 year ago

> I think it makes sense to bump the storage on the SoftLayer machine to 200GB (I think @mhdawson and I may have had this conversation some time ago). I believe on IBM Cloud this is done by adding "Portable storage". We currently have: [image]. The recommendation is that the portable storage be in the same location as the server it's being attached to -- our SoftLayer docker host is in Dallas 13. I propose we remove the unattached portable storage and then resize test-softlayer-ubuntu1804-docker-x64-1 with an extra 200GB portable storage.

I've deleted the unattached Dallas 5 portable storage and requested a new 200GB SAN for test-softlayer-ubuntu1804-docker-x64-1.

richardlau commented 1 year ago

root@test-softlayer-ubuntu1804-docker-x64-1:~# sudo fdisk /dev/xvdc

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xf9d84c67.

Command (m for help): p
Disk /dev/xvdc: 200 GiB, 214748364800 bytes, 419430400 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xf9d84c67

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-419430399, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-419430399, default 419430399):

Created a new partition 1 of type 'Linux' and of size 200 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

root@test-softlayer-ubuntu1804-docker-x64-1:~# sudo fdisk /dev/xvdc

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): p
Disk /dev/xvdc: 200 GiB, 214748364800 bytes, 419430400 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xf9d84c67

Device     Boot Start       End   Sectors  Size Id Type
/dev/xvdc1       2048 419430399 419428352  200G 83 Linux

Command (m for help): q

root@test-softlayer-ubuntu1804-docker-x64-1:~# sudo mkfs -t ext4 /dev/xvdc1
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 52428544 4k blocks and 13107200 inodes
Filesystem UUID: 09f6b200-7dda-4b9e-b16a-550b14cc1fd5
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

root@test-softlayer-ubuntu1804-docker-x64-1:~#

richardlau commented 1 year ago

Copying /home/* over to the new disk.

richardlau commented 1 year ago

/etc/fstab has been updated to mount the new disk at /home. I've checked that it works as expected after a reboot. Going to close this as done. The containers on test-softlayer-ubuntu1804-docker-x64-1 are still deliberately offline while I'm testing https://github.com/nodejs/build/pull/3371.
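For anyone repeating this on another host, here is a sketch of what such an fstab entry might look like. The actual entry on the host is not shown in this thread, so the mount point (/home), the mount-by-UUID form and the options are all assumptions; fstab_line is a hypothetical helper.

```shell
# Sketch: build an /etc/fstab entry that mounts an ext4 filesystem by UUID.
# fstab_line is a hypothetical helper; the real entry on the host may differ.
fstab_line() {
  local uuid="$1" mountpoint="$2"
  # Fields: device, mount point, fs type, options, dump flag, fsck pass order.
  printf 'UUID=%s %s ext4 defaults 0 2\n' "$uuid" "$mountpoint"
}

# Example using the UUID printed by mkfs earlier in this thread:
#   fstab_line 09f6b200-7dda-4b9e-b16a-550b14cc1fd5 /home
# After appending the line to /etc/fstab, `mount -a` can confirm that the
# entry mounts cleanly before rebooting.
```

Mounting by UUID rather than /dev/xvdc1 keeps the entry valid if device names change across reboots.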