Open targos opened 3 months ago
/tmp
on test-equinix_mnx-smartos20-x64-3 is full:
$ ansible -m command -a "df -h" test-equinix_mnx-smartos20-x64-?
test-equinix_mnx-smartos20-x64-3 | CHANGED | rc=0 >>
Filesystem Size Used Avail Use% Mounted on
zones/356655a2-12e6-e1d7-ac7b-b5188ad37cb0 100G 30G 71G 30% /
/.zonecontrol 56T 57G 56T 1% /.zonecontrol
/lib 314M 292M 23M 93% /lib
/lib/svc/manifest 56T 1.5M 56T 1% /lib/svc/manifest
/usr 433M 394M 40M 91% /usr
swap 16G 4.6G 12G 29% /etc/svc/volatile
swap 4.0G 4.0G 0 100% /tmp
swap 16G 4.6G 12G 29% /var/run
test-equinix_mnx-smartos20-x64-4 | CHANGED | rc=0 >>
Filesystem Size Used Avail Use% Mounted on
zones/c6e3d47a-1421-ee11-c52d-c3c80c198e95 100G 25G 76G 25% /
/.zonecontrol 56T 57G 56T 1% /.zonecontrol
/lib 314M 292M 23M 93% /lib
/lib/svc/manifest 56T 1.5M 56T 1% /lib/svc/manifest
/usr 433M 394M 40M 91% /usr
swap 16G 2.3G 14G 15% /etc/svc/volatile
swap 4.0G 1.8G 2.3G 45% /tmp
swap 16G 2.3G 14G 15% /var/run
It's full of node-coverage-*
directories. I've removed them:
[root@test-equinix_mnx-smartos20-x64-3 ~]# rm -rf /tmp/node-coverage-*
[root@test-equinix_mnx-smartos20-x64-3 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
zones/356655a2-12e6-e1d7-ac7b-b5188ad37cb0 100G 30G 71G 30% /
/.zonecontrol 56T 57G 56T 1% /.zonecontrol
/lib 314M 292M 23M 93% /lib
/lib/svc/manifest 56T 1.5M 56T 1% /lib/svc/manifest
/usr 433M 394M 40M 91% /usr
swap 16G 955M 16G 6% /etc/svc/volatile
swap 4.0G 44K 4.0G 1% /tmp
swap 16G 955M 16G 6% /var/run
[root@test-equinix_mnx-smartos20-x64-3 ~]#
I've cleared out /tmp/node-coverage-*
on test-equinix_mnx-smartos20-x64-4 as well:
[root@test-equinix_mnx-smartos20-x64-4 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
zones/c6e3d47a-1421-ee11-c52d-c3c80c198e95 100G 25G 76G 25% /
/.zonecontrol 56T 57G 56T 1% /.zonecontrol
/lib 314M 292M 23M 93% /lib
/lib/svc/manifest 56T 1.5M 56T 1% /lib/svc/manifest
/usr 433M 394M 40M 91% /usr
swap 16G 2.3G 14G 15% /etc/svc/volatile
swap 4.0G 1.8G 2.3G 45% /tmp
swap 16G 2.3G 14G 15% /var/run
[root@test-equinix_mnx-smartos20-x64-4 ~]# rm -rf /tmp/node-coverage-*
[root@test-equinix_mnx-smartos20-x64-4 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
zones/c6e3d47a-1421-ee11-c52d-c3c80c198e95 100G 25G 76G 25% /
/.zonecontrol 56T 57G 56T 1% /.zonecontrol
/lib 314M 292M 23M 93% /lib
/lib/svc/manifest 56T 1.5M 56T 1% /lib/svc/manifest
/usr 433M 394M 40M 91% /usr
swap 16G 467M 16G 3% /etc/svc/volatile
swap 4.0G 44K 4.0G 1% /tmp
swap 16G 467M 16G 3% /var/run
[root@test-equinix_mnx-smartos20-x64-4 ~]#
Good catch! So there's a bug in our tests. They shouldn't be using the system's tmpdir.
Good catch! So there's a bug in our tests. They shouldn't be using the system's tmpdir.
FWIW I ran make -j88 test-ci
on a dev machine I have access to and it created these directories/files in /tmp
:
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-09AcaL
drwx------. 2 rlau rlau 138 Aug 14 13:48 node-coverage-6qYENJ
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-afMxbz
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-e0jO5f
drwx------. 2 rlau rlau 138 Aug 14 13:48 node-coverage-F6ZUIw
drwx------. 2 rlau rlau 50 Aug 14 13:48 node-coverage-hHzLhL
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-Lc7I1n
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-otzELS
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-p2PeI0
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-q6CVl4
drwx------. 2 rlau rlau 50 Aug 14 13:48 node-coverage-RdCe2w
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-SnPBDM
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-tw4y03
drwx------. 2 rlau rlau 94 Aug 14 13:48 node-coverage-x0GveC
drwx------. 2 rlau rlau 50 Aug 14 13:48 node-coverage-zBbuSF
-rw-r--r--. 1 rlau rlau 138954 Aug 14 13:47 perf-426641.map
-rw-r--r--. 1 rlau rlau 112792 Aug 14 13:47 perf-426677.map
I think some of those are from test/parallel/test-runner-coverage.js
which has specific tests checking coverage is produced when the NODE_V8_COVERAGE
environment variable is not set.
I think some of those are from
test/parallel/test-runner-coverage.js
which has specific tests checking coverage is produced when theNODE_V8_COVERAGE
environment variable is not set.
If I exclude test/parallel/test-runner-coverage.js
from parallel there still at least one more test writing out coverage:
$ rm -rf /tmp/node-coverage-* /tmp/perf-*.map
$ tools/test.py -J --skip=parallel/test-runner-coverage parallel
[00:28|% 100|+ 3508|- 0]: Done
All tests passed.
$ ls -al /tmp
...
drwx------. 2 rlau rlau 51 Aug 14 17:20 node-coverage-8Oj2sa
drwx------. 2 rlau rlau 96 Aug 14 17:20 node-coverage-Xwd98L
-rw-r--r--. 1 rlau rlau 138954 Aug 14 17:20 perf-1822721.map
-rw-r--r--. 1 rlau rlau 112792 Aug 14 17:20 perf-1822738.map
...
$
Can you try to change the permissions so only root is allowed to write to /tmp
. Hopefully we can identify the tests by observing failures.
Not on that machine, but should be able to in a container.
$ mkdir /tmp/cantwrite
$ chmod u-w /tmp/cantwrite/
$ ls -al /tmp/cantwrite
total 4
dr-xr-xr-x. 2 rlau rlau 6 Aug 14 19:15 .
drwxrwxrwt. 12 nobody nobody 4096 Aug 14 19:15 ..
$ TMPDIR=/tmp/cantwrite/ tools/test.py
...
[03:29|% 100|+ 4014|- 2]: Done
Failed tests:
out/Release/node /home/rlau/sandbox/github/node/test/parallel/test-runner-output.mjs
out/Release/node /home/rlau/sandbox/github/node/test/parallel/test-runner-coverage.js
$
With an actual unwritable /tmp
:
[03:26|% 100|+ 4012|- 4]: Done
Failed tests:
out/Release/node /home/nodejs/node/test/parallel/test-cli-node-options.js
out/Release/node /home/nodejs/node/test/parallel/test-runner-coverage.js
out/Release/node /home/nodejs/node/test/parallel/test-runner-output.mjs
out/Release/node /home/nodejs/node/test/parallel/test-trace-events-http.js
$
test-cli-node-options.js
test-runner-coverage.js
and test-runner-output.mjs
test-trace-events-http.js
is affected by reading the code.For test-runner-coverage.js
- When NODE_V8_COVERAGE
is set the report will be written into the dest directory, currently for test-runner-coverage.js
is set to tmpdir.path
maybe what we need to do is to clear the report directory after the test is finished to prevent the report fill up the disk space, will take another when I wake up.
For
test-runner-coverage.js
- WhenNODE_V8_COVERAGE
is set the report will be written into the dest directory, currently fortest-runner-coverage.js
is set totmpdir.path
maybe what we need to do is to clear the report directory after the test is finished to prevent the report fill up the disk space, will take another when I wake up.
I think those cases are working as expected -- but that test has variants that deliberately do not set NODE_V8_COVERAGE
and those are causing writes to the system tmp dir which are never cleaned up.
Thanks Richard, you are right about this, when I commented out the test that has NODE_V8_COVERAGE
, it still writes to the temp dir and not being cleaned up. I will take some time to investigate why this happened.
I spent some time to further investigate and found out whether or not NODE_V8_COVERAGE
is set or not, it will always write to the system temp dir and then if NODE_V8_COVERAGE
is set, it will copy the coverage report across to that dir. (see here for the implementation detail)
^ this might explain why when the write access is disabled to the temp dir the test failed immediately.
I have a question - after we've done the copying, shouldn't we remove the temp created dir? @cjihrig
after we've done the copying, shouldn't we remove the temp created dir?
We could. I wouldn't expect to need to in the temp directory though. The test could also clean up those files.
It is hard to clean those files after the tests - because the way we created them is using mkdtempSync
.
const coverageDirectory = mkdtempSync(join(tmpdir(), 'node-coverage-'));
which is a unique (random) temporary directory, e.g. node-coverage-060f7B
. And I don't know how to obtain the unique (random) part.
I attempted to remove the tmp dir by doing this:
} finally {
if (dir) {
dir.closeSync();
+ rmSync(this.coverageDirectory, {__proto__: null, recursive: true });
}
}
}
in the cleanup
function, however even the rmSync
was *successful, the temp dir was still there (looks like somehow it was being recreated again)
existSync(this.coverageDirectory)
and it returned false.Any idea? 👀 @cjihrig
Not sure without trying to debug it myself.
If you could try to cd
into $TMPDIR
and rm -rf node-coverage-*
then you could add the patch into the cleanup
function in coverage.js
(make sure you import rmSync
at the top) and run ./node test/parallel/test-runner-coverage.js
.
In theory we shouldn't see any node-covearge-*
but in reality they are still there.
Sorry for being annoying to add another quick note here, the whole reason I wanted to fix this issue is because each time we run the above test (test-runner-coverage.js
) it will create the rest (which is a few mb) into the system tmp folder:
292K node-coverage-0q1ib3
1.3M node-coverage-2N0qE9
460K node-coverage-LHyiP2
68K node-coverage-OMUlHM
524K node-coverage-Pz009O
792K node-coverage-RZckL8
556K node-coverage-WVJ1fu
460K node-coverage-gnrE0T
556K node-coverage-hFiDtO
1.3M node-coverage-mVUPrO
556K node-coverage-nuKBuD
800K node-coverage-s1yPWE
68K node-coverage-tVAHAL
528K node-coverage-teJBAK
This is fine for local but on CI it would easily fill up the space.
test-digitalocean-fedora39-x64-2
is affected too
[root@test-digitalocean-fedora39-x64-2 ~]# ls /tmp/node-coverage-* | wc -l
18601
I cleaned it up, and -1
too.
/tmp
on test-equinix_mnx-smartos20-x64-3 was full again. I have deleted all of those node-coverage-*
directories.
https://ci.nodejs.org/job/node-test-commit-smartos/buildTimeTrend
It's slowing down core development.