Closed SethTisue closed 7 years ago
Note that the directories "~/.dbuild/cache*" collect all of the artifacts generated by dbuild, and are never garbage collected (it has been a "todo" in dbuild for a very long time). I don't know about the setup of the community build in particular, but you may want to check the size of those dirs every now and then, and zap them on occasion. That is especially true now that there are many projects and many different configurations involved.
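A minimal sketch of the periodic size check suggested above. The function name `dbuild_cache_usage` is hypothetical and not part of dbuild. It only reports sizes and deletes nothing, so deciding what to zap stays a manual step.

```shell
#!/bin/sh
# Hypothetical helper (not part of dbuild): print the size in KiB of each
# cache* directory under a dbuild home, largest first.
# $1 = dbuild home directory; assumed to default to ~/.dbuild.
dbuild_cache_usage() {
  base="${1:-$HOME/.dbuild}"
  for dir in "$base"/cache*; do
    # When nothing matches, the glob stays unexpanded; skip it.
    [ -d "$dir" ] || continue
    du -sk "$dir"
  done | sort -nr
}
```

Calling `dbuild_cache_usage` with no argument inspects `~/.dbuild`; passing a path inspects that directory instead.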
The main culprit appears to be `scala-*-integrate-community-build/target-0.9.9/project-builds`.
--- /home/jenkins/workspace/scala-2.12.x-integrate-community-build/target-0.9.9/project-builds ---------------------------------------------------------------
/..
2.9 GiB [##########] /fastparse-7e54355938440ef5e40886af071772172a7b7526
1.4 GiB [#### ] /akka-more-1b1a0f6ddda7da64b652991d4d18cbf7f2b50329
1.2 GiB [#### ] /play-core-1510118be95aeaebc40d3639b662bc5b674bf0d2
1.0 GiB [### ] /scalaz-76bd5764c6f8dd93666bd00579afd389b477dcdb
907.9 MiB [### ] /scalameta-3a5b419089b29bd8ba097282114548ce24784aaa
767.4 MiB [## ] /breeze-b0b808c2bca03de856dc31b3f75e1512cfa7a704
742.7 MiB [## ] /specs2-b035e90f082aa56ca8ba4380c8875e9ad3fc89df
734.0 MiB [## ] /akka-http-962f54885423d7e99115596dcea79518b1ba2fa8
705.4 MiB [## ] /akka-actor-1c695d12af79f1d3ac751aaa911b6f00051b5acb
663.3 MiB [## ] /unfiltered-dac954c1724d5856447807f11b63b3bbd621a089
634.0 MiB [## ] /scalikejdbc-0aaf0abd357f6f4cdfba4540f9c1c7cf8810125b
633.5 MiB [## ] /scalafix-cb58a33be9bae52783ee291c16dbadb8b967e6fa
631.0 MiB [## ] /scala-js-020d304f495e2f9a2ba3734f4f384ef8d469237d
623.1 MiB [## ] /monix-d99c847c5ade24c91e45c41239478e9b93c84e69
607.3 MiB [## ] /cats-84a80371921714c958a0d99bf2c963156f8702de
601.8 MiB [## ] /spire-87d759aa7fd265fb69c2c05dc38633229273cf91
595.1 MiB [## ] /scalatest-a48b2221995e91deb0ce628b653f636caec71266
549.7 MiB [# ] /sbt-librarymanagement-fb47e094ec8efb708200d55a5156846c04df8d97
536.0 MiB [# ] /play-webgoat-a11f1896e96c249eafe2d0e706fb105443af9c58
505.8 MiB [# ] /conductr-lib-bd61d089542d9844695c80737cd873743bedd2cb
480.6 MiB [# ] /twitter-util-f191b661d362603b251f2a55663d36815ee0be2f
479.7 MiB [# ] /play-ws-a4560867b8e0627d0cc6b09510c953876ef100fb
I've proposed a change to fastparse to reduce the space used by its tests. It checks out a bunch of open source git repos as corpora to test its parsers, but neglected to do a shallow clone in one place. We could disable its tests in the meantime.
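For reference, a shallow clone fetches only the most recent commit rather than the full history, which is what saves the space. This is a generic sketch, not the actual fastparse change; the function name `shallow_clone` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: clone a repository with history truncated to the
# latest commit only.
# $1 = repository URL, $2 = destination directory.
shallow_clone() {
  git clone --quiet --depth 1 "$1" "$2"
}
```

Note that git ignores `--depth` for plain local paths, so a local repository has to be given as a `file://` URL for the clone to be shallow.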
Could/should we just run `clean` as an extra command for each build so that we only need one populated target directory at a time?
What is stored in `project-builds/**/.dbuild/{local-repo,topIvy}`? These also seem to be space hogs. Could they be deleted after each project build without costing too much on a subsequent run of the community build?
Here's a snapshot of the disk usage generated by, and viewable with, `ncdu`.
TIL `ncdu`, slick! I'll switch to that from the `du -ka . | sort -nr` shell alias I've been using for 25 years
in 1f2859bf70e71ecfb453dc3035ed4dc0e39dc10f I temporarily switched the community build to use @retronym's branch of fastparse (green run: https://scala-ci.typesafe.com/job/scala-2.12.x-integrate-community-build/2075/consoleFull). hopefully that PR will be merged and we can unfork again
leaving the ticket open for now.
> Could/should we just run `clean` as an extra command for each build so that we only need one populated target directory at a time?
that is definitely worth considering.
traditionally we have deleted that stuff at the start of each run, rather than the end, in case we need the files in order to do postmortems on failures
in practice, I'd say I've used that capability only a handful of times over the past two years. if it were ever really needed we could do a new run on a branch where the cleanup command is removed/commented. anyway, most problems are reproducible by running the build locally, which is a more convenient location for forensics & autopsies.
Is there any performance argument for leaving results of the previous community build in place? I seem to recall that project builds are somewhat incremental, but if we are changing the compiler each time there seems little prospect for avoiding rebuilds.
@cunei @SethTisue taking the idea a bit further, how about a mode in dbuild itself to clean up each project's directory (remove any `.dbuild` and `target/**` that aren't directly required by downstream builds) at the conclusion of each project's build? The goal would be to reduce the amount of disk needed to run the community build from the current ~40GB to something more like 10GB.
I had left the `target` directory in place in order to facilitate postmortems, but in practice, I've rarely or never used that capability. if I want to do a postmortem, I usually try to reproduce the problem locally where it's more convenient to work with, then go from there. I've rarely or never needed to actually do the postmortem on the behemoth itself.
so I'd be fine with blowing away the target directory in the workspace at the end of the run rather than the beginning. (we'll want to be sure it gets blown away regardless of whether the run succeeded or failed, I think.)
removing `~/.dbuild` is probably a no-go since it's shared by multiple jobs.
note that there is existing code to delete `target`, it just currently happens at the beginning of a run, not at the end
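A rough sketch of the end-of-run cleanup discussed above: run the build, remember its exit status, delete the target directory whether the build passed or failed, then hand the original status back. The name `run_then_clean` is hypothetical, and this is not the existing cleanup code in the community build. It also does not handle the job being killed by a signal.

```shell
#!/bin/sh
# Hypothetical wrapper (not the community build's actual cleanup code).
# $1 = directory to delete afterwards; remaining args = command to run.
run_then_clean() {
  target_dir="$1"
  shift
  # Refuse an empty path so rm -rf can never run on nothing in particular.
  [ -n "$target_dir" ] || return 2
  status=0
  # "|| status=$?" records a failure without aborting under `set -e`.
  "$@" || status=$?
  rm -rf "$target_dir"
  return "$status"
}
```

Because the command's status is returned unchanged, a failed build still marks the job as failed even though its output directory is gone.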
> removing `~/.dbuild` is probably a no-go since it's shared by multiple jobs.
Just to clarify, there are a few different folders named `.dbuild`. I was hoping to purge `projectA/**/.dbuild/**` eagerly, after dbuild builds that project. But I don't have a model for what parts (if any) of those directories are "outputs" and required for downstream projects.
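To get a picture of how much space the per-project `.dbuild` folders take before deciding what is safe to purge, something like the following could help. It is a read-only sketch: the function name `list_dbuild_dirs` is hypothetical, and it says nothing about which of these directories downstream projects still need.

```shell
#!/bin/sh
# Hypothetical helper: list every directory named .dbuild below a root,
# with its size in KiB, largest first. Nothing is deleted.
# $1 = root directory to search; defaults to the current directory.
list_dbuild_dirs() {
  root="${1:-.}"
  # -prune keeps find from descending into a matched .dbuild directory,
  # so a .dbuild nested inside another one is not listed separately.
  find "$root" -type d -name .dbuild -prune -exec du -sk {} + | sort -nr
}
```

Run from inside `project-builds`, `list_dbuild_dirs .` prints one line per project-level `.dbuild` folder.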
this hasn't been a problem lately, optimistically closing
I think something must have changed in the community build recently that is making the community builds chew up way more disk than they used to — as in, you can't even run the 2.12 and 2.13 builds once each without ending up out of disk.
needs investigation.