Closed by smerle33 1 year ago
Steps foreseen: to grow beyond 512 GiB we will have to switch to a 1024 GiB disk and then change disk tiers.
We discovered that $JENKINS_HOME/config-history weighs 27 GB!
It's the directory where https://plugins.jenkins.io/jobConfigHistory/ stores the "config changes".
The current setup is the default one: we could apply some of the best practices from https://docs.cloudbees.com/docs/cloudbees-ci-kb/latest/best-practices/jobconfighistory-best-practices to decrease disk usage (move it to another drive?) and improve I/O performance for ci.j (at least by removing nodes and tools from the history).
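Before deciding what to exclude from the history, it can help to see which subdirectories of config-history take the most space. The following is a minimal sketch, assuming GNU du and sort are available; the function name dir_breakdown is illustrative and not part of any Jenkins tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the size of a directory and of each of its
# direct children, largest first. Assumes GNU coreutils (du, sort -h).
dir_breakdown() {
  local dir="$1"
  # -x stays on one filesystem; --max-depth=1 limits the report to the
  # directory itself and its direct children
  du -x -h --max-depth=1 "$dir" 2>/dev/null | sort -rh
}

# Example (path taken from the discussion above):
# dir_breakdown "$JENKINS_HOME/config-history"
```

The first line of the output is the total for the directory itself; the following lines show the heaviest children, which are the natural candidates for exclusion.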
Ping @MarkEWaite, could you share what you did in https://github.com/jenkins-infra/helpdesk/issues/2736 the last time it happened?
I think that I deleted the history. I think that we should remove that plugin from the ci.jenkins.io instance and accept that job configuration history is not worth the disc space penalty.
(ci-data) of type "full" just in case.
nodes and buildtriggerbadge, and the associated directories were deleted in $JENKINS_HOME/config-history/
$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 492G 436G 31G 94% /var/lib/jenkins
bigquery-uploader, cn.jenkins.io and community-functions
Scan Organization Triggers to 1 week
Orphaned Item Strategy
Automatic branch project triggering -> Suppression strategy to "For matching branches suppress builds triggered by indexing (continue to honor webhooks)" (same as https://github.com/jenkins-infra/helpdesk/issues/3474)
=> cleaning up led to a size of 5.6 GB
Top-level directory "Tools" weighs ~51 GB, with almost 45 GB for bom
Updated its configuration with the following changes:
Scan Organization Triggers to 1 week
Orphaned Item Strategy
Automatic branch project triggering -> Suppression strategy to "For matching branches suppress builds triggered by indexing (continue to honor webhooks)" (same as https://github.com/jenkins-infra/helpdesk/issues/3474)
The jenkinsci/bom build had hundreds of builds on its master branch:
properties([disableConcurrentBuilds(abortPrevious: true), buildDiscarder(logRotator(numToKeepStr: '10'))])
echo 'https://github.com/jenkins-infra/helpdesk/issues/3492 investigation, retain 10 most recent builds'
$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 492G 391G 76G 84% /var/lib/jenkins
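To find other jobs that keep too many builds, the number of build directories per branch can be counted directly on disk. This is a rough sketch, assuming the multibranch layout <job>/branches/<branch>/builds/<number>/ (the branches directory appears elsewhere in this issue, the rest is an assumption); builds_per_branch is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count build directories for each branch of a
# multibranch job, highest count first.
# Assumes the layout <job_dir>/branches/<branch>/builds/<number>/.
builds_per_branch() {
  local job_dir="$1" branch count
  for branch in "$job_dir"/branches/*/; do
    # Skip branches without a builds directory (and an unmatched glob)
    [ -d "${branch}builds" ] || continue
    # Build directories are assumed to be named after the build number
    count=$(find "${branch}builds" -mindepth 1 -maxdepth 1 -type d -name '[0-9]*' | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$count" "$(basename "$branch")"
  done | sort -rn
}

# Example (the job path is an assumption, adjust to the real one):
# builds_per_branch /var/lib/jenkins/jobs/Tools/jobs/bom
```

A branch with a count far above the configured numToKeepStr is a hint that the build discarder has not been applied yet.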
As seen with @smerle33, the next "culprit" will be the top-level item "Websites", which weighs more than 30 GB!
$ df -h /var/lib/jenkins/
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 492G 388G 79G 84% /var/lib/jenkins
root@ci:~# du -sh /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
36G /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
=> the disk space is taken by the ZIP files archived for each build. Let's remove them:
$ df -h /var/lib/jenkins/
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 492G 388G 79G 84% /var/lib/jenkins
$ du -sh /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
36G /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
# Remove the ZIP archived files
$ cd /var/lib/jenkins/jobs/Websites/jobs/jenkins.io/branches && find . -type f -name "jenkins*.zip" -exec rm -f {} \;
# Result:
$ du -sh /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
161M /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
$ df -h /var/lib/jenkins/
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 492G 352G 115G 76% /var/lib/jenkins
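Before running a deletion like the one above, the amount of space it would free can be estimated without touching any file. A small sketch, assuming GNU find (for -printf); matching_bytes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the total size in bytes of the files whose
# name matches a pattern, without deleting anything.
# Assumes GNU find (-printf is not available in BSD find).
matching_bytes() {
  local dir="$1" pattern="$2"
  find "$dir" -type f -name "$pattern" -printf '%s\n' |
    awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Example, using the path and pattern from the commands above:
# matching_bytes /var/lib/jenkins/jobs/Websites/jobs/jenkins.io/branches 'jenkins*.zip'
```

The same directory and pattern can then be handed to the find ... -exec rm command once the estimate looks plausible.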
This issue can be closed as we went under the 80% usage bar (a requirement for good I/O performance).
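The 80% bar can be checked from a script instead of reading df output by eye. This is only a sketch: the function names are hypothetical, and treating "at or above 80%" as the warning condition is an assumption about how the bar is meant.

```shell
#!/usr/bin/env bash
# Print the usage percentage (integer, without the % sign) of the
# filesystem that holds the given path.
usage_pct() {
  # -P forces the POSIX one-line-per-filesystem format;
  # column 5 is the capacity, e.g. "76%"
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when the percentage is at or above the limit.
# The limit defaults to 80, the bar mentioned in this issue.
over_threshold() {
  local pct="$1" limit="${2:-80}"
  [ "$pct" -ge "$limit" ]
}

# Example:
# pct=$(usage_pct /var/lib/jenkins)
# if over_threshold "$pct"; then echo "WARNING: ${pct}% used"; fi
```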
A few improvements (to be handled as separate issues), as discussed with the team:
Closing the issue as the operation is finished!
Post-cleanup: @smerle33 ran an ncdu analysis and found a 61 GB tgz file in $JENKINS_HOME/.bkp, dated from 1 year ago (25 August 2022). We removed it as it was not needed (and we have a snapshot).
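Forgotten archives like this one can also be spotted non-interactively by listing the largest files under a directory. A minimal sketch, assuming GNU find; largest_files is a hypothetical helper and the default size and count are arbitrary choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the largest regular files under a directory,
# biggest first, as "<size in bytes><TAB><path>".
# Assumes GNU find (-printf). Arguments: directory, minimum size in
# find's -size syntax (default +1G), maximum number of lines (default 10).
largest_files() {
  local dir="$1" min_size="${2:-+1G}" limit="${3:-10}"
  # -xdev keeps the search on the filesystem of the starting directory
  find "$dir" -xdev -type f -size "$min_size" -printf '%s\t%p\n' 2>/dev/null |
    sort -rn | head -n "$limit"
}

# Example:
# largest_files "$JENKINS_HOME" +1G 20
```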
Final status:
$ df -h /var/lib/jenkins/
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 492G 294G 173G 63% /var/lib/jenkins
This needs to be fixed.
Originally posted by @smerle33 in https://github.com/jenkins-infra/helpdesk/issues/3491#issuecomment-1498745076