Cleaning up temporary and misc job files on cluster

bio-guoda / guoda-services

Services provided by GUODA, currently a container for tickets and wikis.

MIT License

Closed mjcollin closed 5 years ago

mjcollin commented 5 years ago

The disks collect junk in a couple places:

Fixed with: `echo "find /tmp -type f -atime +14 -delete" > /etc/cron.daily/tmp_clean && chmod 0755 /etc/cron.daily/tmp_clean`

Looks like diffs of the running images for effechecka. Newer versions of Docker have better tools for cleaning these, so this can wait until later.

Old frameworks can be removed, but we need to know whether they're currently running; we can't just clean old files.
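
The /tmp cleanup above can be tried on a scratch directory before pointing it at a real node. This is a minimal sketch, assuming GNU `find` and `touch`; `clean_old_files` and the scratch file names are hypothetical and only illustrate the same `find ... -atime +14 -delete` expression used in the cron job.

```shell
#!/bin/sh
# Sketch only: clean_old_files is a hypothetical helper, not from the thread.
# Deletes regular files under DIR whose last access was more than DAYS days ago.
clean_old_files() {
    dir="$1"
    days="$2"
    find "$dir" -type f -atime +"$days" -delete
}

# Try it on a scratch directory rather than the real /tmp.
scratch=$(mktemp -d)
touch -a -d '30 days ago' "$scratch/old.txt"   # GNU touch: backdate the access time
touch "$scratch/new.txt"                       # accessed just now, should survive

clean_old_files "$scratch" 14
ls "$scratch"
```

Only `new.txt` should remain afterwards. Note that `-atime` depends on access times being recorded, so the result can differ on filesystems mounted with `noatime`.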

mjcollin commented 5 years ago

Nodes are full again w/o HDFS being full (although it's at 87%). Specifically when looking at mesos01:

Cron to clean out /tmp is working fine, only a few MB there
The Docker image diffs are still not cleaned out, but they take up only about 3-5 GB (on 900 GB disks), so there is not much reward
Old frameworks seem to be getting cleaned, only a few MB there

/var/log looks like the culprit this time. Tens of GB in hadoop, journal, and mesos.
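
One way to see which subdirectories account for the space is to rank them with `du`. This is a rough sketch, not the command actually used in this thread; `largest_entries` is a hypothetical helper, and the demo runs against a scratch directory standing in for /var/log.

```shell
#!/bin/sh
# Sketch only: largest_entries is a hypothetical helper, not from the thread.
# Prints the COUNT largest entries directly under DIR as "<size in KB><TAB><path>".
largest_entries() {
    dir="$1"
    count="$2"
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Demo on a scratch directory standing in for /var/log.
logs=$(mktemp -d)
mkdir "$logs/hadoop" "$logs/mesos"
head -c 1048576 /dev/urandom > "$logs/hadoop/big.log"   # about 1 MB of data
echo "small" > "$logs/mesos/small.log"

largest_entries "$logs" 1
```

On a node this would be called as `largest_entries /var/log 5`. If the journal directory is the systemd journal, `journalctl --disk-usage` reports its footprint directly.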

mjcollin commented 5 years ago

Added lines to clean /var/log/hadoop and /var/log/mesos of logs older than 60 days to free up a little space quickly.
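
The exact lines added are not shown here, so the following is only a sketch of what such a cleanup could look like. It assumes GNU `find` and `touch`, and assumes age is judged by modification time; `purge_old_logs` is a hypothetical helper, demonstrated on scratch directories.

```shell
#!/bin/sh
# Sketch only: purge_old_logs is a hypothetical helper; the thread does not
# show the actual lines that were added.
# Assumption: a log's age is judged by its modification time (-mtime).
purge_old_logs() {
    days="$1"
    shift
    # Print each file as it is deleted so the run leaves a record.
    find "$@" -type f -mtime +"$days" -print -delete
}

# Scratch directories standing in for /var/log/hadoop and /var/log/mesos.
root=$(mktemp -d)
mkdir "$root/hadoop" "$root/mesos"
touch -d '90 days ago' "$root/hadoop/old.log" "$root/mesos/old.log"
touch "$root/hadoop/current.log"

purge_old_logs 60 "$root/hadoop" "$root/mesos"
```

On the cluster the equivalent call would be `purge_old_logs 60 /var/log/hadoop /var/log/mesos`. Dropping `-delete` first gives a dry run that only lists the candidates.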

Mid-term plan: get more data off HDFS by archiving, and write out iDigBio monthly instead of weekly so that less gets added.

jhpoelen commented 5 years ago

Nice!

jhpoelen commented 5 years ago

Closing stale issue.