cfops backup leaves temp files on Ops Manager

mayrstefan commented 8 years ago

We have cfops backup running in a nightly cron job. Yesterday the backup failed with the following message: 2016/06/15 02:01:08 ... opsmanager.go:147] error in save http request%!(EXTRA *errors.errorString={"error":"No space left on device - sendfile"}) 2016/06/15 02:01:08 ... createCliCommand.go:46] there was an error: {"error":"No space left on device - sendfile"} running backup on ops-manager tile:&{{/srv/backup/pcf false 0xc820152bd0} ... /srv/backup/pcf/opsmanager/deployments opsmanager ubuntu ...

Then we checked the Ops Manager VM: there was a folder of ~5GB for every backup run left behin in /tmp/ops_manager/ and the root partition was 100% full. So we deleted everything in /tmp/ops_manager and the backup continued to run.

Example of the last two days: # du -hs /tmp/ops_manager/* 5.1G /tmp/ops_manager/d20160615-1096-d1t5cv 5.1G /tmp/ops_manager/d20160615-1096-u87ttr 5.1G /tmp/ops_manager/d20160616-1096-10ubl1u 5.1G /tmp/ops_manager/d20160616-1096-upm66p

mayrstefan commented 8 years ago

I missed the latest release: so I upgraded to v2.2.26 and ran the backup again. After the backup is finished there are around 10GB left in /tmp/ops_manager

root@pivotal-ops-manager:/tmp/ops_manager# du -hs * 5.1G d20160617-1096-1q9lbqt 5.1G d20160617-1096-f7c9mj

xchapter7x commented 8 years ago

When cfops backs up the ops manager it simply calls the ops manager export api endpoint. Calling this endpoint generates a temp file before it pushes back in the HTTP response.

What you are describing is these ops manager tmp files filling your disk. That is not a function of cfops, it is simply how ops manager functions.

What other clients have done to work around this is either restart the ops manager vm or automate the clearing of the tmp dir on the box.

What version of ops manager are you using? I believe there was a fix in the latest version where ops manager will clear its tmp on its own.

Hope this helps. On Jun 17, 2016 3:41 AM, "mayrstefan" notifications@github.com wrote:

I missed the latest release: so I upgraded to v2.2.26 and ran the backup again. After the backup is finised there are around 10GB left in /tmp/ops_manager

root@pivotal-ops-manager:/tmp/ops_manager# du -hs * 5.1G d20160617-1096-1q9lbqt 5.1G d20160617-1096-f7c9mj

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pivotalservices/cfops/issues/94#issuecomment-226702173, or mute the thread https://github.com/notifications/unsubscribe/AAqI03hkhyFxygeah_4bakMtMW1HGZI0ks5qMk-OgaJpZM4I3SZI .

mayrstefan commented 8 years ago

I opened a support ticket at Pivotal when I found export installation settings leaves almost the same files behind. We use Ops Manager 1.7.4.0 and there might be an issue with the cleanup on Ops Manager side. We have differents Ops Manager instances and around the last days I could see the following:

a recenty rebooted Ops Manager has files older then 24h but none over 48h: cleanup seems to work
another instance where we cleared the /tmp/ops_manager directory manually has files even older then 48h: did the cleanup job even run? Maybe the cleanup jobs dies after some time. I'll report when we know more

mayrstefan commented 8 years ago

I learned Ops Manager has a service tempest-scheduler. The scheduler runs delayed_job which cleans files older 24h in /tmp/ops_manager. This scheduler died somehow. That's why our tmp directory filled up. Old files got deleted after I restarted tempest-scheduler-

We use cfops 2.2.26 and the release information

properly cleans up tmp files on elastic runtime backup did not seem be related to that problem. Maybe this is related to the machine running cfops.

So, sorry for the noise

vmware-archive / cfops

cfops backup leaves temp files on Ops Manager #94