bird-house / birdhouse-deploy

Scripts and configurations to deploy the various birds and servers required for a full-fledged production platform
https://birdhouse-deploy.readthedocs.io/en/latest/
Apache License 2.0

:bug: [BUG]: tmp quickly getting filled up #221

Open fmigneault opened 2 years ago

fmigneault commented 2 years ago

Summary

Server instances rapidly fill up with useless files.

Details

I noticed that any instance that has been up for a few days eventually gets filled up by /tmp/_MEIxxxxxx/ directories containing pip-installed packages (.so files for zlib, cryptography, certifi, etc.).

I suspect the culprits are the following (though I'm not sure):
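To put numbers on this, here is a minimal sketch that lists the `_MEI*` directories under a temp root along with their total size and modification time. The function names (`dir_size_bytes`, `list_mei_dirs`) are made up for illustration, and it assumes the directories sit directly under `/tmp` as described above.

```python
import os
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all files under path (symlinked dirs are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # A file may vanish while we walk; skip it.
                pass
    return total


def list_mei_dirs(tmp_root: str = "/tmp") -> list[tuple[str, int, float]]:
    """Return (name, size_bytes, mtime) for each _MEI* directory, largest first."""
    results = []
    for entry in Path(tmp_root).glob("_MEI*"):
        if entry.is_dir():
            results.append((entry.name, dir_size_bytes(entry), entry.stat().st_mtime))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    for name, size, _mtime in list_mei_dirs():
        print(f"{size / 1e6:10.1f} MB  {name}")
```

Running it before and after a deploy cycle should show whether new directories appear on each run.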

https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/docker-compose.yml#L183
https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/docker-compose.yml#L202
https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/docker-compose.yml#L216

I'm wondering if anyone else noticed this or has been impacted by it? Every once in a while, I cannot run commands anymore because /tmp is full, and even basic shell commands cannot dump their own temp files.

What is the purpose of mounting those /tmp in these services?

Since all sub-dirs look like duplicates, it seems some operation re-runs an install step that creates a new set of temporary packages each time. After a manual cleanup, usage on one server's 20GB volume dropped from 99.9% to 20%, so these directories grow considerably large relatively fast.
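The manual cleanup could be scripted roughly as follows. This is only a sketch: the helper names and the 24-hour age threshold are assumptions, not anything from birdhouse-deploy, and it defaults to a dry run because a recent `_MEI*` directory may still be in use by a running process.

```python
import shutil
import time
from pathlib import Path


def find_stale_mei_dirs(tmp_root="/tmp", min_age_hours=24.0, now=None):
    """Return _MEI* directories whose mtime is older than min_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - min_age_hours * 3600
    return sorted(
        p
        for p in Path(tmp_root).glob("_MEI*")
        if p.is_dir() and not p.is_symlink() and p.stat().st_mtime < cutoff
    )


def remove_stale_mei_dirs(tmp_root="/tmp", min_age_hours=24.0, dry_run=True):
    """Delete stale _MEI* directories; with dry_run=True only report them."""
    stale = find_stale_mei_dirs(tmp_root, min_age_hours)
    for path in stale:
        if dry_run:
            print(f"would remove {path}")
        else:
            # ignore_errors avoids aborting on files that disappear mid-delete.
            shutil.rmtree(path, ignore_errors=True)
    return stale
```

Calling `remove_stale_mei_dirs()` first with the default `dry_run=True` shows what would be deleted before anything is actually removed.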

Environment

| Information | Value |
| --- | --- |
| Server/Platform URL | n/a |
| Version Tag/Commit | any |
| Related issues/PR | n/a |
| Related components | raven, finch, flyingpigeon (potentially) |
| Custom configuration | n/a |

@matprov You can check instance daccs-instance-10141-fp-weaver-test for example.

Concerned Organizations

@matprov @tlvu

tlvu commented 2 years ago

@fmigneault do you happen to have something else running alongside PAVICS on your server? Could that something else be responsible for filling up your /tmp?

All my Vagrant PAVICS boxes have only a 100G disk (https://github.com/bird-house/birdhouse-deploy/blob/f57689bd5b5773f3a31a7a6ebf4426dae2bd78ba/Vagrantfile#L11), so something like this would have broken all my VMs long ago.

Furthermore, all the /tmp/ volume mounts you referred to (https://github.com/bird-house/birdhouse-deploy/blob/f57689bd5b5773f3a31a7a6ebf4426dae2bd78ba/birdhouse/docker-compose.yml#L183 https://github.com/bird-house/birdhouse-deploy/blob/f57689bd5b5773f3a31a7a6ebf4426dae2bd78ba/birdhouse/docker-compose.yml#L202 https://github.com/bird-house/birdhouse-deploy/blob/f57689bd5b5773f3a31a7a6ebf4426dae2bd78ba/birdhouse/docker-compose.yml#L216) are anonymous data-volume mounts. They do not directly mount the /tmp of the host.
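For anyone unsure about the distinction, here is a small sketch that classifies a Compose short-syntax volume entry. The function `classify_volume` is hypothetical, it ignores Windows-style paths, and it has not been checked against the exact entries at the linked lines. In the short syntax, an entry with only a container path creates an anonymous volume, while an entry whose source starts with `/`, `.` or `~` is a bind mount of a host path.

```python
def classify_volume(spec: str) -> str:
    """Classify a Compose short-syntax volume entry.

    Returns "anonymous", "bind" or "named". Assumes POSIX-style paths.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path, e.g. "/tmp": Docker creates an anonymous volume.
        return "anonymous"
    source = parts[0]
    if source.startswith(("/", ".", "~")):
        # A host path as source, e.g. "/tmp:/tmp": bind mount of the host directory.
        return "bind"
    # Anything else is a volume name, e.g. "data:/var/lib/data".
    return "named"


if __name__ == "__main__":
    for entry in ["/tmp", "/tmp:/tmp", "data:/var/lib/data:ro"]:
        print(f"{entry!r}: {classify_volume(entry)}")
```

Only the `bind` case would write into the host's /tmp; an anonymous volume lives under Docker's own storage area.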

fmigneault commented 2 years ago

Seems to be docker-compose itself that generates them: https://unix.stackexchange.com/questions/548576/what-are-these-tmp-meixxx-directories

So caused by notebooks being rebuilt during deploy?

tlvu commented 2 years ago

> Seems to be docker-compose itself that generates them: https://unix.stackexchange.com/questions/548576/what-are-these-tmp-meixxx-directories

Is your docker-compose from a pip install? Our docker-compose are the prebuilt binary type https://github.com/bird-house/birdhouse-deploy/blob/f57689bd5b5773f3a31a7a6ebf4426dae2bd78ba/birdhouse/vagrant-utils/install-docker.sh#L98

Just a wild guess: maybe the docker-compose from a pip install wants to update itself?

> So caused by notebooks being rebuilt during deploy?

What notebook rebuild? We do not "build" any notebooks ... we just redownload the latest version from the git repo and copy them.

I would start by checking all the cron jobs on that server to see if their schedules match the timestamps of those /tmp/_MEIxxx directories, like in https://unix.stackexchange.com/questions/548576/what-are-these-tmp-meixxx-directories
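One way to do that comparison is to bucket the directories by the time of day they were last modified and look for peaks that line up with cron entries. A rough sketch follows; `creation_time_histogram` is a made-up name, and it uses the directory mtime as a stand-in for creation time, which is an assumption.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path


def creation_time_histogram(tmp_root="/tmp"):
    """Count _MEI* directories per (hour, minute) of their mtime, in local time."""
    counts = Counter()
    for entry in Path(tmp_root).glob("_MEI*"):
        if entry.is_dir():
            stamp = datetime.fromtimestamp(entry.stat().st_mtime)
            counts[(stamp.hour, stamp.minute)] += 1
    return counts


if __name__ == "__main__":
    # Show the ten most common times of day, to compare against crontab entries.
    for (hour, minute), n in creation_time_histogram().most_common(10):
        print(f"{hour:02d}:{minute:02d}  {n} directories")
```

If most directories cluster around the same minute of each hour or day, the cron job scheduled at that time is the first one to inspect.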

fmigneault commented 2 years ago

@matprov Maybe you have something in mind? If this is not seen on @tlvu's side, it must be something related to the VM creation, because the rest of the birdhouse-deploy operation should be the same.

fmigneault commented 2 years ago

> Is your docker-compose from a pip install? Our docker-compose are the prebuilt binary type

Same on daccs-iac instances.

tlvu commented 2 years ago

> @matprov Maybe you have something in mind? If this is not seen on @tlvu's side, it must be something related to the VM creation, because the rest of the birdhouse-deploy operation should be the same.

FYI, I checked on our production server Boreas, which has the largest number of cron jobs, and there are no /tmp/_MEIxxx directories at all.

matprov commented 2 years ago

I'm adding @ahandan, who will take a closer look at this.