aws / aws-codedeploy-agent

Host Agent for AWS CodeDeploy
https://aws.amazon.com/codedeploy
Apache License 2.0

High Memory Consumption #6

Closed laltomar closed 9 years ago

laltomar commented 9 years ago

Hi,

We are testing out the CodeDeploy agent on CentOS 7, and we are seeing memory consumption spike after each deployment. The application revision we are deploying is about 100MB and is in tar.gz format. The workaround we have been using is to execute the following commands:

/etc/init.d/codedeploy-agent stop; /etc/init.d/codedeploy-agent start;sync && echo 3 > /proc/sys/vm/drop_caches && free -m

Thanks, Larry

suryanarayanan commented 9 years ago

Hi Larry, Thanks for reporting this. We are trying to reproduce the issue. I'll post an update once we know what is going on. Sorry about the delay. Thanks, Surya.

suryanarayanan commented 9 years ago

I tried with a 200 MB bundle and I'm not able to reproduce this on CentOS 7. But I'd like to isolate what causes the spike. I've got a few questions and suggestions:

  1. What is the exact metric in which you observe a spike? Is it memfree or memusable?
  2. Does your memusable keep decreasing as you do more deployments?
  3. I'd like you to remove all the hooks in your appspec, not copy any files as part of the install step, and do a deployment. This should only download your bundle and do nothing else. Do you see the same spike when you do this?
  4. Now do the same as above, but keep the files section in your appspec, and do a deployment.

Steps 3 and 4 would tell us whether the problem is with downloading and extracting your tarball, installing files, or executing hook scripts.
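
One way to capture the two metrics from question 1 after each deployment is to read /proc/meminfo. This is a minimal sketch, not something the agent provides. It assumes that "memusable" means MemFree plus Buffers plus Cached, which is only one possible definition. The sample values are made up so the sketch runs anywhere.

```ruby
# Parse the text of /proc/meminfo into a hash of field name => value in kB.
def parse_meminfo(text)
  text.each_line.each_with_object({}) do |line, fields|
    match = line.match(/\A([^:]+):\s+(\d+)/)
    fields[match[1]] = match[2].to_i if match
  end
end

# Assumption: "usable" memory is free memory plus reclaimable buffers and page cache.
def usable_kb(fields)
  fields.fetch("MemFree", 0) + fields.fetch("Buffers", 0) + fields.fetch("Cached", 0)
end

# Made-up sample values. On a Linux host, pass File.read("/proc/meminfo") instead.
SAMPLE_MEMINFO = <<EOS
MemTotal:        1016000 kB
MemFree:          120000 kB
Buffers:            8000 kB
Cached:           400000 kB
EOS

fields = parse_meminfo(SAMPLE_MEMINFO)
puts "memfree:   #{fields['MemFree']} kB"
puts "memusable: #{usable_kb(fields)} kB"
```

Logging these two numbers after every deployment would show whether only the page cache is growing (memfree drops, memusable stays flat) or whether process memory is actually being consumed (both drop).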

Thanks, Surya.

laltomar commented 9 years ago

Hi Surya,

Please see the following for the information you requested above.

  1. CodeDeploy Memory Usage - 10 Deployments DownLoadBundle Only https://gist.github.com/laltomar-cn/330edfb9d9e956d29500
  2. CodeDeploy Memory Usage - 5 Deployments DownLoadBundle and Copy Files (Uses 40% of Memory) https://gist.github.com/laltomar-cn/a02c659629ac6c0f8fe6
  3. CodeDeploy Memory Usage - 3 Deployments DownLoadBundle, Copy Files, and Hooks (3rd deployment fails due to OOM) https://gist.github.com/laltomar-cn/2981f347cd1fc33b1f8c

I am also seeing high inode usage; it does not look like the tmp directories are getting cleaned up. I did submit close to 20 deploys, which is not practical, but I am wondering how /tmp/codedeploy-agent is cleaned up.

suryanarayanan commented 9 years ago

What files do you see in /tmp/codedeploy-agent? With the default configuration there shouldn't be any /tmp/codedeploy-agent. Here are the files the agent keeps (along with their rotation policy):

  1. File /tmp/codedeploy-agent.update.log (Rotates at a max file size of 2MB) This file is used by the agent installer and does not have anything to do with deployments.
  2. Directory /var/log/aws/codedeploy-agent contains the agent run logs. It keeps the past 7 days worth of logs. Rotates everyday at midnight.
  3. Directory /opt/codedeploy-agent/deployment-root contains a single folder for each deployment group you've deployed to on this host, and each deployment group folder keeps the details of at most the last 5 deployments (including the last deployment with a successful install event). For example, the 6th deployment will wipe the 1st deployment folder unless the 1st deployment is the last deployment with a successful install. In that case, it will wipe the 2nd deployment instead.
  4. /opt/codedeploy-agent/state/.pid should contain a couple of pid files.
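
The retention rule in item 3 can be sketched as follows. This is an illustration of the rule as described, not the agent's actual code. The function name, its arguments, and the deployment ids are hypothetical.

```ruby
# Hypothetical helper, not taken from the agent.
# deployment_ids:  ids ordered oldest first.
# last_successful: id of the most recent deployment with a successful
#                  install, or nil if there is none.
# Returns the ids whose folders would be wiped.
def deployments_to_purge(deployment_ids, last_successful, max_kept = 5)
  kept = deployment_ids.dup
  purged = []
  while kept.size > max_kept
    # Oldest deployment that is not the protected one.
    victim = kept.find { |id| id != last_successful }
    break if victim.nil?
    kept.delete(victim)
    purged << victim
  end
  purged
end

ids = %w[d1 d2 d3 d4 d5 d6]
p deployments_to_purge(ids, "d6") # d1 is the oldest and not protected
p deployments_to_purge(ids, "d1") # d1 is protected, so d2 goes instead
```

The first call prints ["d1"] and the second prints ["d2"], matching the example in item 3.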

These are the only log/metadata files the codedeploy agent should be keeping. Are you noticing any other files accumulating other than these? Also does your bundle contain a lot of small files? What's the expanded size of your bundle?

laltomar commented 9 years ago

Looks like all tmp deployment files are stored in:

/tmp/codedeploy-agent/deployment-instructions
/tmp/codedeploy-agent/e48521d4-3a85-456a-b428-f7776187479f

/opt/codedeploy-agent/ does not exist.

I installed the CodeDeploy agent using the instructions at:

https://rubygems.org/gems/aws-codedeploy-agent

What are the install instructions for rpm?

suryanarayanan commented 9 years ago

Okay. The CodeDeploy agent gem hosted in RubyGems is not yet reviewed and not officially approved by AWS. It is a community contribution. Here's the pull request https://github.com/aws/aws-codedeploy-agent/pull/4

CentOS is not officially supported yet. But if you are just playing around with CodeDeploy, you can follow what I did to install the agent on CentOS 7. I don't recommend this for production, though, since the updater cannot update the agent.

  1. Pick up the agent rpm from https://s3.amazonaws.com/aws-codedeploy-us-east-1/releases/codedeploy-agent-1.0-1.663.noarch.rpm
  2. On the CentOS host, run 'sudo yum install ruby' (make sure it installs Ruby 2.0)
  3. Do 'ln -s /usr/bin/ruby /usr/bin/ruby2.0'
  4. Run 'rpm -i --nodeps codedeploy-agent-1.0-1.663.noarch.rpm'

That should get the agent running. But the updater will fail to update the agent, so you'd miss out on bug fixes (including emergency security patches), new features, etc. So I wouldn't recommend it for production use until an officially supported version is available.

Thanks, Surya.

laltomar commented 9 years ago

Ok, I will install this using the RPM. I will redo the use cases you requested (i.e. DownLoadBundle only first, then DownLoadBundle and Copy Files). I will upload the results sometime this weekend.

We are very interested in using codedeploy in a CentOS production environment. What is the anticipated date that this OS will be supported?

laltomar commented 9 years ago

Hi Surya,

Please see the following for the information you requested above.

  1. Use Case 1 DownLoadBundle Only, 3 Deployments. https://gist.github.com/laltomar-cn/330edfb9d9e956d29500
  2. Use Case 2: DownloadBundle/Copy Files: https://gist.github.com/laltomar-cn/a02c659629ac6c0f8fe6
  3. Use Case 3: DownloadBundle/Copy Files/Hooks, 3 deployments (Deployment Fails OOM). https://gist.github.com/laltomar-cn/2981f347cd1fc33b1f8c

laltomar commented 9 years ago

Hi,

Any update on this?

Thanks, Larry

suryanarayanan commented 9 years ago

Hi Larry, I did dive into this and profiled the CodeDeploy agent for memory leaks. I tried deployments with a reasonably large gzipped archive (529 MB uncompressed, 90 MB compressed, with 20k+ files). I ran into the same issue you reported. I made a few memory optimizations to the agent before collecting the following metrics.

Garbage collector statistics at the end of each deployment https://gist.github.com/suryanarayanan/cd1035afaa01ac5bc78a

Process stats at the end of each deployment (except for the first entry) https://gist.github.com/suryanarayanan/a2adfd2f84485752db80

The process RSS stabilized after multiple runs and I do not see any sharp rise. As you can see, the number of active objects in the heap does not increase considerably, so there is no sign of an object leak. One reason for the growing process RSS is the huge number of temporary strings and hashes the agent creates during the Install step when the archive contains a large number of files. These strings hold the file names and directory names present in your archive. They are eventually garbage collected, but the freed pages remain part of the agent's heap. Also, partial GC (garbage collection) does not sweep all of these pages at once. I tested this by explicitly doing a full GC at the end of every install step and ended up with a much lower process RSS.
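
The effect of an explicit full GC on temporary strings can be illustrated with a small standalone script. This is a sketch under assumptions, not the agent's code: simulate_install is a hypothetical stand-in for the Install step, and the script counts String objects in the Ruby heap rather than measuring process RSS.

```ruby
# Number of String objects currently in the interpreter's heap
# (this can include dead strings that have not been swept yet).
def live_strings
  ObjectSpace.count_objects[:T_STRING]
end

# Hypothetical stand-in for the Install step: build one short-lived
# hash of path strings per file, then drop it.
def simulate_install(file_count)
  file_count.times do |i|
    entry = { "dir" => "app/dir#{i % 100}", "file" => "file#{i}.txt" }
    File.join(entry["dir"], entry["file"])
  end
  nil
end

GC.start
before = live_strings
simulate_install(200_000)
after_install = live_strings
GC.start # explicit full collection at the end of the simulated install
after_gc = live_strings

puts "strings before install:  #{before}"
puts "strings after install:   #{after_install}"
puts "strings after full GC:   #{after_gc}"
puts "GC runs so far:          #{GC.count}"
```

The exact counts depend on the Ruby version and on when the interpreter decides to collect on its own, so no particular output is shown here. Whether freed heap pages are returned to the operating system is a separate question that this script does not measure.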

I'll try to get the optimizations out in our next agent release. Meanwhile, you can keep purging the caches the way you already do. You can also play around with Ruby 2.0's GC parameters to make garbage collection more aggressive.

Thanks, Surya.