aws / aws-codedeploy-agent

Host Agent for AWS CodeDeploy
https://aws.amazon.com/codedeploy
Apache License 2.0
329 stars 187 forks

Rebooting the instance during deployment causes an error #113

Open tomyam1 opened 7 years ago

tomyam1 commented 7 years ago

Trying to reboot the instance during deployment causes an error:

Script - scripts/reboot.sh
[stderr]+ reboot
[stderr]
[stderr]Session terminated, terminating shell... ...terminated.

It'd be nice to have a way to tell CodeDeploy that the instance is going to reboot and that it has to wait for it to come back up. Or is there already a method for doing that?

tangerini commented 7 years ago

Our agent doesn't support rebooting an instance and resuming an in-progress deployment after the reboot. What's your use case for doing this?

tomyam1 commented 7 years ago

The heavy lifting of our deploy process is done using an ansible playbook. Part of that playbook is to update all the system packages. As a result, sometimes after the playbook has finished, there is a need for a reboot.

rohkat-aws commented 6 years ago

@tomyam1 adding it as a feature request for now.

chharish commented 5 years ago

Our IIS website mounts FSx as a virtual directory. FSx needs Active Directory, and Active Directory needs a reboot.

hikeeba commented 4 years ago

Adding this would simplify a lot of items that force a reboot during installation.

avineshwar commented 4 years ago

> Trying to reboot the instance during deployment causes an error:
>
> Script - scripts/reboot.sh
> [stderr]+ reboot
> [stderr]
> [stderr]Session terminated, terminating shell... ...terminated.
>
> It'd be nice to have a way to tell CodeDeploy that the instance is going to reboot and that it has to wait for it to come back up. Or is there already a method for doing that?

Encountered a similar issue today. It would actually be useful to have something like this.

afinzel commented 4 years ago

I got hit by this gem today as well. It would be good to have it recover.

andrewlorien commented 4 years ago

My use case: our Windows/IIS environment requires IIS, AD, CodeDeploy, CloudWatch, and various other security/monitoring agents to be installed and configured on boot, which requires either one or two reboots (I know, Windows ◔_◔). When our ASG scales up, the deployment begins as soon as the agent registers itself, which may be before AD registration is complete. In this case, deployments fail with no output and no error message, showing only the step they were at when the reboot happened.
If it's too hard to continue the deployment after a reboot, another option would be to add a message like "codedeploy agent did not respond" so that it's clear the problem was with the agent/instance rather than the appspec script.

philstrong commented 3 years ago

We're looking into providing support for this and for general patching of instances.

pixeltrix commented 3 years ago

@philstrong this is particularly an issue with the deb package not honouring policy-rc.d (see #44 and #107) during an instance refresh. Our ansible config tries to stop and disable it before the deployment kicks in but it's not always successful - especially on larger instances. It seemed worse today with our first instance refresh after the recent 1.3.2 release - maybe it's booting quicker with the SDK v3?

I get the comments in those issues about wanting to provide a system agnostic option to the installer about not starting on installation but it seems even without that, the postinst script should honour policy-rc.d anyway? Having looked at the postinst script it seems an easy enough change:

  if systemctl >>/dev/null 2>/dev/null; then
    systemctl enable codedeploy-agent
    systemctl start codedeploy-agent
  else
    update-rc.d codedeploy-agent defaults
    service codedeploy-agent start-no-update
  fi

Instead of systemctl and service, the block should use deb-systemd-invoke and invoke-rc.d, like this:

  if systemctl >>/dev/null 2>/dev/null; then
    deb-systemd-invoke enable codedeploy-agent
    deb-systemd-invoke start codedeploy-agent
  else
    update-rc.d codedeploy-agent defaults
    invoke-rc.d codedeploy-agent start-no-update
  fi

Happy to help test this if you need someone with the appropriate setup.
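
For readers unfamiliar with the convention being discussed: invoke-rc.d (and the systemd wrappers built on it) consult /usr/sbin/policy-rc.d before acting, and an exit status of 101 from that script means "action forbidden by policy". The following is only a minimal sketch of such a policy, written as a shell function so it can be tried without installing anything. The function name `policy_rc_d`, the choice to match both `codedeploy-agent` and `codedeploy-agent.service`, and the decision to block only start-like actions are assumptions for illustration, not something taken from the agent's packaging.

```shell
#!/bin/sh
# Hypothetical policy sketch (not part of the codedeploy-agent package).
# Arguments follow the policy-rc.d calling convention:
#   $1 = init script / unit id, $2 = requested action(s)
# Return status 101 = action forbidden by policy, 0 = action allowed.
policy_rc_d() {
  service_id="$1"
  action="$2"
  case "$service_id" in
    codedeploy-agent|codedeploy-agent.service)
      case "$action" in
        # Matches start, restart and any action list containing "start".
        *start*) return 101 ;;
      esac
      ;;
  esac
  # Everything else is left alone.
  return 0
}
```

Installed as an executable /usr/sbin/policy-rc.d for the duration of provisioning and removed afterwards, a script along these lines would only have an effect if the package's postinst goes through the policy-aware wrappers, which is the change proposed above.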

nicholas78719 commented 2 years ago

Any progress on this?

pixeltrix commented 2 years ago

@nicholas78719 it appears not, as yet again I had deployment failures during an instance refresh this morning due to the agent starting the deploy automatically after it was installed.

pixeltrix commented 2 years ago

Yet again I've had a deployment fail this morning during an instance refresh, purely because I can't stop the agent quickly enough. I get that you want to implement a platform-agnostic solution, but even with that the deb package should honour policy-rc.d, so I don't understand what the blocker is here.

pixeltrix commented 1 year ago

More instance refresh failures today, as the latest focal AMI needs a reboot after installing updates. Again, why can't the deb packaging honour policy-rc.d? This should be the case irrespective of whether a feature for rebooting the instance during deployment is developed or not.

pixeltrix commented 1 year ago

Again I'm asking: why can't the deb packaging honour policy-rc.d? This is unrelated to rebooting the instance with a running agent; it's only about making sure that installing the agent doesn't automatically start it. As it stands this morning, there are updates published that require a restart but there is no updated AMI yet:

[image]

This means that until a new AMI is published, scaling or refreshing instances has a high probability of failure if we try to reboot them after the cloud-init process is complete as it's a race between stopping the agent after it's automatically started and the agent triggering the deploy. The only alternative is to not reboot them after the cloud-init process is complete and then manually reboot them once the deploy is complete.

I'm really at a loss to understand why a tiny change can't be made to the deb packaging to honour the conventions - can you please enlighten me? 🙏🏻

pixeltrix commented 1 year ago

For anyone else struggling with this, here's a set of ansible steps that downloads the deb package, unpacks it, patches it and then repackages it for installation so that it can be installed during cloud-init without automatically starting:

- name: Check if CodeDeploy Agent is installed
  shell: dpkg-query -W -f='${Status}' codedeploy-agent | grep 'install ok installed'
  register: codedeploy_installed
  failed_when: no
  changed_when: no

- name: Fetch CodeDeploy Agent
  command: aws s3 cp --region {{ aws_region }} s3://aws-codedeploy-{{ aws_region }}/latest/codedeploy-agent_all.deb /tmp/codedeploy-agent_all.deb
  args:
    creates: /tmp/codedeploy-agent_all.deb
  when: codedeploy_installed.rc != 0

- name: Remove any existing working directory
  file:
    path: /tmp/codedeploy-agent
    state: absent
  when: codedeploy_installed.rc != 0

- name: Extract CodeDeploy Agent package
  command: dpkg-deb -x /tmp/codedeploy-agent_all.deb /tmp/codedeploy-agent
  when: codedeploy_installed.rc != 0

- name: Extract CodeDeploy Agent control files
  command: dpkg-deb --control /tmp/codedeploy-agent_all.deb /tmp/DEBIAN
  when: codedeploy_installed.rc != 0

- name: Replace systemctl enable line
  lineinfile:
    path: "/tmp/DEBIAN/postinst"
    regexp: '^    systemctl enable codedeploy-agent$'
    line: '    deb-systemd-invoke enable codedeploy-agent'
  when: codedeploy_installed.rc != 0

- name: Replace systemctl start line
  lineinfile:
    path: "/tmp/DEBIAN/postinst"
    regexp: '^    systemctl start codedeploy-agent$'
    line: '    deb-systemd-invoke start codedeploy-agent'
  when: codedeploy_installed.rc != 0

- name: Replace service start line
  lineinfile:
    path: "/tmp/DEBIAN/postinst"
    regexp: '^    service codedeploy-agent start-no-update$'
    line: '    invoke-rc.d codedeploy-agent start-no-update'
  when: codedeploy_installed.rc != 0

- name: Move control files inside package directory
  command: mv /tmp/DEBIAN /tmp/codedeploy-agent
  when: codedeploy_installed.rc != 0

- name: Build new package file
  command: dpkg -b /tmp/codedeploy-agent /tmp/codedeploy-agent-patched_all.deb
  when: codedeploy_installed.rc != 0

- name: Install patched CodeDeploy Agent package
  apt:
    deb: "/tmp/codedeploy-agent-patched_all.deb"
  when: codedeploy_installed.rc != 0

- name: Enable the CodeDeploy Agent
  service: name=codedeploy-agent state=stopped enabled=yes
  when: codedeploy_installed.rc != 0

Our userdata bash script then ends with the following:

# Reboot if required, otherwise start codedeploy-agent
if [[ -e /var/run/reboot-required ]]; then
  reboot
else
  service codedeploy-agent start
fi

This either reboots the instance, after which the agent starts automatically, or starts the agent manually if no reboot is required.

Hope this is of help to someone.
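
For anyone not using Ansible, the three lineinfile tasks above amount to plain text substitutions on the extracted postinst script. A rough shell equivalent might look like the sketch below. The function name `patch_postinst` is hypothetical, and unlike lineinfile (which rewrites a single matching line) sed rewrites every matching line and preserves whatever indentation it finds; this assumes the postinst still contains the lines quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: rewrite direct service-manager calls in a postinst
# script so they go through the policy-aware Debian wrappers.
# $1 = path to the extracted postinst file (edited in place).
patch_postinst() {
  postinst="$1"
  tmpfile="${postinst}.patched"
  sed \
    -e 's/^\( *\)systemctl enable codedeploy-agent$/\1deb-systemd-invoke enable codedeploy-agent/' \
    -e 's/^\( *\)systemctl start codedeploy-agent$/\1deb-systemd-invoke start codedeploy-agent/' \
    -e 's/^\( *\)service codedeploy-agent start-no-update$/\1invoke-rc.d codedeploy-agent start-no-update/' \
    "$postinst" > "$tmpfile" || return 1
  # Copy the contents back rather than renaming, so the original file keeps
  # its executable mode.
  cat "$tmpfile" > "$postinst" && rm -f "$tmpfile"
}
```

With the control files extracted as in the steps above, this would be called as `patch_postinst /tmp/DEBIAN/postinst` before rebuilding the package.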

markconverts commented 11 months ago

I am also looking for a reboot feature.