aws / aws-codedeploy-agent

Host Agent for AWS CodeDeploy
https://aws.amazon.com/codedeploy
Apache License 2.0

systemd service runs too early #227

Open ronaldfenner opened 4 years ago

ronaldfenner commented 4 years ago

It looks like the systemd service is starting too early, so the CommandPoller process is unable to start. If I log in and restart the service, it starts working normally. After rebooting the instance, it stops working again and has to be restarted.

Log I'm seeing before it's restarted:

2019-10-21 20:54:03 INFO [codedeploy-agent(3866)]: On Premises config file does not exist or not readable
2019-10-21 20:54:03 ERROR [codedeploy-agent(3866)]: booting child: error during start or run: Aws::Errors::MissingRegionError - missing region; use :reg$
/opt/codedeploy-agent/vendor/gems/aws-sdk-core-2.10.104/lib/seahorse/client/base.rb:84:in `block in after_initialize'
/opt/codedeploy-agent/vendor/gems/aws-sdk-core-2.10.104/lib/seahorse/client/base.rb:83:in `each'
/opt/codedeploy-agent/vendor/gems/aws-sdk-core-2.10.104/lib/seahorse/client/base.rb:83:in `after_initialize'
/opt/codedeploy-agent/vendor/gems/aws-sdk-core-2.10.104/lib/seahorse/client/base.rb:21:in `initialize'
/opt/codedeploy-agent/vendor/gems/aws-sdk-core-2.10.104/lib/seahorse/client/base.rb:105:in `new'
/opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/codedeploy_control.rb:41:in `get_client'
/opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/command_poller.rb:45:in `initialize'
/opt/codedeploy-agent/lib/instance_agent/agent/base.rb:10:in `new'
/opt/codedeploy-agent/lib/instance_agent/agent/base.rb:10:in `runner'
/opt/codedeploy-agent/lib/instance_agent/runner/child.rb:32:in `block in prepare_run'
/opt/codedeploy-agent/lib/instance_agent/runner/child.rb:78:in `with_error_handling'
/opt/codedeploy-agent/lib/instance_agent/runner/child.rb:31:in `prepare_run'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/child.rb:64:in `block in prepare_run_with_error_handling'
/opt/codedeploy-agent/lib/instance_agent/runner/child.rb:78:in `with_error_handling'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/child.rb:63:in `prepare_run_with_error_handling'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/child.rb:20:in `start'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:206:in `block in spawn_child'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:204:in `fork'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:204:in `spawn_child'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:196:in `block in spawn_children'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:195:in `times'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:195:in `spawn_children'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:134:in `start'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:37:in `block in start'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:36:in `fork'
/opt/codedeploy-agent/vendor/gems/process_manager-0.0.13/lib/process_manager/master.rb:36:in `start'
/opt/codedeploy-agent/bin/../lib/codedeploy-agent.rb:43:in `block (2 levels) in <main>'
/opt/codedeploy-agent/vendor/gems/gli-2.11.0/lib/gli/command_support.rb:126:in `call'
/opt/codedeploy-agent/vendor/gems/gli-2.11.0/lib/gli/command_support.rb:126:in `execute'
/opt/codedeploy-agent/vendor/gems/gli-2.11.0/lib/gli/app_support.rb:284:in `block in call_command'
/opt/codedeploy-agent/vendor/gems/gli-2.11.0/lib/gli/app_support.rb:297:in `call'
/opt/codedeploy-agent/vendor/gems/gli-2.11.0/lib/gli/app_support.rb:297:in `call_command'
/opt/codedeploy-agent/vendor/gems/gli-2.11.0/lib/gli/app_support.rb:79:in `run'
/opt/codedeploy-agent/bin/../lib/codedeploy-agent.rb:90:in `<main>'
2

If I add these lines to the service file so that it waits until the network is online, it starts properly:

Requires=network-online.target
After=network-online.target
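For readers who would rather not edit the packaged unit file, the same ordering can be expressed as a systemd drop-in. This is a minimal sketch, not the agent's shipped configuration: the drop-in file name is hypothetical, and it uses `Wants=` instead of the `Requires=` shown above, which is the form the systemd documentation generally suggests for `network-online.target`.

```ini
# /etc/systemd/system/codedeploy-agent.service.d/10-network-online.conf
# (hypothetical file name; systemd reads any *.conf file in this directory)
[Unit]
# Pull in network-online.target, without failing the agent if it cannot be reached
Wants=network-online.target
# Do not start the agent until network-online.target has been reached
After=network-online.target
```

After adding the file, `systemctl daemon-reload` makes systemd pick it up. Note that `network-online.target` only delays startup if the distribution enables a service that waits for the network, so the effect may differ between images.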

sumith-cp commented 4 years ago

What is your OS distro and agent version?

ronaldfenner commented 4 years ago

I'm using Amazon Linux 2:

Linux 4.14.146-120.181.amzn2.x86_64 #1 SMP Fri Oct 18 17:01:06 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Agent version is agent_version: OFFICIAL_1.0-1.1597_rpm

brndnblck commented 4 years ago

@ronaldfenner happy to look into this for you. For clarity and to save time, can you reply here with a short list of steps to reproduce this issue?

ronaldfenner commented 4 years ago

Install the CodeDeploy agent on Amazon Linux 2 and reboot.

The problem, as I pointed out above, was that codedeploy-agent.service didn't have any targets to wait on, so it started before the network services were online.

It looks like @FlorianHeigl added an After=network.target to the service file in commit 4cbe67d77afe1fd1a9571d819ec6f3661724a507, shortly after I reported this.

I haven't updated the CodeDeploy agent in our base image since then, so I'm not sure whether the new service file fixes it.

This is the service file I use, which also works:

`
[Unit]
Description=AWS CodeDeploy Host Agent
Requires=network-online.target
After=network-online.target

[Service]
Type=forking
ExecStart=/bin/bash -a -c '[ -f /etc/profile ] && source /etc/profile; /opt/codedeploy-agent/bin/codedeploy-agent start'
ExecStop=/opt/codedeploy-agent/bin/codedeploy-agent stop
RemainAfterExit=no
Restart=on-failure

# Uncomment the following line to run the agent as the codedeploy user
# Note: The user must first exist on the system
#User=codedeploy

[Install]
WantedBy=multi-user.target
`