aws / aws-codedeploy-agent

Host Agent for AWS CodeDeploy
https://aws.amazon.com/codedeploy
Apache License 2.0

VPC Endpoint Codedeploy Issues #272

Closed SIVA451 closed 3 years ago

SIVA451 commented 4 years ago

2020-10-15T03:36:34 ERROR [codedeploy-agent(5728)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Cannot reach InstanceService: Aws::CodeDeployCommand::Errors::UnknownOperationException -
2020-10-15T03:36:34 DEBUG [codedeploy-agent(5728)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Sleeping 16 seconds.
2020-10-15T03:36:51 DEBUG [codedeploy-agent(5728)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Calling PollHostCommand:
2020-10-15T03:36:51 INFO [codedeploy-agent(5728)]: Version file found in C:/ProgramData/Amazon/CodeDeploy/.version with agent version OFFICIAL_1.2.1.1868_msi.
2020-10-15T03:36:51 INFO [codedeploy-agent(5728)]: [Aws::CodeDeployCommand::Client 400 0.06269 0 retries] poll_host_command(host_identifier:"arn:aws:ec2:us-west-2:099571392609:instance/i-090xxxxxxxx") Aws::CodeDeployCommand::Errors::UnknownOperationException

After we created the VPC endpoint, the agent started reporting the above error for every new deployment.

maverick-leo commented 4 years ago

I faced the same issue, even after setting :enable_auth_policy: true in the agent configuration file on Windows.

2020-10-14T16:39:52 DEBUG [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Configuring deploy control client: Region = "us-west-2"
2020-10-14T16:39:52 DEBUG [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Deploy control endpoint override = nil
2020-10-14T16:39:52 INFO [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandExecutor: Archives to retain is: 5}
2020-10-14T16:39:52 DEBUG [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Initializing Host Agent: Host Identifier = arn:aws:ec2:us-west-2:08384039addad:instance/i-061e1fxxxxxxexx
2020-10-14T16:39:52 DEBUG [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Validating CodeDeploy Plugin Configuration
2020-10-14T16:40:14 ERROR [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CodeDeployControl: Error during certificate verification on codedeploy endpoint https://codedeploy-commands.us-west-2.amazonaws.com
2020-10-14T16:40:14 DEBUG [codedeploy-agent(4340)]: InstanceAgent::Plugins::CodeDeployPlugin::CodeDeployControl: #<Errno::ETIMEDOUT: Failed to open TCP connection to codedeploy-commands.us-west-2.amazonaws.com:443 (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - connect(2) for "codedeploy-commands.us-west-2.amazonaws.com" port 443)>
2020-10-14T16:40:14 ERROR [codedeploy-agent(4340)]: Error validating the SSL configuration: Invalid server certificate
2020-10-14T16:40:14 INFO [codedeploy-agent(4340)]: CodeDeploy Instance Agent Service: stopping the agent
2020-10-14T16:40:14 ERROR [codedeploy-agent(4340)]: CodeDeploy Instance Agent Service: CodeDeploy Instance Agent Service: error during start or run: NoMethodError - undefined method `graceful_shutdown' for nil:NilClass - C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:60:in `block in service_stop'
C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:59:in `synchronize'
C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:59:in `service_stop'
C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:46:in `rescue in block in service_main'
C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:40:in `block in service_main'
C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:89:in `with_error_handling'
C:/Windows/TEMP/ocrECDB.tmp/src/opt/codedeploy-agent/bin/winagent.rb:38:in `service_main'
C:/Windows/Temp/ocrECDB.tmp/lib/ruby/gems/2.3.0/gems/win32-service-0.8.10/lib/win32/daemon.rb:316:in `mainloop'
C:/Windows/Temp/ocrECDB.tmp/lib/ruby/gems/2.3.0/gems/win32-service-0.8.10/lib/win32/daemon.rb:214:in `mainloop'
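
The log above reports both a TCP timeout (Errno::ETIMEDOUT) and an "Invalid server certificate" error for the same endpoint, so it can help to check plain TCP reachability separately from TLS. Below is a minimal sketch, assuming Python is available on the instance. `can_connect` is a hypothetical helper written for this illustration and is not part of the CodeDeploy agent.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; it does no TLS handshake,
    so it tests network reachability, not certificate validity.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections, and DNS failures.
        return False
```

Calling `can_connect("codedeploy-commands.us-west-2.amazonaws.com")` with the hostname taken from the log would then show whether the instance can open the connection at all. If it returns False, the certificate error is probably a side effect of the timeout rather than a separate problem.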

SIVA451 commented 4 years ago

It looks like everyone has started using the VPC endpoint for CodeDeploy. I know there is a ticket open in the AWS internal tracking system. Let's wait for a solution from AWS.

dljvette commented 4 years ago

Thank you for reporting the issue here. CodeDeploy is aware of the limitation with using VPCe on Windows and is working on a fix, which we will release as soon as we have everything validated. This affects only Windows instances and unfortunately prevents VPCe from working for those instances until we can publish an update. We will post updates here as we have them. - Dan

philstrong commented 3 years ago

1.3.1 will have a fix for VPCe aka PrivateLink on Windows. Working to release it before end of year. Holidays are making it tough, but we're trying to push through. Bear with us 🙏

charlesgardiner commented 3 years ago

For the impatient: is there a workaround for this? Can I change the version of the CodeDeploy agent running on the AWS instances and get a successful deploy when running inside a VPC?

philstrong commented 3 years ago

1.3.1 is out in most regions, so you should be able to update now. We usually wait until every region has it before updating GitHub.

charlesgardiner commented 3 years ago

I appreciate your quick reply. . . .

I updated the CodeDeploy agent to version 1.3.1. I am running on a Windows 2012 instance. The following error is in the log:

2020-12-30T22:50:33 ERROR [codedeploy-agent(2192)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Cannot reach InstanceService: Aws::CodeDeployCommand::Errors::UnknownOperationException - 
2020-12-30T22:50:33 DEBUG [codedeploy-agent(2192)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Sleeping 33 seconds.
2020-12-30T22:51:07 DEBUG [codedeploy-agent(2192)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Calling PollHostCommand:
2020-12-30T22:51:07 INFO  [codedeploy-agent(2192)]: Version file found in C:/ProgramData/Amazon/CodeDeploy/.version with agent version OFFICIAL_1.3.1.1880_msi.
2020-12-30T22:51:07 INFO  [codedeploy-agent(2192)]: [Aws::CodeDeployCommand::Client 400 0.078072 0 retries] poll_host_command(host_identifier:"arn:aws:ec2:us-west-2:172830521890:instance/i-04xxxxxxxxxxx") Aws::CodeDeployCommand::Errors::UnknownOperationException

It seems I may be getting a 400 from the poll host command. I realize this might be the wrong forum for such a post. Is there something obvious I am missing?
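
The 400 in question comes from the bracketed part of the last log line, which appears to hold the client class, HTTP status, elapsed seconds, and retry count. Below is a minimal sketch that pulls those fields out, assuming every such line follows the layout visible in the log above. `parse_sdk_call` and `SdkCall` are hypothetical names for this illustration, not part of the agent or the SDK.

```python
import re
from typing import NamedTuple, Optional


class SdkCall(NamedTuple):
    client: str
    status: int
    seconds: float
    retries: int
    operation: str
    error: Optional[str]


# Layout assumed from the log excerpt:
# [<client> <status> <seconds> <n> retries] <operation>(<params>) <error class>
_SDK_CALL = re.compile(
    r"\[(?P<client>[\w:]+) (?P<status>\d{3}) (?P<seconds>[\d.]+) (?P<retries>\d+) retries\]"
    r"\s+(?P<operation>\w+)\(.*\)\s*(?P<error>\S+)?\s*$"
)


def parse_sdk_call(line: str) -> Optional[SdkCall]:
    """Return the parsed SDK call summary, or None if the line has no such summary."""
    m = _SDK_CALL.search(line)
    if m is None:
        return None
    return SdkCall(
        client=m.group("client"),
        status=int(m.group("status")),
        seconds=float(m.group("seconds")),
        retries=int(m.group("retries")),
        operation=m.group("operation"),
        error=m.group("error"),  # None when the call logged no error class
    )
```

Run against the final INFO line above, this should report status 400 for `poll_host_command` with the UnknownOperationException class as the error, which supports the reading that the endpoint rejected the request rather than the network dropping it.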

philstrong commented 3 years ago

For clarity, there is still a bug on Windows for VPCe. We're adding integration testing so we don't break it again. Expect the next version of the agent to have a permanent fix for this.

philstrong commented 3 years ago

Issue is fixed in 1.3.2
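
For anyone checking a fleet of instances against this fix, the version string the agent logs (for example OFFICIAL_1.3.1.1880_msi in the logs above) can be compared against 1.3.2. Below is a minimal sketch, assuming the string always embeds a dotted numeric version. `agent_version` and `has_vpce_fix` are hypothetical helpers for this illustration.

```python
import re
from typing import Tuple


def agent_version(version_string: str) -> Tuple[int, ...]:
    """Extract the dotted numeric version from a string such as 'OFFICIAL_1.3.1.1880_msi'."""
    m = re.search(r"\d+(?:\.\d+)+", version_string)
    if m is None:
        raise ValueError(f"no dotted version found in {version_string!r}")
    return tuple(int(part) for part in m.group(0).split("."))


def has_vpce_fix(version_string: str, fixed_in: Tuple[int, ...] = (1, 3, 2)) -> bool:
    """Return True if the agent version is at least `fixed_in` (1.3.2 per this thread)."""
    # Compare only as many components as `fixed_in` has, ignoring the build number.
    return agent_version(version_string)[: len(fixed_in)] >= fixed_in
```

With the two version strings quoted in this thread, both 1.2.1 and 1.3.1 come out as not having the fix, which matches the reports above.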