awslabs / ami-builder-packer

An example of an AMI Builder using CI/CD with AWS CodePipeline, AWS CodeBuild, Hashicorp Packer and Ansible.
Apache License 2.0

Build failing post build_phase #3

Closed cloudpayload closed 6 years ago

cloudpayload commented 6 years ago

Hi,

I'm trying to run the ami-builder-packer CloudFormation template (CFT) in the AWS us-west-2 region.

Kindly assist: the CodeBuild build fails at the post-build steps. I have tried multiple times and the error is consistent (always at the same stage).

Thanks, Saurabh

[Container] 2018/02/02 22:24:57 Phase complete: BUILD Success: true
[Container] 2018/02/02 22:24:57 Phase context status code: Message:
[Container] 2018/02/02 22:24:57 Entering phase POST_BUILD
[Container] 2018/02/02 22:24:57 Running command egrep "${AWS_REGION}\:\sami-" build.log | cut -d' ' -f2 > ami_id.txt
[Container] 2018/02/02 22:24:57 Running command test -s ami_id.txt || exit 1
[Container] 2018/02/02 22:24:57 Command did not exit successfully test -s ami_id.txt || exit 1 exit status 1
[Container] 2018/02/02 22:24:57 Phase complete: POST_BUILD Success: false

heitorlessa commented 6 years ago

Hi,

That error doesn’t help much. Can you post the full build output?

This looks like the AMI wasn’t successfully built by Packer, since the post-build step couldn’t find the AMI ID in the build log.
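To illustrate why an unsuccessful Packer run surfaces as this POST_BUILD failure, here is a minimal sketch of the same extraction pipeline run against two small sample logs. The helper name `extract_ami_id`, the temporary files, and the AMI ID are hypothetical; the "successful" lines are an assumption about what the end of a good Packer run looks like, and a literal space stands in for the `\s` used in the buildspec command.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the POST_BUILD extraction on hypothetical sample logs.
AWS_REGION="us-west-2"   # region from the original report

extract_ami_id() {
  # Same idea as the buildspec command: find the "<region>: ami-..." line
  # and keep its second space-separated field (the AMI ID).
  grep -E "${AWS_REGION}: ami-" "$1" | cut -d' ' -f2
}

workdir="$(mktemp -d)"

# Assumed tail of a successful run; the AMI ID is made up for illustration.
printf '%s\n' \
  "--> AWS AMI Builder - CIS: AMIs were created:" \
  "us-west-2: ami-0123456789abcdef0" > "$workdir/ok.log"

# Tail of a failed run, as posted in this thread: there is no AMI line.
printf '%s\n' \
  "Build 'AWS AMI Builder - CIS' errored: Error executing Ansible: Non-zero exit status: 2300218" \
  > "$workdir/failed.log"

ok_id="$(extract_ami_id "$workdir/ok.log")"
failed_id="$(extract_ami_id "$workdir/failed.log")"

echo "successful log -> '${ok_id}'"
echo "failed log     -> '${failed_id}'"
```

When the log has no matching line, the pipeline writes nothing, so `ami_id.txt` ends up empty and `test -s ami_id.txt || exit 1` fails the phase even though the BUILD phase itself reported success.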

cloudpayload commented 6 years ago

Thanks for replying. Please find the build output attached below. One EC2 instance was launched by the Packer build; I have confirmed that from the console.

LogFile.txt https://github.com/awslabs/ami-builder-packer/files/1696552/LogFile.txt

Thanks, Saurabh

heitorlessa commented 6 years ago

Thanks for that Saurabh.

Can you do me a favour and test this in one of the regions where we initially battle-tested it: N. Virginia (us-east-1) or Ireland (eu-west-1)?

Looking at the build output, the only suspect I can see is the Ansible CIS role. However, it's hard to figure out why it fails, as a number of factors could be involved: an unsupported AMI that this 3rd-party role doesn't like, a change in the 3rd-party role itself, etc.

My suggestion would be either to try another region, just to make sure the playbook is still valid, or simply to disable the CIS playbook and use your own Ansible playbook for further customization.

Hope that helps

cloudpayload commented 6 years ago

I have tested in N. Virginia (AWS us-east-1) but am still getting the same error. LogFile 2.txt

The instance is launched, and the build then fails at post-build after the command below: "Command did not exit successfully test -s ami_id.txt || exit 1 exit status 1"

I'm not sure, but does the EC2 instance size play a role here? I am using a t2.micro.

Thanks, Saurabh

ajlanghorn commented 6 years ago

@cloudpayload Did you get this working? I saw https://github.com/hashicorp/packer/issues/4623 in the Packer repository, which makes sense to me: Packer's ansible-local provisioner is unable to hold on to a trafficless SSH session for longer than a few seconds, as defined by the EC2 instance's sshd_config. I haven't tested the fix, but the logic passes muster in my mind.

heitorlessa commented 6 years ago

My apologies for the long delay here. I set aside some time in my calendar to build one project from scratch.

This line does look suspicious to me, though, as it relates to the Packer issue @ajlanghorn linked:

 >>>>> AWS AMI Builder - CIS: TASK [anthcourtney.cis-amazon-linux : 3.6.2 - Ensure default deny firewall policy(DROP INPUT)] ***
AWS AMI Builder - CIS: changed: [127.0.0.1] => (item=INPUT)
AWS AMI Builder - CIS: changed: [127.0.0.1] => (item=FORWARD)
==> AWS AMI Builder - CIS: Terminating the source AWS instance...
==> AWS AMI Builder - CIS: Cleaning up any extra volumes...
==> AWS AMI Builder - CIS: No volumes to clean up, skipping
==> AWS AMI Builder - CIS: Deleting temporary security group...
==> AWS AMI Builder - CIS: Deleting temporary keypair...
Build 'AWS AMI Builder - CIS' errored: Error executing Ansible: Non-zero exit status: 2300218

This task seems to set a default DROP policy on the INPUT chain, which could close the established SSH connection (Packer on the CodeBuild container <==> SSH on the EC2 instance), and that would explain the issue.
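Because the POST_BUILD check only reports an empty `ami_id.txt`, the underlying Packer error is easy to miss. As a rough sketch, assuming Packer's output is captured in a log file as in this project, a step like the following could surface the `Build '...' errored` line directly. The function name `packer_errors` and the temporary log file are hypothetical; the sample lines are copied from the excerpt above.

```shell
#!/usr/bin/env bash
# Sketch only: report Packer's own error line from a captured build log.
packer_errors() {
  # Matches lines of the form seen in this thread:
  #   Build '<name>' errored: <reason>
  grep -E "^Build '.*' errored:" "$1"
}

# Sample log built from lines quoted in this issue.
log="$(mktemp)"
cat > "$log" <<'EOF'
==> AWS AMI Builder - CIS: Terminating the source AWS instance...
==> AWS AMI Builder - CIS: Deleting temporary keypair...
Build 'AWS AMI Builder - CIS' errored: Error executing Ansible: Non-zero exit status: 2300218
EOF

if errors="$(packer_errors "$log")"; then
  echo "Packer reported a failed build:"
  echo "$errors"
  status=1
else
  status=0
fi
echo "would exit with status ${status}"
```

Running a check like this before the AMI ID extraction would point at the Ansible failure rather than at the missing AMI ID.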

If you can @cloudpayload, here's what I'd do as I create a project from scratch:

Excerpt of what that playbook.yaml should look like:

---
- hosts: localhost
  connection: local
  gather_facts: true    # gather OS info that is made available for tasks/roles
  become: yes           # majority of CIS tasks require root
  vars:
    # CIS Controls whitepaper:  http://bit.ly/2mGAmUc
    # AWS CIS Whitepaper:       http://bit.ly/2m2Ovrh
    cis_level_1_exclusions:
    # 3.4.2 and 3.4.3 effectively blocks access to all ports to the machine
    ## This can break automation; ignoring it as there are stronger mechanisms than that
    ## Based on issue #3, adding 3.6.2 as it adds a default INPUT DROP policy in ipt
      - 3.4.2 
      - 3.4.3
      - 3.6.2 
    # Cloudwatch Logs will be used instead of Rsyslog/Syslog-ng
    ## Same would be true if any other software that doesn't support Rsyslog/Syslog-ng mechanisms
      - 4.2.1.4
      - 4.2.2.4
      - 4.2.2.5
    # Autofs is no longer installed and we need to ignore it or else will fail
      - 1.1.19

heitorlessa commented 6 years ago

@ajlanghorn and @cloudpayload: as suspected, the failure was due to task 3.6.2, which was introduced in the latest version of that Ansible role (CIS), together with another task (5.3.3) that triggered an issue with Packer's Ansible Local provisioner.

I've just submitted and merged a PR that fixes both of them, and builds are now succeeding consistently.

You can find more details on why these tasks cause problems at the link below:

Ansible Role we depend on has added additional CIS checks: https://github.com/anthcourtney/ansible-role-cis-amazon-linux/commit/240c59fba5275bbb7190fe5bcc7431580a970e88