hashicorp / packer-plugin-amazon

Packer plugin for Amazon AMI Builder
https://www.packer.io/docs/builders/amazon
Mozilla Public License 2.0

amazon_import post-processor fails: architecture=arm64 has an invalid format #264

Open cevich opened 2 years ago

cevich commented 2 years ago

Overview of the Issue

Importing a Fedora 37 arm64 (aarch64) image fails with an error message from AWS indicating the architecture value is invalid. Googling for the error suggests:

Ref: https://github.com/NixOS/nixpkgs/issues/52779#issuecomment-721339829

Reproduction Steps

I'm using a Packer (JSON) template with a null builder, a shell-local provisioner (download + verify + qemu-img convert the image), and a post-processor similar to the one posted in the template section below. Simply running packer build with this template fails a few minutes after the upload completes with:

==> fedora-aws-arm64: Running post-processor: artifice
==> fedora-aws-arm64 (artifice): Using these artifact files: /tmp/fedora-aws-arm64.vhdx
==> fedora-aws-arm64: Running post-processor: amazon-import
    fedora-aws-arm64 (amazon-import): Uploading /tmp/fedora-aws-arm64.vhdx to s3://IMPORT-BUCKET-NAME/packer-import-1662046985.vhdx
    fedora-aws-arm64 (amazon-import): Completed upload of /tmp/fedora-aws-arm64.vhdx to s3://IMPORT-BUCKET-NAME/packer-import-1662046985.vhdx
Build 'fedora-aws-arm64' errored after 3 minutes 11 seconds: 1 error(s) occurred:

* Post-processor failed: Failed to start import from s3://IMPORT-BUCKET-NAME/packer-import-1662046985.vhdx: retry count exhausted. Last err: InvalidParameter: Parameter architecture = arm64 has an invalid format.
        status code: 400, request id: c781da9e-39d2-4df8-ba61-8d176a5cae85

==> Wait completed after 3 minutes 11 seconds

==> Some builds didn't complete successfully and had errors:
--> fedora-aws-arm64: 1 error(s) occurred:

* Post-processor failed: Failed to start import from s3://IMPORT-BUCKET-NAME/packer-import-1662046985.vhdx: retry count exhausted. Last err: InvalidParameter: Parameter architecture = arm64 has an invalid format.
        status code: 400, request id: c781da9e-39d2-4df8-ba61-8d176a5cae85

==> Builds finished but no artifacts were created.

However, passing the (already uploaded) S3 object into the aws CLI results in (eventual) aws ec2 describe-import-snapshot-tasks success:

aws ec2 import-snapshot --disk-container Format=VHDX,UserBucket="{S3Bucket=IMPORT-BUCKET-NAME,S3Key=packer-import-1662046985.vhdx}"

Following that with an aws ec2 register-image --cli-input-json "$(<register-image.json)" reports success with an "ImageId": "ami-BLAHBLAH"

This suggests that https://github.com/hashicorp/packer-plugin-amazon/issues/103 may be a good general solution in lieu of fixing this.
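The contents of register-image.json are not shown above. As a rough sketch only, the Python below builds a request body of the kind that aws ec2 register-image --cli-input-json accepts, and pulls the snapshot ID out of describe-import-snapshot-tasks output once the task reports completion. The function names, the root device name, and the ENA and volume settings are assumptions for illustration, not the values used in this report.

```python
import json


def build_register_image_request(name, snapshot_id, architecture="arm64",
                                 boot_mode="uefi", root_device="/dev/sda1"):
    """Build a RegisterImage request body for an imported snapshot.

    The keys are RegisterImage parameters; the values chosen here
    (device name, ENA support, volume settings) are assumptions.
    """
    return {
        "Name": name,
        "Architecture": architecture,
        "BootMode": boot_mode,
        "VirtualizationType": "hvm",
        "EnaSupport": True,
        "RootDeviceName": root_device,
        "BlockDeviceMappings": [
            {
                "DeviceName": root_device,
                "Ebs": {
                    "SnapshotId": snapshot_id,
                    "DeleteOnTermination": True,
                },
            }
        ],
    }


def completed_snapshot_id(describe_output):
    """Return the snapshot ID from describe-import-snapshot-tasks output,
    or None while the first listed task has not completed."""
    detail = describe_output["ImportSnapshotTasks"][0]["SnapshotTaskDetail"]
    if detail.get("Status") != "completed":
        return None
    return detail.get("SnapshotId")


if __name__ == "__main__":
    # Placeholder name and snapshot ID; substitute real values.
    request = build_register_image_request(
        "fedora-aws-arm64", "snap-0123456789abcdef0"
    )
    print(json.dumps(request, indent=2))
```

Writing the printed JSON to register-image.json would give a file of the shape used in the register-image command above, with "Architecture" set explicitly to arm64.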

Plugin and Packer version

1.8.0 and 1.8.3

Simplified Packer Buildfile

  ...cut...
  "post-processors": [
    [
      {
        "type": "artifice",
        "keep_input_artifact": false,
        "files": [
          "{{user `TEMPDIR`}}/{{build_name}}.{{user `IMAGE_IMPORT_FORMAT`}}"
        ]
      },
      {
        "type": "amazon-import",
        "region": "SOMEWHERE",
        "s3_bucket_name": "IMPORT-BUCKET-NAME",
        "ami_users": [
          "1234567890"
        ],
        "ami_name": "{{build_name}}-{{user `IMG_SFX`}}",
        "boot_mode": "uefi",
        "format": "{{user `IMAGE_IMPORT_FORMAT`}}",
        "keep_input_artifact": false,
        "tags": {
          "Name": "{{build_name}}-{{user `IMG_SFX`}}",
          "arch": "x86_64",
          "release": "fedora-{{user `FEDORA_RELEASE`}}"
        }
      },
  ...cut...

Operating system and Environment details

CentOS Stream8 container, credentials specified via AWS_SHARED_CREDENTIALS_FILE env. var.

Log Fragments and crash.log files

Eleven-billion lines like:

2022/09/01 16:44:03 packer-post-processor-amazon-import plugin: status code: 400, request id: 8a9f1460-cf21-4d69-a2f6-87eebf913b43
2022/09/01 16:44:29 packer-post-processor-amazon-import plugin: Retryable error: InvalidParameter: Parameter architecture = arm64 has an invalid format.

cevich commented 1 year ago

Ping - is anyone able to take a look at this?

cevich commented 1 year ago

Ping - is anyone able to take a look at this?

cevich commented 1 year ago

Ping - is anyone able to take a look at this?

lbajolet-hashicorp commented 1 year ago

Hi @cevich,

Thanks for bringing this to our attention. I'm looking at this right now, and it's not super clear how arm64 is supported for this operation; there are some discrepancies between the API and the tools.

The AWS Go SDK in both v1 and v2 only documents supporting x86 and x86_64 for the ImportImage operation on EC2, which is consistent with the error you're reporting. The API docs for EC2 seem to corroborate what is reported here.

On the other hand, botocore and the AWS CLI v2 both report that the API v2016-11-15 supports arm64 as a valid architecture for this call, and I'm not yet able to understand why.

I'll continue digging into this, hopefully we can get this to work in our project too, but this may take some time to figure out how.

I'll keep this ticket updated with my latest findings.

cevich commented 1 year ago

Thanks for looking into this. Yes, I too had problems with the docs. Ultimately I got my needs met by going the import-snapshot and register-image route. However the AWS import service is notoriously unreliable. Sometimes it times out after doing nothing, sometimes it reports the image is bad, then succeeds on a second attempt with the same image. It's frustrating, but if I just keep smashing my "retry" button, eventually it spits an AMI out :confused:
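For anyone scripting the same "retry until it works" approach, here is a minimal sketch of a retry wrapper. The operation callable is hypothetical: it stands in for whatever starts the import and raises on failure. The attempt count and delay are arbitrary assumptions, not values recommended by AWS.

```python
import time


def retry(operation, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    `operation` is a hypothetical zero-argument callable that raises an
    exception when the import fails. `sleep` is injectable for testing.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # deliberately broad for a sketch
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

A real script would probably want to retry only on the failure modes described above (timeouts, spurious "bad image" reports) and fail fast on errors such as invalid parameters, which will never succeed on retry.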

lbajolet-hashicorp commented 1 year ago

Yes, after re-reading your first message and the related issue, I figured out that import-snapshot + register-image would be a good fix for this problem. I considered replacing import-image with this route entirely, but if the service is unreliable I may just leave the choice up to the users, as initially proposed in the PR opened by the author of the other issue.

Anyway, I will likely open a PR for this soon, building on his work. I still need to iron out some testing, as the post-processor is rather barren for now, and I would like to have at least a few safeguards to make sure I don't break everything when we merge this.

I'll keep this issue updated with my progress. When the PR is available, if you're up to the task, I would very much like you to test it out and see if we have a working solution for your problem!

Thanks again for opening this and letting us know of the issue.

cevich commented 1 year ago

Sure, I'll gladly help test, since it means moving off of a horrible manual hack/workaround process I put in place. I've got a pile of warm, sharp scalpels ready to slice my workaround out :rofl:

anish commented 1 year ago

@lbajolet-hashicorp I'm happy to take a look at this if you're busy. We implemented a temporary workaround that does a snapshot import using Terraform, but have been working on arm support in Packer as a long-term fix:

https://github.com/hashicorp/packer-plugin-qemu/pull/118
https://github.com/hashicorp/packer-plugin-googlecompute/pull/147

FenderDOOD commented 3 months ago

I think I figured this out.... Previously if you did not include the 'platform' parameter in your JSON file, the value 'linux' was assumed. That seems to no longer be the default behavior.

In your amazon-import block, make sure you include a line that says: "platform": "linux",

  ...cut...
  "post-processors": [
    [
      {
        "type": "artifice",
        "keep_input_artifact": false,
        "files": [
          "{{user `TEMPDIR`}}/{{build_name}}.{{user `IMAGE_IMPORT_FORMAT`}}"
        ]
      },
      {
        "type": "amazon-import",
        "region": "SOMEWHERE",
        "s3_bucket_name": "IMPORT-BUCKET-NAME",
        "ami_users": [
          "1234567890"
        ],
        "ami_name": "{{build_name}}-{{user `IMG_SFX`}}",
        "boot_mode": "uefi",
        "format": "{{user `IMAGE_IMPORT_FORMAT`}}",
        "keep_input_artifact": false,
        "platform": "linux",
        "tags": {
          "Name": "{{build_name}}-{{user `IMG_SFX`}}",
          "arch": "x86_64",
          "release": "fedora-{{user `FEDORA_RELEASE`}}"
        }
      },
  ...cut...

Or "platform": "windows" if that is the case.

Hope that helps.
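To check an existing JSON template for this, a small sketch like the following can list the amazon-import post-processors that do not set "platform". The function name is made up for illustration, and whether adding the key actually resolves the arm64 error has not been confirmed in this thread.

```python
import json


def imports_missing_platform(template):
    """Return positions of amazon-import post-processors lacking "platform".

    Entries in "post-processors" may be single objects or nested lists
    (sequences), so nested lists are flattened one level first. Positions
    index into that flattened list.
    """
    flat = []
    for entry in template.get("post-processors", []):
        flat.extend(entry if isinstance(entry, list) else [entry])
    return [
        index
        for index, post_processor in enumerate(flat)
        if isinstance(post_processor, dict)
        and post_processor.get("type") == "amazon-import"
        and "platform" not in post_processor
    ]


if __name__ == "__main__":
    # A cut-down template with the same nesting as the one above.
    text = '{"post-processors": [[{"type": "artifice"}, {"type": "amazon-import"}]]}'
    print(imports_missing_platform(json.loads(text)))
```

Any position it reports is an amazon-import block that relies on the default platform value.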

cevich commented 3 months ago

> I think I figured this out.... Previously if you did not include the 'platform' parameter ... Or "platform": "windows" if that is the case,
>
> Hope that helps.

Oh! This is really fantastic news and will (hopefully) let me simplify a complex workaround. I'll give your suggestion a try and close this if it fixes it.