aws / aws-sdk

Landing page for the AWS SDKs on GitHub
https://aws.amazon.com/tools/

[feature] provide consistent results for `aws ec2 import-image` and provide separate errors to allow triage #808

Closed PaulCharlton closed 1 week ago

PaulCharlton commented 2 months ago

Describe the bug

this works:

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json --dry-run

An error occurred (DryRunOperation) when calling the ImportImage operation: Request would have succeeded, but DryRun flag is set

and then, this fails:

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json

An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions

Failure after dry run success is inconsistent with existing documentation.

context:

aws ec2 import-image --version
aws-cli/2.17.25 Python/3.11.9 Darwin/23.6.0 source/arm64

Expected Behavior

--dry-run should fail if subsequent call without --dry-run is going to fail

Too much ambiguity in the error response: An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions

The error response should indicate the precise nature of the error, such as:

1) can not use trust policy due to mismatched principal
2) missing permission GetObject for S3 bucket access
3) specified role `vmimport` does not exist
4) `vmimport` role does not have appropriate trust relationship with user running the command
5) STS is not enabled for your account in the target region
6) `sufficient permission` is an inadequate response. What would be suitable is `permission s3:getObject is required`.
7) ...

In reviewing reports of `aws ec2 import-image` errors on various Internet forums, there are literally a dozen root causes which can cause the single error above.

Current Behavior

aws ec2 import-image should work if --dry-run is working [this is what the documentation states]

aws ec2 import-image help shows:

       --dry-run | --no-dry-run (boolean)
          Checks whether you have the required permissions for the action,
          without actually making the request, and provides an error response.
          If you have the required permissions, the error response is
          DryRunOperation . Otherwise, it is UnauthorizedOperation .
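To make the distinction above concrete, here is a minimal sketch (not part of the AWS CLI) that pulls the error code out of the CLI's error line and says what each outcome covers. The function names `aws_error_code` and `describe_dry_run_result` are hypothetical, and the interpretation of `DryRunOperation` is taken only from the help text quoted above, which promises a permission check for the action and nothing more.

```shell
#!/usr/bin/env bash

# Extract the error code from an AWS CLI error line of the form
#   "An error occurred (<Code>) when calling the <Op> operation: <message>"
# Prints "Unknown" when the text does not match that shape.
aws_error_code() {
  local -r message="${1}"
  local -r pattern='An error occurred \(([A-Za-z0-9.]+)\)'
  if [[ "${message}" =~ ${pattern} ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    printf 'Unknown\n'
  fi
}

# Describe what the result of a --dry-run call does and does not tell you.
describe_dry_run_result() {
  case "$(aws_error_code "${1}")" in
    DryRunOperation)
      echo "dry run passed: caller has permission for the action (nothing else is promised)" ;;
    UnauthorizedOperation)
      echo "dry run failed: caller lacks permission for the action" ;;
    *)
      echo "failed for a reason other than the dry-run permission check" ;;
  esac
}

# Possible usage (requires AWS credentials, so it is left commented out):
#   result="$(aws ec2 import-image --region us-east-1 --role-name VMImportRole002 \
#     --disk-containers file://.disks.json --dry-run 2>&1)"
#   describe_dry_run_result "${result}"
```

Matching on the code in parentheses rather than on the free-text message keeps the check stable if the wording of the message changes.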

Reproduction Steps

declare -rx S3_REGION="${S3_REGION:-us-east-1}"
declare -rx S3_ACCOUNT_ID="${S3_ACCOUNT_ID:-}"
declare -rx S3_BUCKET_NAME="${S3_BUCKET_NAME:-rawimages002}"
declare -rx S3_VM_IMPORT_POLICY_NAME="${S3_VM_IMPORT_POLICY_NAME:-VMImportPolicy002}"
declare -rx S3_VM_IMPORT_ROLE="${S3_VM_IMPORT_ROLE:-VMImportRole002}"
  # The calls below assume the helper functions defined further down have already been sourced.
  provider_arch='x86_64'
  boot_type='bios'
  export_name='mbr_volume.vmdk'

  aws_create_vm_import_role
  aws_put_vm_import_role_policy

  # bucket already exists
  aws s3 cp \
    --region "${S3_REGION}" \
    ".results/${provider_arch}/${export_name}" \
    "s3://${S3_BUCKET_NAME}/${provider_arch}/${export_name}"

  aws_ec2_import_image "${provider_arch}" "${boot_type}" "${export_name}"

aws_ec2_import_image() {
  local -r provider_arch="${1}"
  local -r boot_type="${2}"
  local -r export_name="${3}"
  aws ec2 import-image \
    --region "${S3_REGION}" \
    --role-name "${S3_VM_IMPORT_ROLE}" \
    --disk-containers "file://"<(aws_disk_containers "${provider_arch}" "${boot_type}" "${export_name}")
}

aws_disk_containers() {
  local -r provider_arch="${1}"
  local -r boot_type="${2}"
  local -r export_name="${3}"
  cat <<CONTAINER_JSON
[ 
  { 
    "Description": "Image for ${provider_arch} with ${boot_type}",
    "Format": "vmdk",
    "UserBucket": {
      "S3Bucket": "${S3_BUCKET_NAME}",
      "S3Key": "${provider_arch}/${export_name}"
    }
  }
]
CONTAINER_JSON
}

aws_create_vm_import_role() {
  local role_arn
  role_arn=$(aws iam get-role --role-name "${S3_VM_IMPORT_ROLE}" --query 'Role.Arn' --output text 2>/dev/null)
  if [ -z "${role_arn}" ]; then
    aws iam create-role \
      --region "${S3_REGION}" \
      --role-name "${S3_VM_IMPORT_ROLE}" \
      --assume-role-policy-document "$(aws_vm_import_role_trust_policy)"
  fi
}

aws_vm_import_role_trust_policy() {
cat <<TRUST_POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
       "Principal": {
          "Service": "vmie.amazonaws.com"
       },
       "Action": "sts:AssumeRole",
       "Condition": {
          "StringEquals":{
             "sts:Externalid": "${S3_VM_IMPORT_ROLE}"
          }
       }
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${S3_ACCOUNT_ID}:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
TRUST_POLICY
}

aws_put_vm_import_role_policy() {
  aws iam put-role-policy \
    --region "${S3_REGION}" \
    --role-name "${S3_VM_IMPORT_ROLE}" \
    --policy-name "${S3_VM_IMPORT_POLICY_NAME}" \
    --policy-document "file://"<(aws_vm_import_role_policy)
} 

aws_vm_import_role_policy() {
  cat <<ROLE_POLICY
{   
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect": "Allow",
         "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket"
         ],
         "Resource": [
            "arn:aws:s3:::${S3_BUCKET_NAME}",
            "arn:aws:s3:::${S3_BUCKET_NAME}/*"
         ]
      },
      {
         "Effect": "Allow",
         "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket",
            "s3:PutObject",
            "s3:GetBucketAcl"
         ],
         "Resource": [
            "arn:aws:s3:::export-bucket",
            "arn:aws:s3:::export-bucket/*"
         ]
      },
      {     
         "Effect": "Allow",
         "Action": [
            "ec2:ModifySnapshotAttribute",
            "ec2:CopySnapshot",
            "ec2:RegisterImage",
            "ec2:Describe*"
         ],
         "Resource": "*"
      },
      {
        "Effect": "Allow",
        "Action": [
          "kms:CreateGrant",
          "kms:Decrypt",
          "kms:DescribeKey",
          "kms:Encrypt",
          "kms:GenerateDataKey*",
          "kms:ReEncrypt*"
        ],
        "Resource": "*"
      },
      {
        "Effect": "Allow",
        "Action": [
          "license-manager:GetLicenseConfiguration",
          "license-manager:UpdateLicenseSpecificationsForResource",
          "license-manager:ListLicenseSpecificationsForResource"
        ],
        "Resource": "*"
      }
   ]
}
ROLE_POLICY
}

Possible Solution

The error response should indicate the precise nature of the error, such as:

1) can not use trust policy due to mismatched principal
2) missing permission GetObject for S3 bucket access
3) specified role `vmimport` does not exist
4) `vmimport` role does not have appropriate trust relationship with user running the command
5) STS is not enabled for your account in the target region
6) `sufficient permission` is an inadequate response. What would be suitable is `permission s3:getObject is required`.
7) ...

In reviewing reports of `aws ec2 import-image` errors on various Internet forums, there are literally a dozen root causes which can cause the single error above.

ps: CloudTrail logs are also not showing the specific failed operation.

Additional Information/Context

aws ec2 import-image --version
aws-cli/2.17.25 Python/3.11.9 Darwin/23.6.0 source/arm64

CLI version used

aws-cli/2.17.25 Python/3.11.9 Darwin/23.6.0 source/arm64

Environment details (OS name and version, etc.)

Darwin 14.5

tim-finnigan commented 2 months ago

Thanks for reaching out. The issue you described is with the EC2 ImportImage API / EC2 error codes rather than with the AWS CLI directly. We can reach out to the EC2 team with the request to improve the error messages here (ref: P149339833). I'll transfer this to our cross-SDK repository for tracking since the issue involves a service API which is used across AWS SDKs in addition to the CLI.

Also there is a related troubleshooting guide: https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-troubleshooting.html#import-image-errors


So there are several possible causes of that error, and the error message could potentially make that clearer.

PaulCharlton commented 2 months ago

Thanks @tim-finnigan. I already went through that troubleshooting guide in detail before I posted here. Something else is going on. One big clue is that the role itself has never been accessed, which means that the InvalidParameter error is being thrown before the import role is ever assumed.

PaulCharlton commented 2 months ago

Status Update

1) unresolved
   1) more specific errors from SDK call
   2) prove that the awscli json payload is correct
2) resolved
   1) immediate problem of inability to use import-image -- in the past, the "vmimport" role was auto-provisioned and managed by AWS on first use of the API. This was deprecated in favor of the account owner creating a new role. Even more recently, in addition to the AWS service being granted an "sts:AssumeRole" policy permission, the API caller user must have the "iam:PassRole" permission on their account, and the API caller user MUST NOT be the "root" account user.

This knowledge regarding "iam:PassRole" does not appear to be available in any online triage protocol for "import-image" that I have found. It was discovered by making the import USER very promiscuous, granting "iam:*" as allowed actions, which made things work, then paring that grant down to WHICH IAM action was actually required.
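For anyone hitting the same wall, a minimal sketch of the pared-down grant might look like the following, written in the style of the reproduction script. The helper names `aws_pass_role_policy` and `aws_put_pass_role_policy` and the policy name `AllowPassVMImportRole` are hypothetical. `S3_ACCOUNT_ID` and `S3_VM_IMPORT_ROLE` are the variables from the script above. Scoping the grant to the single import role is an assumption about what is sufficient, not something confirmed by AWS in this thread.

```shell
#!/usr/bin/env bash

# Emit an identity policy that allows passing exactly one role.
aws_pass_role_policy() {
  local -r account_id="${1}"
  local -r role_name="${2}"
  cat <<PASS_ROLE_POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::${account_id}:role/${role_name}"
    }
  ]
}
PASS_ROLE_POLICY
}

# Attach the policy inline to the (non-root) IAM user that calls import-image.
# Not invoked here; it needs credentials that are allowed to edit IAM.
aws_put_pass_role_policy() {
  local -r user_name="${1}"
  aws iam put-user-policy \
    --user-name "${user_name}" \
    --policy-name "AllowPassVMImportRole" \
    --policy-document "$(aws_pass_role_policy "${S3_ACCOUNT_ID}" "${S3_VM_IMPORT_ROLE}")"
}
```

Limiting `Resource` to the import role's ARN avoids leaving the broad "iam:*" grant in place once the experiment is over.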

===> better telemetry from server-side failures is still needed.

PaulCharlton commented 2 months ago

to the extent that better telemetry would introduce a breaking change if the HTTP response body is altered, the new payload info could be returned via a new response header field.

PaulCharlton commented 2 months ago

Still more useless and ambiguous telemetry. A message which essentially says "upload deleted, invalid image due to missing filesystem components" needs to say what it was expecting and what actually happened. Something like "/etc/fstab is missing", "root volume is missing", "no partition contains the root volume", or "unable to install grub updates" would be much more informative.

https://docs.aws.amazon.com/vm-import/latest/userguide/what-is-vmimport.html

PaulCharlton commented 2 months ago

This one: "An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions" is also flat-out wrong when the actual problem is that the USER invoking the SDK API does not have the "iam:PassRole" action allowed.
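Until the server-side message improves, a client-side pre-flight check can at least rule out the two caller-side causes found in this thread (root caller, missing "iam:PassRole"). The sketch below is one possible approach and the function names are hypothetical. It assumes the caller is a plain IAM user, since `aws iam simulate-principal-policy` expects the ARN of a user, group, or role rather than an assumed-role session, and it assumes the caller is allowed to run the simulation at all.

```shell
#!/usr/bin/env bash

# True when the ARN reported by `aws sts get-caller-identity` is the account root user.
is_root_caller() {
  local -r pattern='^arn:aws[a-z-]*:iam::[0-9]{12}:root$'
  [[ "${1}" =~ ${pattern} ]]
}

# True when an EvalDecision value from simulate-principal-policy means "permitted".
pass_role_allowed() {
  [[ "${1}" == "allowed" ]]
}

# Returns 0 if both caller-side checks pass, 1 if one fails, 2 if an AWS call itself fails.
preflight_import_image() {
  local -r role_arn="${1}"
  local caller_arn decision
  caller_arn="$(aws sts get-caller-identity --query 'Arn' --output text)" || return 2
  if is_root_caller "${caller_arn}"; then
    echo "caller is the root user; use an IAM user instead" >&2
    return 1
  fi
  decision="$(aws iam simulate-principal-policy \
    --policy-source-arn "${caller_arn}" \
    --action-names iam:PassRole \
    --resource-arns "${role_arn}" \
    --query 'EvaluationResults[0].EvalDecision' --output text)" || return 2
  if ! pass_role_allowed "${decision}"; then
    echo "caller lacks iam:PassRole on ${role_arn} (decision: ${decision})" >&2
    return 1
  fi
}
```

A call such as `preflight_import_image "arn:aws:iam::${S3_ACCOUNT_ID}:role/${S3_VM_IMPORT_ROLE}"` before `aws ec2 import-image` would then report which of the two conditions is not met, instead of the single InvalidParameter message.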

PaulCharlton commented 2 months ago

here's another useless message: ClientError: Unknown OS / Missing OS files.

ok, sure ... but which files are missing? Please.

amberkushwaha commented 2 months ago

here's another useless messages clientform but which files are missing in it.

amberkushwaha commented 2 months ago

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json --dry-run

An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions

The context of the file is still in the middle of the conceptual behaviour and more often the file in it.code of conduct.contact the file remember the dialogue in the file concept of it.

Also its been in the contributions.

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json --dry-run

An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions.paste drop

amberkushwaha commented 2 months ago

Add a comment in the main box d=systems of the following file in the circuit.paste drop or click to add files is also code of conduct in it for the given time time period and issues in it were also docs and contact management cookies section circuits were prompted for the main portals.contributing guidelines security policy and code of conduct.manage cookies in the file for interuptions.

PaulCharlton commented 2 months ago

@amberkushwaha I do not understand your word-salad -- other than cut/paste from some of the comments above, why mention a "code of conduct" or "circuits" or "main portals" ?

tim-finnigan commented 1 month ago

Checking in — are there any updates on your end? I've reported amberkushwaha for spam, you can ignore their comments.

PaulCharlton commented 1 month ago

@tim-finnigan nothing much new -- noting that GCP has their import functionality as open-source, so much easier to triage.

tim-finnigan commented 1 month ago

Thanks @PaulCharlton for following up. I'm trying to summarize the status of this for the EC2 team, as the issue is with their ImportImage API: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_ImportImage.html

Is the request to improve a specific error message here? Can we narrow this down to a reproducible set of steps for improving the documentation or error message?

PaulCharlton commented 1 month ago

@tim-finnigan I can not provide sample code for improvements because there is no current visibility into the implementation of the API -- what is abundantly clear is that perhaps dozens of errors on the server side are conflated into one error code on the client side, leaving the client wondering what to do to fix anything.

tim-finnigan commented 1 week ago

Thanks for following up. We created a backlog item for the EC2 team to improve the user experience with these commands. If you have any further specific details that you want to pass along to the team please let us know.

github-actions[bot] commented 1 week ago

This issue is now closed.

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.