Closed aidansteele closed 1 week ago
Hello, @aidansteele,
I apologize for any inconvenience caused.
The issue you're experiencing with the aws_ec2_ami_shared
table may be due to the lack of pagination in the API call (pagination wasn't supported earlier). According to the API documentation, it is strongly recommended to use only paginated requests. Unpaginated requests are susceptible to throttling and timeouts..
To address this, I have raised a PR that includes the following enhancements:
owner_ids
and image_ids
to facilitate batch operations.SELECT * FROM aws_ec2_ami_shared WHERE owner_ids = '["123456789089", "938475869606"]'
SELECT * FROM aws_ec2_ami_shared WHERE image_ids = '["ami-003a0d7f585604444", "ami-0038c1e63dd57df14"]'
Please note that the table design has not yet been finalized. In our environment, we have limited resources to test the performance of these changes. It would be greatly appreciated if you could test the PR branch (issue-2259
) and share your observations.
Official blog post query(It should run more smoothly than before):
WITH instances AS (
SELECT
instance_id,
instance_type,
account_id,
tags ->> 'Name' AS instance_name,
_ctx ->> 'connection_name' AS account_name,
instance_state,
region,
image_id
FROM
aws_ec2_instance
)
SELECT DISTINCT
aws_ec2_ami_shared.image_id as image_id,
aws_ec2_ami_shared.owner_id as image_owner_id,
aws_ec2_ami_shared.image_owner_alias as image_owner_name,
instances.instance_name,
instances.account_name,
instances.region,
aws_ec2_ami_shared.name as image_name
FROM
instances
LEFT JOIN aws_ec2_ami_shared ON aws_ec2_ami_shared.image_id=instances.image_id
WHERE aws_ec2_ami_shared.image_owner_alias != 'amazon'
AND aws_ec2_ami_shared.image_owner_alias != 'self'
Taking advantage of batch operation:
with instances as (
select
instance_id,
instance_type,
account_id,
tags ->> 'Name' as instance_name,
_ctx ->> 'connection_name' as account_name,
instance_state,
region,
image_id
from
aws_ec2_instance
),
all_image_ids as (
select
json_agg(image_id)::jsonb as image_ids -- Cast to jsonb
from
instances
),
shared_ami as (
select
s.*
from
aws_ec2_ami_shared as s,
all_image_ids
where s.image_ids = all_image_ids.image_ids
)
select distinct
shared_ami.image_id as image_id,
shared_ami.owner_id as image_owner_id,
shared_ami.image_owner_alias as image_owner_name,
instances.instance_name,
instances.account_name,
instances.region,
shared_ami.name as image_name
from
instances
left join shared_ami on shared_ami.image_id=instances.image_id
where shared_ami.image_owner_alias != 'amazon'
and shared_ami.image_owner_alias != 'self';
Thank you for your understanding and assistance!
Hi @ParthaI, thank you for such a speedy response. My apologies for only getting back to you now.
I attempted to try your PR, but didn't have much luck and I suspect it's a user error on my behalf. This is what I did:
git clone https://github.com/turbot/steampipe-plugin-aws.git
cd steampipe-plugin-aws
git checkout issue-2259
make
I then verified that there was a new binary (based on last modified timestamp) at ~/.steampipe/plugins/hub.steampipe.io/plugins/turbot/aws@latest/steampipe-plugin-aws.plugin
. However, I ran the query and got this error:
> with instances as (
select
instance_id,
instance_type,
account_id,
tags ->> 'Name' as instance_name,
_ctx ->> 'connection_name' as account_name,
instance_state,
region,
image_id
from
aws_ec2_instance
),
all_image_ids as (
select
json_agg(image_id)::jsonb as image_ids -- Cast to jsonb
from
instances
),
shared_ami as (
select
s.*
from
aws_ec2_ami_shared as s,
all_image_ids
where s.image_ids = all_image_ids.image_ids
)
select distinct
shared_ami.image_id as image_id,
shared_ami.owner_id as image_owner_id,
shared_ami.image_owner_alias as image_owner_name,
instances.instance_name,
instances.account_name,
instances.region,
shared_ami.name as image_name
from
instances
left join shared_ami on shared_ami.image_id=instances.image_id
where shared_ami.image_owner_alias != 'amazon'
and shared_ami.image_owner_alias != 'self';
Error: column s.image_ids does not exist (SQLSTATE 42703)
I also tried inspecting the table and didn't see image_ids
column, so I wonder if somehow I am not running the right plugin version?
> .inspect aws_ec2_ami_shared
+-----------------------+--------------------------+----------------------------------------------------------------------------------------------------------------+
| column | type | description |
+-----------------------+--------------------------+----------------------------------------------------------------------------------------------------------------+
| _ctx | jsonb | Steampipe context in JSON form, e.g. connection_name. |
| account_id | text | The AWS Account ID in which the resource is located. |
| akas | jsonb | Array of globally unique identifier strings (also known as) for the resource. |
| architecture | text | The architecture of the image. |
| block_device_mappings | jsonb | Any block device mapping entries. |
| creation_date | timestamp with time zone | The date and time when the image was created. |
| description | text | The description of the AMI that was provided during image creation. |
| ena_support | boolean | Specifies whether enhanced networking with ENA is enabled. |
| hypervisor | text | The hypervisor type of the image. |
| image_id | text | The ID of the AMI. |
| image_location | text | The location of the AMI. |
| image_owner_alias | text | The AWS account alias (for example, amazon, self) or the AWS account ID of the AMI owner. |
| image_type | text | The type of image. |
| imds_support | text | If v2.0, it indicates that IMDSv2 is specified in the AMI. |
| kernel_id | text | The kernel associated with the image, if any. Only applicable for machine images. |
| name | text | The name of the AMI that was provided during image creation. |
| owner_id | text | The AWS account ID of the image owner. |
| partition | text | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). |
| platform | text | This value is set to windows for Windows AMIs; otherwise, it is blank. |
| platform_details | text | The platform details associated with the billing code of the AMI. For more information, see Obtaining Billing |
| | | Information (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ami-billing-info.html) in the Amazon Elastic |
| | | Compute Cloud User Guide. |
| product_codes | jsonb | Any product codes associated with the AMI. |
| public | boolean | Indicates whether the image has public launch permissions. The value is true if this image has public launch p |
| | | ermissions or false if it has only implicit and explicit launch permissions. |
| ramdisk_id | text | The RAM disk associated with the image, if any. Only applicable for machine images. |
| region | text | The AWS Region in which the resource is located. |
| root_device_name | text | The device name of the root device volume (for example, /dev/sda1). |
| root_device_type | text | The type of root device used by the AMI. The AMI can use an EBS volume or an instance store volume. |
| sriov_net_support | text | Specifies whether enhanced networking with the Intel 82599 Virtual Function interface is enabled. |
| state | text | The current state of the AMI. If the state is available, the image is successfully registered and can be used |
| | | to launch an instance. |
| tags | jsonb | A map of tags for the resource. |
| tags_src | jsonb | A list of tags attached to the AMI. |
| title | text | Title of the resource. |
| usage_operation | text | The operation of the Amazon EC2 instance and the billing code that is associated with the AMI. For the list of |
| | | UsageOperation codes, see Platform Details and [Usage Operation Billing Codes](https://docs.aws.amazon.com/AW |
| | | SEC2/latest/UserGuide/ami-billing-info.html#billing-info) in the Amazon Elastic Compute Cloud User Guide. |
| virtualization_type | text | The type of virtualization of the AMI. |
+-----------------------+--------------------------+----------------------------------------------------------------------------------------------------------------+
Hi @aidansteele,
I apologize for the inconvenience. Upon reviewing the issue, I discovered that our current main branch has a bug, which is causing the plugin not to build correctly locally. To address this, I have raised a PR with the necessary fix. If you rebase the issue-2259
branch with remove-duplicate-columns
, you should no longer encounter any errors.
Here are the steps to rebase:
git checkout issue-2259
git pull origin remove-duplicate-columns
make
Thank you for your understanding and patience!
Hi @aidansteele, FYI, the PR has already been merged into the main
branch. Rebasing the branch issue-2259
with main
should be good now.
Hello @aidansteele, Have you had a chance to look at the above comment again?
Hi @aidansteele
We haven’t heard back from you in a while, so we’re going to close this issue for now. If you’re still experiencing this problem or have additional information to share, please feel free to reopen the issue or create a new one.
Thank you for your understanding!
Describe the bug The Steampipe AWS plugin appears to be making excessive duplicate API calls when running well-formed SQL queries. This is causing queries to run much slower than they should.
Steampipe version (
steampipe -v
) Steampipe v0.23.3Plugin version (
steampipe plugin list
) hub.steampipe.io/plugins/turbot/aws 0.116.0To reproduce I ran the Ensuring AMIs are from trusted sources sample query from this official blog post. In case the blog post gets updated, this is the exact SQL that I ran:
The query ultimately failed (for reasons that I suspect are not Steampipe's fault: it got a
RequestExpired: Request has expired
response from the EC2 API, indicating a clock skew issue), but it made at least 53,899 calls toec2:DescribeImages
as per CloudTrail. A lot of these were duplicates, e.g. see these screenshot of duplicate AMI lookupsExpected behavior I would have expected Steampipe to deduplicate/cache these API calls and (presumably) for the query to be executed much faster.
Additional context This query was executed against an aggregation of 225 AWS accounts and two AWS regions. I am not an SQL expert and I assume that the SQL query in the Steampipe blog is optimal.