turbot / steampipe-plugin-aws

Use SQL to instantly query AWS resources across regions and accounts. Open source CLI. No DB required.
https://hub.steampipe.io/plugins/turbot/aws
Apache License 2.0

select * from aws_s3_bucket crashes the plugin #1738

Closed daveagill closed 1 year ago

daveagill commented 1 year ago

Describe the bug

This query appears to crash the AWS plugin:

select * from aws_s3_bucket

resulting in this error:

Error: rpc error: code = Unavailable desc = error reading from server: EOF (SQLSTATE HV000)

Steampipe version (steampipe -v)

Steampipe v0.19.5

Plugin version (steampipe plugin list)

+--------------------------------------------+---------+-------------+
| Installed Plugin                           | Version | Connections |
+--------------------------------------------+---------+-------------+
| hub.steampipe.io/plugins/turbot/aws@latest | 0.102.0 | aws         |
+--------------------------------------------+---------+-------------+

To reproduce

1) I have a bunch of AWS S3 buckets:

2) I have uncommented this line in aws.spc: ignore_error_codes = ["AccessDenied", "AccessDeniedException", "NotAuthorized", "UnauthorizedOperation", "UnrecognizedClientException", "AuthorizationError"]

3) Issue this query: select * from aws_s3_bucket

4) Some results will be returned (it varies how many) and the following error shows: Error: rpc error: code = Unavailable desc = error reading from server: EOF (SQLSTATE HV000)

See logs below.

Expected behavior

Should fully detail all S3 buckets.

Additional context

Other similar queries can run successfully:

logs/plugin-2023-05-16.log

2023-05-16 18:16:42.597 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1684261001128: aws_s3_bucket.getBucketLocation: bucket_name=<redacted s3 bucket 1> clientRegion=us-east-1 api_error="operation error S3: GetBucketLocation, https response error StatusCode: 403, RequestID: <redacted>, HostID: <redacted>, api error AccessDenied: Access Denied"
2023-05-16 18:16:42.601 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1684261001128: aws_s3_bucket.getBucketLocation: bucket_name=<redacted s3 bucket 2> clientRegion=us-east-1 api_error="operation error S3: GetBucketLocation, https response error StatusCode: 403, RequestID: <redacted>, HostID: <redacted>, api error AccessDenied: Access Denied"
2023-05-16 18:16:42.913 UTC [ERROR] plugin process exited: path=/Users/dgill/.steampipe/plugins/hub.steampipe.io/plugins/turbot/aws@latest/steampipe-plugin-aws.plugin pid=23944 error="exit status 2"

logs/database-2023-05-16.log

2023-05-16 18:16:32.339 UTC [23902] LOG:  starting PostgreSQL 14.2 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.0 (clang-1200.0.32.29), 64-bit
2023-05-16 18:16:32.341 UTC [23902] LOG:  listening on IPv6 address "::1", port 9193
2023-05-16 18:16:32.341 UTC [23902] LOG:  listening on IPv4 address "127.0.0.1", port 9193
2023-05-16 18:16:32.342 UTC [23902] LOG:  listening on Unix socket "/tmp/.s.PGSQL.9193"
2023-05-16 18:16:32.345 UTC [23904] LOG:  database system was shut down at 2023-05-16 18:05:56 UTC
2023-05-16 18:16:32.349 UTC [23902] LOG:  database system is ready to accept connections
2023-05-16 18:16:32.499 UTC [23911] LOG:  connection received: host=127.0.0.1 port=65228
2023-05-16 18:16:32.501 UTC [23911] LOG:  connection authorized: user=root database=postgres
2023-05-16 18:16:32.510 UTC [23911] LOG:  disconnection: session time: 0:00:00.011 user=root database=postgres host=127.0.0.1 port=65228
2023-05-16 18:16:32.543 UTC [23915] LOG:  connection received: host=127.0.0.1 port=65229
2023-05-16 18:16:32.548 UTC [23915] LOG:  connection authorized: user=root database=postgres application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:32.556 UTC [23915] LOG:  disconnection: session time: 0:00:00.013 user=root database=postgres host=127.0.0.1 port=65229
2023-05-16 18:16:32.588 UTC [23919] LOG:  connection received: host=127.0.0.1 port=65230
2023-05-16 18:16:32.593 UTC [23919] LOG:  connection authorized: user=root database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:32.630 UTC [WARN]  hub: goFdwImportForeignSchema remote 'steampipe_command' local 'steampipe_command'
2023-05-16 18:16:32.633 UTC [23919] LOG:  disconnection: session time: 0:00:00.045 user=root database=steampipe host=127.0.0.1 port=65230
2023-05-16 18:16:32.704 UTC [23925] LOG:  connection received: host=127.0.0.1 port=65231
2023-05-16 18:16:32.704 UTC [23926] LOG:  connection received: host=127.0.0.1 port=65232
2023-05-16 18:16:32.707 UTC [23925] LOG:  connection authorized: user=steampipe database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:32.708 UTC [23926] LOG:  connection authorized: user=steampipe database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:32.757 UTC [23930] LOG:  connection received: host=127.0.0.1 port=65233
2023-05-16 18:16:32.761 UTC [23930] LOG:  connection authorized: user=root database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:32.802 UTC [23934] LOG:  connection received: host=127.0.0.1 port=65234
2023-05-16 18:16:32.806 UTC [23934] LOG:  connection authorized: user=root database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:32.891 UTC [23930] LOG:  could not receive data from client: Connection reset by peer
2023-05-16 18:16:32.891 UTC [23934] LOG:  could not receive data from client: Connection reset by peer
2023-05-16 18:16:32.891 UTC [23934] LOG:  disconnection: session time: 0:00:00.088 user=root database=steampipe host=127.0.0.1 port=65234
2023-05-16 18:16:32.891 UTC [23930] LOG:  disconnection: session time: 0:00:00.134 user=root database=steampipe host=127.0.0.1 port=65233
2023-05-16 18:16:32.988 UTC [23938] LOG:  connection received: host=127.0.0.1 port=65235
2023-05-16 18:16:32.993 UTC [23938] LOG:  connection authorized: user=steampipe database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:42.914 UTC [WARN]  hub: stream receive error rpc error: code = Unavailable desc = error reading from server: EOF (0xc0000d0840)
2023-05-16 18:16:42.914 UTC [23925] ERROR:  rpc error: code = Unavailable desc = error reading from server: EOF
2023-05-16 18:16:42.914 UTC [23925] STATEMENT:  select * from aws_s3_bucket
2023-05-16 18:16:47.661 UTC [23938] LOG:  disconnection: session time: 0:00:14.673 user=steampipe database=steampipe host=127.0.0.1 port=65235
2023-05-16 18:16:47.664 UTC [23926] LOG:  disconnection: session time: 0:00:14.960 user=steampipe database=steampipe host=127.0.0.1 port=65232
2023-05-16 18:16:47.665 UTC [23925] LOG:  disconnection: session time: 0:00:14.961 user=steampipe database=steampipe host=127.0.0.1 port=65231
2023-05-16 18:16:47.740 UTC [23954] LOG:  connection received: host=127.0.0.1 port=65339
2023-05-16 18:16:47.745 UTC [23954] LOG:  connection authorized: user=root database=steampipe application_name=steampipe_35b7 SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
2023-05-16 18:16:47.751 UTC [23954] LOG:  disconnection: session time: 0:00:00.011 user=root database=steampipe host=127.0.0.1 port=65339
2023-05-16 18:16:47.847 UTC [23902] LOG:  received smart shutdown request
2023-05-16 18:16:47.849 UTC [23902] LOG:  background worker "logical replication launcher" (PID 23910) exited with exit code 1
2023-05-16 18:16:47.849 UTC [23905] LOG:  shutting down
2023-05-16 18:16:47.863 UTC [23902] LOG:  database system is shut down

e-gineer commented 1 year ago

Hey @daveagill ... it would be helpful to try and narrow down the source of the crash. Two dimensions to consider:

  1. Which bucket
  2. Which column

From the column point of view, can you please try:

select name from aws_s3_bucket

Then:

select name, region from aws_s3_bucket

And then keep adding columns progressively?
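The progressive narrowing described above can be scripted. The following is a minimal sketch, not part of the plugin or the Steampipe CLI: the helper names (`progressive_queries`, `first_failing_query`, `run_with_steampipe`) are hypothetical, and it assumes that `steampipe query "<sql>"` exits with a non-zero status when the plugin crashes.

```python
import subprocess


def progressive_queries(columns, table="aws_s3_bucket"):
    """Yield one query per prefix of `columns`: the first column, the first two, and so on."""
    for end in range(1, len(columns) + 1):
        yield f"select {', '.join(columns[:end])} from {table}"


def run_with_steampipe(sql):
    """Run one query through the CLI and report whether it succeeded.

    Assumption: a plugin crash makes `steampipe query` exit non-zero.
    """
    result = subprocess.run(["steampipe", "query", sql], capture_output=True)
    return result.returncode == 0


def first_failing_query(columns, run=run_with_steampipe):
    """Return the first progressive query for which `run` reports failure, or None."""
    for sql in progressive_queries(columns):
        if not run(sql):
            return sql
    return None
```

Calling `first_failing_query(["name", "region", "tags"])` would try `select name from aws_s3_bucket`, then `select name, region from aws_s3_bucket`, and so on, stopping at the first query that fails. The `run` parameter is injectable so the narrowing logic can be exercised without a live AWS connection.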

daveagill commented 1 year ago

Hi @e-gineer thanks for your reply.

In trying to pinpoint a column it seems the issue becomes intermittent with a success/fail crossover at around this point:

steampipe query "select  _ctx, policy_std, replication, website_configuration, tags_src, tags, akas, creation_date, bucket_policy_is_public, versioning_enabled, versioning_mfa_delete, block_public_acls, block_public_policy, ignore_public_acls, restrict_public_buckets from aws_s3_bucket"

i.e. it's around those last two columns: ignore_public_acls, restrict_public_buckets

If I query for everything up to and including ignore_public_acls then it will generally succeed, but occasionally fail. If I query for everything up to and including restrict_public_buckets then it will generally fail, but occasionally succeed.

At this point it feels like the problem is some kind of non-determinism / race condition / multi-threading bug. Or perhaps the occasional passes/fails are the result of caching effects.
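Because a single run says little about an intermittent failure, one option is to repeat the same query several times and count how often it fails. This is a rough sketch under assumptions: `failure_rate` is a hypothetical helper, and the default runner assumes that `steampipe query` exits non-zero when the plugin crashes.

```python
import subprocess


def failure_rate(sql, attempts=10, run=None):
    """Run `sql` `attempts` times and return the fraction of runs that failed."""
    if run is None:
        # Assumption: a crashed plugin makes the CLI exit with a non-zero status.
        def run(query):
            result = subprocess.run(["steampipe", "query", query], capture_output=True)
            return result.returncode == 0

    failures = sum(1 for _ in range(attempts) if not run(sql))
    return failures / attempts
```

Comparing the rate for the column list ending in `ignore_public_acls` with the one ending in `restrict_public_buckets` would put numbers on the "generally succeeds" versus "generally fails" observation. It would not by itself separate a race condition from caching effects.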

It isn't the fault of those columns in isolation though, for example if I execute this then it will run quite happily:

select restrict_public_buckets from aws_s3_bucket

Any ideas on how to debug further? Maybe some other diagnostics I can pull out? Like a stacktrace or debug mode.

Cheers

rajlearner17 commented 1 year ago

@daveagill Thanks for sharing this info and your interest in helping to debug.

I could not reproduce this with a small set of buckets (around 120); this may be too small a data set.

Curious to reproduce this particular scenario; can you provide a few more inputs, such as

Thank you!

daveagill commented 1 year ago

@rajlearner17 I have tried a few more things but no breakthroughs yet.

Here are answers to your questions:

That AWS_PROFILE is a standard env-var recognised by the AWS CLI and steampipe seems happy with it too.

I can't think of a way to narrow down whether it is the fault of a single S3 bucket. The problem is that even a query like select * from aws_s3_bucket where name='something' still returns that error, regardless of the name. That is interesting, but I don't know what it tells us.

I don't suppose there's a verbose mode or a crash dump somewhere that could help?

e-gineer commented 1 year ago

Thanks @daveagill for the detailed information.

To get more, you can enable logging:

STEAMPIPE_LOG=trace steampipe query "select  _ctx, policy_std, replication, website_configuration, tags_src, tags, akas, creation_date, bucket_policy_is_public, versioning_enabled, versioning_mfa_delete, block_public_acls, block_public_policy, ignore_public_acls, restrict_public_buckets from aws_s3_bucket"

The logs will be in ~/.steampipe/logs/.....

That may provide some more insight into any crashes.
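Once trace logging is on, the plugin log gets large, so a small filter helps to find the crash. The sketch below assumes the log line layout shown earlier in this issue (`<timestamp> UTC [LEVEL] <message>`); the function names are hypothetical and the log file path is left to the caller.

```python
import re

# Matches lines such as:
# 2023-05-16 18:16:42.913 UTC [ERROR] plugin process exited: path=... error="exit status 2"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC "
    r"\[(?P<level>[A-Z]+)\]\s+(?P<msg>.*)$"
)


def error_lines(text):
    """Return (timestamp, message) pairs for every [ERROR] line in `text`."""
    hits = []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") == "ERROR":
            hits.append((match.group("ts"), match.group("msg")))
    return hits


def plugin_exit(text):
    """Return the first 'plugin process exited' entry, or None if there is none."""
    for ts, msg in error_lines(text):
        if msg.startswith("plugin process exited"):
            return ts, msg
    return None
```

The lines just before the timestamp returned by `plugin_exit` are the ones to read first. An `exit status 2` is what a Go program normally returns after an unrecovered panic, so a panic message or stack trace may appear near that point, though that is not confirmed for this crash.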

e-gineer commented 1 year ago

I'll note that seeing results and data in a different order on each run is normal. In this case, Steampipe is paging through the list of S3 buckets. Then, for each bucket, it's doing a series of sub-API calls to get column data. It's a lot of API calls, and they all return at different times.

As the data for each row is complete, Steampipe will stream the finished row back to the result set, which means they arrive in different order each time.

If you use order by though, Postgres will order the rows at the end of the query as expected. (Without order by, the order of the rows is undefined in SQL.)
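The streaming behaviour described above can be imitated in a few lines. This is an illustrative simulation only, not Steampipe's implementation: `hydrate` stands in for the per-bucket sub-API calls, and the bucket names, region value, and latencies are made up.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def hydrate(bucket):
    """Stand-in for the per-bucket sub-API calls; each call takes a different time."""
    time.sleep(random.uniform(0.0, 0.05))
    return {"name": bucket, "region": "us-east-1"}  # placeholder data


def stream_rows(buckets):
    """Yield each row as soon as its data is complete, not in list order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(hydrate, bucket) for bucket in buckets]
        for done in as_completed(futures):
            yield done.result()


buckets = ["alpha", "bravo", "charlie", "delta"]
streamed = [row["name"] for row in stream_rows(buckets)]  # order varies per run
ordered = sorted(streamed)  # what an explicit `order by name` would guarantee
print(streamed)
print(ordered)
```

The first printed list can change between runs, while the second is always the same, which mirrors the difference between a query without and with `order by`.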

If you want to understand the nitty gritty, I describe this process in more detail at https://youtu.be/2BNzIU5SFaw?t=640

bigdatasourav commented 1 year ago

Hey @daveagill, we are closing this issue because we have not heard from you. Please feel free to reopen the issue if you want to share or discuss anything.